Monitoring and Alerting on Response Time and HTTP Status Codes for Requests from the LB Layer to the Real Servers

Posted by 散盡浮華 on 2018-02-01

 

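To monitor the access quality of each business service, this post uses the Nginx logs on the LB layer to monitor and alert on the response time (upstream_response_time) and the HTTP status code (upstream_status) of requests sent from the LB layer to the Real Servers. The steps are recorded below: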

Basic information:
Load balancing: Nginx + Keepalived
Load-balanced domain: bs7001.kevin-inc.com (there are many load-balanced domains; this one serves as the example)
Log: bs7001.kevin-inc.com-access.log

1) Set the log_format of Nginx on the LB layer (for reference: http://www.cnblogs.com/kevingrace/p/5893499.html)
[root@inner-lb01 ~]# cat /data/nginx/conf/nginx.conf
......
######
    ## set access log format
    ######
    log_format  main  '$remote_addr $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '$http_user_agent $http_x_forwarded_for $request_time $upstream_response_time $upstream_addr $upstream_status';
 
    #######
.....
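
To see how this format maps onto the whitespace-separated columns that the scripts below pick out with awk, here is a small sketch. The sample log line and the helper name extract_fields are made up for illustration; only the field order follows the log_format above.

```shell
#!/bin/bash
# Hypothetical access-log line that follows the log_format above (not real data).
sample='192.168.1.10 - [01/Feb/2018:15:06:02 +0800] "GET /portal/main_new.do HTTP/1.1" 200 512 "http://bs7001.kevin-inc.com/portal/main_new.do" Mozilla/5.0 (X11; Linux x86_64) - 0.010 0.006 192.168.1.21:7001 200'

# Counting from the left: $3 is the first half of [$time_local], $10 is "$http_referer".
# The user agent may contain spaces, so the upstream fields are counted from the right:
# $(NF-2) = $upstream_response_time, $(NF-1) = $upstream_addr, $NF = $upstream_status.
extract_fields() {
  awk '{print $3, $10, $(NF-2), $(NF-1), $NF}'
}

printf '%s\n' "$sample" | extract_fields
```

Counting the last three fields from the right keeps the extraction stable even though the number of words in the user agent varies from request to request.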

2) Set up the monitoring and alerting scripts
Log path
[root@inner-lb01 ~]# ll /data/nginx/logs/bs7001.kevin-inc.com-access.log 
-rw-r--r-- 1 root root 0 Dec 13 17:00 /data/nginx/logs/bs7001.kevin-inc.com-access.log

Install and configure sendEmail (for installation see: http://www.cnblogs.com/kevingrace/p/5961861.html)
[root@inner-lb01 ~]# cat /opt/sendemail.sh        # this script can be used as-is
#!/bin/bash
# Filename: SendEmail.sh
# Notes: sends mail with sendEmail
#
# Log file for this script
LOGFILE="/tmp/Email.log"
:>"$LOGFILE"
exec 1>"$LOGFILE"
exec 2>&1
SMTP_server='smtp.kevin.com'
username='notice@kevin.com'
password='notice@123'
from_email_address='notice@kevin.com'
to_email_address="$1"
message_subject_utf8="$2"
message_body_utf8="$3"
# Convert the subject to GB2312 so that a subject containing Chinese is not garbled in the received mail.
message_subject_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_subject_utf8
EOF`
[ $? -eq 0 ] && message_subject="$message_subject_gb2312" || message_subject="$message_subject_utf8"
# Convert the body to GB2312 so that the mail content is not garbled.
message_body_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_body_utf8
EOF`
[ $? -eq 0 ] && message_body="$message_body_gb2312" || message_body="$message_body_utf8"
# Send the mail
sendEmail='/usr/local/bin/sendEmail'
# Trace the command into the log file (note: this also writes the password into $LOGFILE)
set -x
$sendEmail -s "$SMTP_server" -xu "$username" -xp "$password" -f "$from_email_address" -t "$to_email_address" -u "$message_subject" -m "$message_body" -o message-content-type=text -o message-charset=gb2312
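
The script above converts the subject and the body with iconv and falls back to the original UTF-8 text when the conversion fails. A minimal sketch of that convert-or-fall-back step as a reusable function follows; to_charset is a hypothetical helper name and is not part of sendEmail or of the original script.

```shell
#!/bin/bash
# to_charset CHARSET TEXT
# Hypothetical helper: convert UTF-8 TEXT to CHARSET with iconv and print it.
# If iconv is missing or the conversion fails, print TEXT unchanged instead.
to_charset() {
  local charset="$1" text="$2" converted
  if converted=$(printf '%s' "$text" | iconv -f UTF-8 -t "$charset" 2>/dev/null); then
    printf '%s' "$converted"
  else
    printf '%s' "$text"
  fi
}

# Example: an ASCII subject passes through unchanged.
subject=$(to_charset GB2312 "LB alert: bs7001.kevin-inc.com")
echo "$subject"
```

Wrapping the step in one function avoids repeating the heredoc and the `$?` check for every field that needs converting.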


[root@inner-lb01 ~]# cd /opt/lb_log_monit.sh/
[root@inner-lb01 lb_log_monit.sh]# ll
total 12
-rwxr-xr-x 1 root root 1180 Feb   1 13:03 bs7001_request_status_monit.sh
-rwxr-xr-x 1 root root  821 Feb   1 11:20 bs7001_request_time_monit_request.sh
-rwxr-xr-x 1 root root  559 Feb   1 13:01 bs7001_request_time_monit.sh


Response-time monitoring and alerting script (it extracts columns 3 and 10 plus the last three columns of the log file)
[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit.sh 
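#!/bin/bash
# Keep columns 3 and 10 plus the last three columns of the most recent 1000 log lines.
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log|awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log

# Loop over the distinct upstream_response_time values (column 3 of the check log).
for i in `awk '{print $3}' /root/lb_log_check/bs7001.kevin-inc.com-check.log|sort -u`
do
  # Skip non-numeric values such as "-", which nginx logs when no upstream was contacted.
  case "$i" in
    ''|*[!0-9.]*) continue ;;
  esac
  # Convert seconds to whole milliseconds so the shell can compare integers.
  a=$(printf "%f" `echo "${i}*1000"|bc`|awk -F"." '{print $1}')
  b=1000   # threshold: 1 second

  if [ "$a" -ge "$b" ];then
     # Print only the lines whose response-time column equals this value;
     # a plain grep would also match the value inside other columns.
     awk -v t="$i" '$3 == t' /root/lb_log_check/bs7001.kevin-inc.com-check.log
  fi
done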
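
The same threshold check can also be done in a single awk pass without bc. The following is a sketch under assumptions, not the author's method: slow_requests is a hypothetical function name, and the sample rows are made up in the five-column layout of the check log.

```shell
#!/bin/bash
# slow_requests THRESHOLD_SECONDS FILE
# Prints the lines of FILE whose third column is a number >= THRESHOLD_SECONDS.
# Non-numeric values such as "-" are skipped by the regex.
slow_requests() {
  awk -v limit="$1" '$3 ~ /^[0-9]+(\.[0-9]+)?$/ && $3 + 0 >= limit + 0' "$2"
}

# Made-up sample data in the same layout as the check log.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
[01/Feb/2018:15:06:02 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.006 192.168.1.21:7001 200
[01/Feb/2018:15:06:03 "http://bs7001.kevin-inc.com/portal/main_new.do" 1.250 192.168.1.22:7001 200
[01/Feb/2018:15:06:04 "-" - - -
EOF

# Only the 1.250-second request is printed.
slow_requests 1 "$tmp"
rm -f "$tmp"
```

Because awk compares the column numerically, each matching line is printed exactly once and no per-value loop is needed.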

[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit_request.sh 
#!/bin/bash
# Run the check script and save the slow requests it prints.
/bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit.sh > /root/lb_log_check/bs7001.kevin-inc.com_request_time.log

# Send an alert mail if at least one slow request was found.
NUM=`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log|wc -l`
if [ "$NUM" -ne 0 ];then
  /bin/bash /opt/sendemail.sh wangshibo@kevin.com "Response time of requests from the LB layer to bs7001.kevin-inc.com" "Response time exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
  /bin/bash /opt/sendemail.sh linan@kevin.com "Response time of requests from the LB layer to bs7001.kevin-inc.com" "Response time exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
else
  echo "Requests from the LB layer to bs7001.kevin-inc.com are responding normally"
fi

[root@inner-lb01 lb_log_monit.sh]# ll /root/lb_log_check/
total 152
-rw-r--r-- 1 root root 147766 Feb   1 15:00 bs7001.kevin-inc.com-check.log
-rw-r--r-- 1 root root    216 Feb   1 15:00 bs7001.kevin-inc.com_request_time.log


HTTP status code monitoring and alerting script (alerts on status codes 500, 502, 503 and 504)
[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_status_monit.sh 
#!/bin/bash
# Keep columns 3 and 10 plus the last three columns of the most recent 1000 log lines.
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log|awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log

# Loop over the distinct upstream_status values (column 5 of the check log).
for i in `awk '{print $5}' /root/lb_log_check/bs7001.kevin-inc.com-check.log|sort|uniq`
do
  case "$i" in
    500|502|503|504)
      # Match on column 5 only; a plain grep would also hit lines where the code appears in another column.
      /bin/bash /opt/sendemail.sh wangshibo@kevin.com "HTTP status code of requests from the LB layer to bs7001.kevin-inc.com" "HTTP status code: ${i}\nDetails:\n$(awk -v s="$i" '$5 == s' /root/lb_log_check/bs7001.kevin-inc.com-check.log)"
      ;;
    *)
      echo "it is ok"
      ;;
  esac
done
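
When many requests fail at once, listing every failing line makes the alert mail long. As an optional refinement, the sketch below counts the 500, 502, 503 and 504 responses per upstream address. summarize_errors is a hypothetical function name, and the sketch assumes the five-column check-log layout in which column 4 is $upstream_addr and column 5 is $upstream_status.

```shell
#!/bin/bash
# summarize_errors FILE
# Prints "upstream_addr status count" for every 500/502/503/504 combination in FILE.
summarize_errors() {
  awk '$5 ~ /^50[0234]$/ {count[$4 " " $5]++}
       END {for (k in count) print k, count[k]}' "$1" | LC_ALL=C sort
}

# Made-up sample data in the same layout as the check log.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
[01/Feb/2018:15:06:02 "-" 0.006 192.168.1.21:7001 502
[01/Feb/2018:15:06:03 "-" 0.004 192.168.1.21:7001 502
[01/Feb/2018:15:06:04 "-" 0.500 192.168.1.22:7001 200
[01/Feb/2018:15:06:05 "-" 0.003 192.168.1.22:7001 500
EOF

summarize_errors "$tmp"
rm -f "$tmp"
```

The third sample row has a response time of 0.500 but a status of 200, so it is not counted; only column 5 is inspected.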

3) Schedule the monitoring with crontab
[root@inner-lb01 lb_log_monit.sh]# crontab -l
# Monitor request response time and HTTP status codes between the LB and the backend servers for each business system
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit_request.sh >/dev/null 2>&1
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_status_monit.sh >/dev/null 2>&1
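
Because both jobs run every two minutes and each one reads the last 1000 log lines, a slow or failed request can be reported again on later runs until it scrolls out of that window. One possible way to avoid repeated alerts is to process only the lines appended since the previous run. The sketch below is an assumption-laden illustration, not part of the original setup: new_lines is a hypothetical function name and the state file is a location you would choose yourself.

```shell
#!/bin/bash
# new_lines LOG STATE
# Prints the lines appended to LOG since the previous call and stores the
# line count seen so far in STATE. If LOG has shrunk (for example after log
# rotation), it starts again from the first line.
new_lines() {
  local log="$1" state="$2" last=0 total
  [ -f "$state" ] && last=$(cat "$state")
  total=$(( $(wc -l < "$log") ))
  [ "$total" -lt "$last" ] && last=0
  # Print only lines last+1 .. total, so lines written after the count was
  # taken are left for the next run.
  awk -v s="$last" -v e="$total" 'NR > s && NR <= e' "$log"
  echo "$total" > "$state"
}
```

A script could then pipe `new_lines /data/nginx/logs/bs7001.kevin-inc.com-access.log "$state_file"` into the same awk column extraction in place of `tail -1000`, where `$state_file` is a path of your choosing.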

Extract columns 3 and 10 plus the last three columns from the log file:
[root@inner-lb01 lb_log_monit.sh]# /usr/bin/tail -10 /data/nginx/logs/bs7001.kevin-inc.com-access.log|awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}'
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:06:02 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.006 192.168.1.21:7001 200
[01/Feb/2018:15:07:12 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.22:7001 200
[01/Feb/2018:15:07:51 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.21:7001 200
[01/Feb/2018:15:07:57 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.007 192.168.1.22:7001 200
