Monitoring an LVS service with Nagios

Posted by yuntui on 2016-11-03

1 Install the NRPE client on the LVS server

1.1 Install the NRPE client from RPM packages

Download location:


    [root@localhost nagios]# ll
    total 768
    -rw-r--r-- 1 root root 713389 12-16 12:08 nagios-plugins-1.4.11-1.x86_64.rpm
    -rw-r--r-- 1 root root 32706 12-16 12:09 nrpe-2.12-1.x86_64.rpm
    -rw-r--r-- 1 root root 18997 12-16 12:08 nrpe-plugin-2.12-1.x86_64.rpm
    [root@localhost nagios]# rpm -ivh *.rpm --nodeps --force

1.2 Append the check commands and the monitoring server's IP address to the end of the configuration file


    [root@localhost nagios]# vim /etc/nagios/nrpe.cfg
    # added by tim on 2014-06-11
    command[check_users]=/usr/local/nagios/libexec/check_users -w 8 -c 15
    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
    command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
    #command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 50 -c 80
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 750 -c 800
    command[check-host-alive]=/usr/local/nagios/libexec/check_ping -H localhost -w 3000.0,80% -c 5000.0,100% -p 5
    # comma-separated list, no spaces between entries
    allowed_hosts=127.0.0.1,10.2xx.3.xx
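After editing nrpe.cfg it can be handy to confirm which commands are actually active, since a commented-out line such as the old check_total_procs is easy to overlook. The sketch below is only an illustration: list_nrpe_commands is a hypothetical helper name, not part of NRPE, and it assumes each definition sits on one line in the form command[name]=...

```shell
# Hypothetical helper (not part of NRPE): print the names of all
# uncommented command[...] definitions found in the given config file.
list_nrpe_commands() {
    # -n with the trailing p prints only lines where the substitution matched;
    # lines starting with '#' do not match the anchored pattern.
    sed -n 's/^[[:space:]]*command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Usage on the LVS server:
#   list_nrpe_commands /etc/nagios/nrpe.cfg
```

With the configuration shown above, this would list check_users, check_load, check_sda1, check_zombie_procs, check_total_procs and check-host-alive, one per line.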

Check that a command works:


    [root@web-9 nrpe-2.15]# /usr/local/nagios/libexec/check_users -w 8 -c 15
    USERS OK - 2 users currently logged in |users=2;8;15;0
    [root@web-9 nrpe-2.15]#

The output starts with USERS OK, so the command works.
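The part after the | in a plugin's output is performance data, in the form label=value;warn;crit;min. Here users=2;8;15;0 repeats the -w 8 and -c 15 thresholds passed on the command line. As a rough sketch, the fields can be pulled apart in the shell; the function names below are made up for this example.

```shell
# Hypothetical helpers for splitting one line of plugin output.

# Print the human-readable text before the '|' separator.
plugin_text() {
    printf '%s\n' "${1%%|*}" | sed 's/[[:space:]]*$//'
}

# Print the performance data after the '|' separator (nothing if absent).
plugin_perfdata() {
    case "$1" in
        *'|'*) printf '%s\n' "${1#*|}" ;;
    esac
}

# Print field N of a single perfdata item: 1=value, 2=warn, 3=crit, 4=min.
perf_field() {
    printf '%s\n' "${1#*=}" | cut -d ';' -f "$2"
}

line='USERS OK - 2 users currently logged in |users=2;8;15;0'
perf=$(plugin_perfdata "$line")
echo "text: $(plugin_text "$line")"
echo "value=$(perf_field "$perf" 1) warn=$(perf_field "$perf" 2) crit=$(perf_field "$perf" 3)"
```

The last line prints value=2 warn=8 crit=15 for the sample output above.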

 

1.3 Starting NRPE fails with the following error:


    [root@web-9 ~]# service nrpe restart
    Shutting down nrpe: [FAILED]
    Starting nrpe: /usr/sbin/nrpe: error while loading shared libraries: libssl.so.6: cannot open shared object file: No such file or directory
                                                               [FAILED]
    [root@web-9 ~]#
    [root@db-m2-slave-1 nagios_client]# service nrpe start
    Starting nrpe: /usr/sbin/nrpe: error while loading shared libraries: libssl.so.6: cannot open shared object file: No such file or directory
                                                               [FAILED]
    [root@db-m2-slave-1 nagios_client]#

Create a symbolic link for the missing library:

    [root@db-m2-slave-1 nagios_client]# ln -s /usr/lib64/libssl.so /usr/lib64/libssl.so.6

(If libssl.so does not exist, link to libssl.so.10 instead: ln -s /usr/lib64/libssl.so.10 /usr/lib64/libssl.so.6)

Start NRPE again. This time a different library is missing:


    [root@db-m2-slave-1 nagios_client]# service nrpe start
    Starting nrpe: /usr/sbin/nrpe: error while loading shared libraries: libcrypto.so.6: cannot open shared object file: No such file or directory
                                                               [FAILED]
    [root@web-10 ~]# ll /usr/lib64/libcrypto.so
    lrwxrwxrwx. 1 root root 18 Oct 13 2013 /usr/lib64/libcrypto.so -> libcrypto.so.1.0.0
    [root@db-m2-slave-1 nagios_client]#

Create a second link, then start NRPE:

    [root@db-m2-slave-1 nagios_client]# ln -s /usr/lib64/libcrypto.so /usr/lib64/libcrypto.so.6
    (If libcrypto.so does not exist, link to libcrypto.so.10 instead: ln -s /usr/lib64/libcrypto.so.10 /usr/lib64/libcrypto.so.6)
    [root@db-m2-slave-1 nagios_client]# service nrpe start
    Starting nrpe: [OK]
    [root@db-m2-slave-1 nagios_client]#
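The loader only reports one missing library per start attempt, which is why libcrypto.so.6 showed up only after libssl.so.6 had been linked. A quicker way to see everything at once is to filter the output of ldd, which marks unresolved libraries with "not found". The helper name missing_libs below is an assumption made for this sketch.

```shell
# Hypothetical helper: read ldd output on stdin and print the names of
# the shared libraries that the loader could not resolve.
missing_libs() {
    # ldd prints unresolved entries as "<name> => not found"
    awk '/=> not found/ { print $1 }'
}

# Usage on the LVS server:
#   ldd /usr/sbin/nrpe | missing_libs
```

On the machine above, this would be expected to list libssl.so.6 and libcrypto.so.6 before any links are created.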

1.4 Check that NRPE is running


Run the check from the Nagios server:

    [root@cache-2 ~]# /usr/local/nagios/libexec/check_nrpe -H xx.xx3.xx
    NRPE v2.12
    [root@cache-2 ~]#

The reply NRPE v2.12 shows that the connection works.

 

2 Write a shell script to monitor LVS

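2.1 The monitoring script
Nagios has no ready-made plugin for checking LVS status, so find a simple monitoring script online, save it as check_lvs.sh, copy it to the /usr/lib/nagios/plugins/ directory, and give the nagios user permission to run it. The script is as follows: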

    #!/bin/bash
    # http://www.ohlinux.com/archives/632/
    # added by tim on 20140613
    USAGE_Method="$(basename "$0") [-h|--hostname] <Free ip or hostname> [-w|--warning] <Free integer> [-c|--critical] <Free integer>"
    USAGE_Value="warning value must be smaller than critical value: $(basename "$0") $*"
    # Nagios plugin exit codes
    STATE_OK=0
    STATE_WARNING=1
    STATE_CRITICAL=2
    STATE_UNKNOWN=3

    if [ $# -lt 4 ]; then
        echo
        echo "Usage: $USAGE_Method"
        echo
        # A usage error must not be reported as OK
        exit $STATE_UNKNOWN
    fi

    # Parse the thresholds; any other argument (such as -h <host>) is ignored.
    while [ $# -gt 0 ]; do
        case "$1" in
        -w|--warning)
            shift
            warning=$1
            ;;
        -c|--critical)
            shift
            critical=$1
            ;;
        esac
        shift
    done

    # The warning threshold must be lower than the critical threshold.
    if [[ "$warning" -ge "$critical" ]]; then
        echo "$USAGE_Value"
        echo "Usage: $USAGE_Method"
        exit $STATE_UNKNOWN
    fi

    # Read the real-server lines once. In each line, field 5 is the
    # active connection count and field 6 the inactive connection count.
    lvs_lines=$(sudo ipvsadm | grep http | grep Route)
    if [ -z "$lvs_lines" ]; then
        echo "lvs critical,lvs is down now."
        exit $STATE_CRITICAL
    fi
    ACT_COUNT=$(echo "$lvs_lines" | awk '{sum += $5} END {print sum + 0}')
    Inactive_count=$(echo "$lvs_lines" | awk '{sum += $6} END {print sum + 0}')

    # Compare with -ge so that a count equal to a threshold is still reported.
    if [[ "$ACT_COUNT" -ge "$critical" ]]; then
        echo "critical - lvs connections: $ACT_COUNT active"
        exit $STATE_CRITICAL
    fi
    if [[ "$ACT_COUNT" -ge "$warning" ]]; then
        echo "warning - lvs connections: $ACT_COUNT active"
        exit $STATE_WARNING
    fi
    echo "LVS OK - LVS is running (conn: $ACT_COUNT active, $Inactive_count inactive)|active=$ACT_COUNT;69999;99999;0; inactive=$Inactive_count;69999;99999;0;"
    exit $STATE_OK
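The core of the script is two steps: add up the connection columns of the real-server lines, then compare the active total with the two thresholds. The sketch below isolates those steps so they can be tried without ipvsadm. The sample text is invented for illustration (the addresses and per-server counts are not from a real system), and sum_column and state_for are hypothetical names. It treats a count equal to a threshold as having reached it.

```shell
# Sum field N of the "http ... Route" lines read from stdin.
# Prints 0 when no line matches.
sum_column() {
    grep http | grep Route | awk -v col="$1" '{ s += $col } END { print s + 0 }'
}

# Print the Nagios exit code for ACTIVE given WARNING and CRITICAL thresholds:
# 0 = OK, 1 = WARNING, 2 = CRITICAL.
state_for() {
    if [ "$1" -ge "$3" ]; then
        echo 2
    elif [ "$1" -ge "$2" ]; then
        echo 1
    else
        echo 0
    fi
}

# Invented sample in the shape of ipvsadm output (not real data).
sample='TCP  10.0.0.1:http wlc
  -> 10.0.0.11:http               Route   1      10         40
  -> 10.0.0.12:http               Route   1      6          37'

active=$(printf '%s\n' "$sample" | sum_column 5)
inactive=$(printf '%s\n' "$sample" | sum_column 6)
echo "active=$active inactive=$inactive state=$(state_for "$active" 300 600)"
# prints: active=16 inactive=77 state=0
```

The virtual-service line contains http but not Route, so it is filtered out and only the two real-server lines are counted.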


2.2 Configure nrpe.cfg

Open /etc/nagios/nrpe.cfg in vim and add a line that defines the check_lvs command:

command[check_lvs]=/usr/lib/nagios/plugins/check_lvs -w 300 -c 600 

Then restart NRPE:



    [root@/root/nagios/check_lvs ~]# service nrpe restart;
    Shutting down nrpe: [OK]
    Starting nrpe: [OK]
    [root@/root/nagios/check_lvs ~]#

2.3 Run the check from the Nagios server


    [root@cache-2 ~]# /usr/local/nagios/libexec/check_nrpe -H 1x.xx4.x.x5 -c check_lvs
     lvs critical,lvs is down now.
    [root@cache-2 ~]#

The check reports that the LVS service is down.

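Note: this result comes from a permissions problem rather than from LVS itself. check_lvs calls ipvsadm to read the LVS state, and ipvsadm can only be run as root. The nagios user therefore has to be able to become root through sudo without a password, so that commands such as sudo /usr/lib/nagios/plugins/check_lvs, and the sudo ipvsadm calls inside the script, work when run as nagios. On CentOS, sudo cannot be called without a terminal by default. Edit /etc/sudoers, find the #Defaults requiretty line and uncomment it, then add a second line that exempts the nagios user, so that nagios can call sudo without a login terminal: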

Defaults    requiretty

Defaults:nagios    !requiretty

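# Let the nagios user run commands through sudo without a password. The rule can also be limited to specific commands, optionally with arguments.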

nagios ALL=(ALL) NOPASSWD: ALL
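Since the script only needs ipvsadm, the rule can be narrowed to that one command instead of ALL. The following is a sketch under assumptions, not a tested setup: sudoers_rule is a hypothetical helper, the /sbin/ipvsadm path must be checked on the target host (for example with command -v ipvsadm), and the drop-in directory /etc/sudoers.d only applies if /etc/sudoers includes it.

```shell
# Hypothetical helper: print a sudoers fragment that lets USER run exactly
# one command as root without a password and without a terminal.
sudoers_rule() {
    # $1 = user name, $2 = absolute path of the permitted command
    printf 'Defaults:%s !requiretty\n' "$1"
    printf '%s ALL=(root) NOPASSWD: %s\n' "$1" "$2"
}

# Preview the fragment (the ipvsadm path is an assumption):
sudoers_rule nagios /sbin/ipvsadm

# To install it as root, assuming /etc/sudoers includes /etc/sudoers.d:
#   sudoers_rule nagios /sbin/ipvsadm > /etc/sudoers.d/nagios-ipvsadm
#   chmod 440 /etc/sudoers.d/nagios-ipvsadm
#   visudo -c -f /etc/sudoers.d/nagios-ipvsadm   # syntax check only
```

With a rule like this, sudo ipvsadm keeps working for the nagios user, while other commands still require a password.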

 

Run the check from the Nagios server again:

    [root@cache-2 etc]# /usr/local/nagios/libexec/check_nrpe -H 10.xx.xx.xx -c check_lvs
    LVS OK - LVS is running (conn: 16 active, 77 inactive)|active=16;69999;99999;0; inactive=77;69999;99999;0;
    [root@cache-2 etc]#
The check now succeeds.
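Nagios decides the service state from the exit code that check_nrpe passes back, not from the text, so it is worth looking at the exit code when testing by hand. A minimal sketch, with state_name as a made-up helper that maps the standard plugin exit codes to their names:

```shell
# Hypothetical helper: translate a Nagios plugin exit code into its state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Usage on the Nagios server (host address elided as in the post):
#   /usr/local/nagios/libexec/check_nrpe -H 10.xx.xx.xx -c check_lvs
#   state_name $?
```

For the successful run above, the exit code would be expected to be 0, which maps to OK.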

 

2.4 Add the configuration on the Nagios server

    vim services.cfg
    define service{
            host_name lvs-lan
            service_description Check lvs
            check_command check_nrpe!check_lvs
            max_check_attempts 5
            normal_check_interval 3
            retry_check_interval 2
            check_period 24x7
            notification_interval 10
            notification_period 24x7
            notification_options w,c,r
            contact_groups opsweb
            }
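    vim objects/commands.cfg
    # The service calls check_nrpe!check_lvs, so check_nrpe is the command
    # that has to be defined here; check_lvs itself runs on the LVS server.
    define command{
            command_name check_nrpe
            command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
            }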

Then reload Nagios to put the LVS check into service:

    [root@cache-2 etc]# service nagios reload
    Running configuration check...
    Reloading nagios configuration...
    done
    [root@cache-2 etc]#

This completes the Nagios monitoring of the LVS service.

 

Reference: http://c20031776.blog.163.com/blog/static/684716252013627506890/

Source: ITPUB blog, http://blog.itpub.net/30633755/viewspace-2127703/. Please credit the source when reposting; otherwise legal action may be taken.
