Nagios Installation Steps

Posted by lsm_3036 on 2011-06-15

I. Installing Nagios on the server (the monitoring host). Server IP address: 192.168.0.13

1. Before installing, check that the following packages are present: httpd php gcc glibc glibc-common gd gd-devel

#rpm -qa | grep httpd
#rpm -qa | grep php
....
#rpm -qa | grep gd
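
The per-package checks above can be folded into one loop. The following is a minimal sketch, not part of the original article: the helper name missing_packages is hypothetical, and it assumes the query command (for example "rpm -q") exits with a non-zero status when a package is not installed.

```shell
# List the packages that a query command reports as not installed.
# Usage: missing_packages "QUERY_CMD" PKG...
# QUERY_CMD is run as "QUERY_CMD PKG" and must exit non-zero when PKG is absent.
missing_packages() {
    query_cmd=$1
    shift
    for pkg in "$@"; do
        # $query_cmd is intentionally unquoted so that "rpm -q" splits
        # into the command and its flag.
        if ! $query_cmd "$pkg" >/dev/null 2>&1; then
            printf '%s\n' "$pkg"
        fi
    done
}

# Example on an RPM-based system (prints only the packages still missing):
#   missing_packages "rpm -q" httpd php gcc glibc glibc-common gd gd-devel
```

An empty result means every listed package is already installed.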

2. Create the user and group

#useradd nagios
#groupadd nagcmd
#/usr/sbin/usermod -a -G nagcmd nagios
#/usr/sbin/usermod -a -G nagcmd apache

3. Install the Nagios package (version 3.2.0 is used here)

#tar zxvf nagios-3.2.0.tar.gz
#cd nagios-3.2.0
#./configure --prefix=/usr/local/nagios --with-command-group=nagcmd
#make
#make install
#make install-init
#make install-config
#make install-commandmode

#make install-webconf 

4. Create the web admin user and restart Apache

#htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
passwd:******
The user created here is nagiosadmin. If you use a different name, you will also have to edit /usr/local/nagios/etc/cgi.cfg, which is covered later.

#service httpd restart

5. Install nagios-plugins (version 1.4.13 is used here)

#tar zxvf nagios-plugins-1.4.13.tar.gz
#cd nagios-plugins-1.4.13
#./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios
#make
#make install

6. Register the service and enable it at boot

#chkconfig --add nagios
#chkconfig nagios on

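7. The basic installation is now complete and can already monitor some services on the local machine. Verify the configuration files, then start Nagios.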

#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Output like this, with zero warnings and zero errors, means the configuration files are valid and Nagios can be started.

#service nagios start
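
The verify-then-start pattern can be wrapped so that the service is only touched when verification passes. This is a minimal sketch under one assumption: that "nagios -v" exits with a non-zero status when it finds errors. The helper name verify_then is hypothetical.

```shell
# Run a verification command and, only if it succeeds, run a second command.
# Usage: verify_then "VERIFY_CMD" "ACTION_CMD"
verify_then() {
    verify_cmd=$1
    action_cmd=$2
    if sh -c "$verify_cmd" >/dev/null 2>&1; then
        sh -c "$action_cmd"
    else
        echo "verification failed; not running: $action_cmd" >&2
        return 1
    fi
}

# Example with the paths used in this article:
#   verify_then "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" \
#               "service nagios restart"
```

If verification fails, run the nagios -v command by hand to see the full error report, since the helper discards its output.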

8. Log in and take a look
http://192.168.0.13/nagios/
Enter the user name nagiosadmin and the password you set; once logged in you can use the interface.


########################################################################
 
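So far we have only completed the most basic steps. The important part is the configuration: with our own configuration we can monitor exactly the hosts and services we care about. Nobody installs Nagios just to watch a single server; the goal is to monitor a whole group of servers, and that requires an add-on called NRPE. NRPE must be installed on both the monitoring server and every monitored host, and it uses port 5666 by default.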

########################################################################
 
II. Configuring Nagios

1. Install NRPE on the server (version 2.12 is used here)

#tar zxvf nrpe-2.12.tar.gz
#cd nrpe-2.12
#./configure     (NRPE installs under /usr/local/nagios/ by default, the same directory tree as nagios-plugins)
#make all
#make install-plugin
#make install-daemon
#make install-daemon-config

# ls /usr/local/nagios/libexec/check_nrpe
/usr/local/nagios/libexec/check_nrpe    
If this file exists, the installation succeeded.

# ll /usr/local/nagios/
total 24
drwxrwxr-x  2 nagios nagios 4096 Jul 21 19:09 bin
drwxrwxr-x  3 nagios nagios 4096 Jul 22 13:35 etc
drwxrwxr-x  2 nagios nagios 4096 Jul 21 19:09 libexec
drwxrwxr-x  2 nagios nagios 4096 Jul 21 18:57 sbin
drwxrwxr-x 10 nagios nagios 4096 Jul 21 19:03 share
drwxrwxr-x  5 nagios nagios 4096 Jul 22 14:25 var

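Note that at this point every file and subdirectory under the nagios directory is owned by user nagios and group nagios, with one exception: /usr/local/nagios/etc/htpasswd.users is owned by root:root. Files you add later should also be nagios:nagios; if ownership is wrong here, permission problems may show up later.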
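
One way to spot ownership mistakes early is to list everything under the install directory that does not belong to the expected user. A minimal sketch follows; the helper name list_foreign_owned is hypothetical, and it relies only on the standard find "-user" test.

```shell
# Print every file or directory under DIR that is not owned by OWNER.
# OWNER may be a user name or a numeric UID.
# Usage: list_foreign_owned DIR OWNER
list_foreign_owned() {
    find "$1" ! -user "$2" -print
}

# Example with the paths used in this article:
#   list_foreign_owned /usr/local/nagios nagios
```

In the setup described here, the only expected hit is etc/htpasswd.users, which is owned by root.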

2. Edit the main Nagios configuration file, nagios.cfg
#  cat nagios.cfg  (only the changed lines are shown; the same applies below)

cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg

Add the following four lines, which point to the new object files:
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/services.cfg


# Definitions for monitoring the local (Linux) host
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg  # commented out, because hosts.cfg now defines the host

command_check_interval=10s
#command_check_interval=-1  # originally -1; changed to 10s
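
The four new cfg_file lines can also be added with a small helper that is safe to run more than once. This is a sketch, not something the original article uses; the function name add_cfg_file is hypothetical.

```shell
# Append "cfg_file=PATH" to MAIN_CFG unless that exact line is already present.
# Usage: add_cfg_file MAIN_CFG PATH
add_cfg_file() {
    main_cfg=$1
    line="cfg_file=$2"
    # -x matches whole lines only, -F treats the line as a fixed string.
    if ! grep -qxF "$line" "$main_cfg" 2>/dev/null; then
        printf '%s\n' "$line" >> "$main_cfg"
    fi
}

# Example with the paths used in this article:
#   for f in hosts hostgroups contactgroups services; do
#       add_cfg_file /usr/local/nagios/etc/nagios.cfg "/usr/local/nagios/etc/$f.cfg"
#   done
```

Because the check is an exact whole-line match, rerunning the loop does not create duplicate entries.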


3. Create the four files referenced by the lines added in the previous step: hosts.cfg, hostgroups.cfg, contactgroups.cfg and services.cfg

4. Configure hosts.cfg, hostgroups.cfg and contactgroups.cfg

# cat hosts.cfg

define host {
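host_name               nagios-server    ; must match the name used in hostgroups.cfg
alias                   nagios server
address                 192.168.0.13     ; IP address of the monitored host
contact_groups          sagroup          ; contact group to notify, defined in contactgroups.cfg
check_command           check-host-alive  ; this command must be defined in objects/commands.cfg
max_check_attempts      5           ; number of check attempts, typically 3-5
notification_interval   10    ; interval between repeat notifications, in minutes by default; adjust to suit
notification_period     24x7              ; name of a time period from timeperiods.cfg (around the clock); written with the letter x, not *; same below
notification_options    d,u,r           ; states that trigger notifications: d = down, u = unreachable, r = recovery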
}

----------------------------------------------------
# cat hostgroups.cfg   (defines the group and its members)

define hostgroup {
hostgroup_name  sa-servers
alias           sa servers
members         nagios-server     ; separate multiple hosts with commas, no spaces
}

----------------------------------------------------

# cat contactgroups.cfg

define contactgroup {
contactgroup_name       sagroup
alias                   system administrator group
members                 nagiosadmin
}

--------------------

5. Configure cgi.cfg

# cat cgi.cfg
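use_authentication=0    # set to 0 to disable CGI authentication of users

authorized_for_system_information=nagiosadmin    # the admin user created earlier is nagiosadmin, so nothing changes here; if you created a different user, change it; separate multiple users with commas
authorized_for_configuration_information=nagiosadmin
authorized_for_system_commands=nagiosadmin   #  * leave this one unchanged even if you use a different user *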
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin


6. Configure nrpe.cfg

# cat nrpe.cfg | sed -n '/^[^#]/p'

log_facility=daemon
pid_file=/var/run/nrpe.pid
server_port=5666      # port number; can be changed
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=127.0.0.1,192.168.0.13   # hosts allowed to connect to this NRPE daemon, i.e. the monitoring server's IP
 
dont_blame_nrpe=0
debug=0
command_timeout=60
connection_timeout=300
# Command definitions follow
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10    # logged-in users: warning above 5, critical above 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20  # system load; the three values apply to the 1-, 5- and 15-minute load averages
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z  # number of zombie processes
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200  # total number of processes
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%  # free swap space: warning below 20%, critical below 10%
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda3  # free space on /dev/sda3: warning below 20%, critical below 10%

 

You can also define your own checks by writing scripts; more on that later.
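
As an illustration of what such a script might look like, here is a minimal sketch of the threshold logic behind a custom check. It follows the usual Nagios plugin convention of one status line on stdout plus an exit status of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function name check_threshold, the script name check_logins.sh and the command name check_logins are hypothetical and do not appear in the original article.

```shell
# Compare an integer VALUE with warning and critical thresholds and report
# the result the way Nagios plugins do.
# Usage: check_threshold LABEL VALUE WARN CRIT   (non-negative integers, WARN <= CRIT)
check_threshold() {
    label=$1
    value=$2
    warn=$3
    crit=$4
    # Reject empty or non-numeric arguments with the UNKNOWN status.
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "UNKNOWN - $label: expected integer arguments"
                return 3
                ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label is $value (critical at $crit)"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label is $value (warning at $warn)"
        return 1
    fi
    echo "OK - $label is $value"
    return 0
}

# Hypothetical use inside a script saved as libexec/check_logins.sh:
#   check_threshold "logged-in users" "$(who | wc -l)" 5 10
#   exit $?
# and a matching, equally hypothetical, line in nrpe.cfg:
#   command[check_logins]=/usr/local/nagios/libexec/check_logins.sh
```

Nagios decides the service state from the exit status, so the script must end with the status returned by the function.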

7. Configure objects/contacts.cfg

# cat objects/contacts.cfg

define contact{
contact_name                    nagiosadmin
alias                           system administrator
service_notification_period     24x7
host_notification_period        24x7
service_notification_options    w,u,c,r                  ; Warning, Unknown, Critical, Recovery
host_notification_options       d,u,r
service_notification_commands   notify-service-by-fetion,notify-service-by-sms   ; notification methods to use
host_notification_commands      notify-host-by-fetion,notify-host-by-sms         ; same as above
email    **********@139.com
pager                           152******13
}


8. Configure objects/commands.cfg

# cat objects/commands.cfg  (only the definitions that must be present are listed; the rest need no changes)

# 'check-host-alive' define command

define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }
# 'check_nrpe' command definition: you must add this one yourself; the entries in services.cfg depend on it

define command{
       command_name check_nrpe
       command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$    ; $ARG1$ is the command name given after check_nrpe, e.g. check_disk
       }


# 'notify-host-by-fetion' command definition   (Fetion alert configuration)

define command{
        command_name    notify-host-by-fetion
        command_line    /usr/local/fetion/fetion --mobile=152******** --pwd=******** --to $CONTACTPAGER$ --msg-utf8="$HOSTNAME$ is $HOSTSTATE$" --debug
}

# 'notify-service-by-fetion' command definition
define command{
        command_name    notify-service-by-fetion
        command_line    /usr/local/fetion/fetion --mobile=152******** --pwd=******** --to $CONTACTPAGER$ --msg-utf8="$NOTIFICATIONTYPE$: $HOSTALIAS$/$SERVICEDESC$ IS $SERVICESTATE$" --debug
        }


# 'notify-host-by-sms' command definition      (e-mail alert configuration)

define command {
       command_name notify-host-by-sms
       command_line  /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" |/usr/local/sendEmail/sendEmail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
        }

# 'notify-service-by-sms' command definition

define command {
       command_name notify-service-by-sms
       command_line  /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/local/sendEmail/sendEmail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
       }

9. Configure services.cfg

#cat services.cfg

###nagios-server:services.cfg###

define service {
host_name               nagios-server     ; the host name must match the definition in hosts.cfg
service_description     check-host-alive
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check-host-alive  ; a command already defined in objects/commands.cfg
}


define service {
host_name               nagios-server
service_description     check_tcp 80
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_tcp!80   ; the text after the exclamation mark is the argument
}

 

define service {
host_name               nagios-server
service_description     check_local_disk
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
#check_command          check_local_disk!20%!10%!/
check_command           check_nrpe!check_disk
}

 

define service {
host_name               nagios-server
service_description     check_load
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_load
}

define service {
host_name               nagios-server
service_description     check_total_procs
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_total_procs
}

define service {
host_name               nagios-server
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}


Six services are defined here. Services on other hosts must also be defined in this file, as described below.
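
Since the service blocks differ only in host name, description and check command, they can be generated instead of typed. The following is a sketch with a hypothetical helper named emit_service; the fixed settings are copied from the definitions above.

```shell
# Print one Nagios service definition using the fixed settings from this article.
# Usage: emit_service HOST_NAME SERVICE_DESCRIPTION CHECK_COMMAND
emit_service() {
    cat <<EOF
define service {
host_name               $1
service_description     $2
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           $3
}

EOF
}

# Example: print the NRPE-based services for one host.
#   for check in check_load check_total_procs check_users; do
#       emit_service nagios-server "$check" "check_nrpe!$check"
#   done
```

Redirect the output to a scratch file and review it before merging it into services.cfg.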

 

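10. A big part of the configuration is now done; later additions build on this base and will be much easier.
Next, start NRPE and restart Nagios to check that the configuration works.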

#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Output like this means the configuration files contain no errors and Nagios can be started.

#service nagios restart      (restarted successfully)


# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
# tail -f /var/log/messages
Jul 22 16:25:16 localhost nrpe[14911]: Starting up daemon
Jul 22 16:25:16 localhost nrpe[14911]: Listening for connections on port 5666
Jul 22 16:25:16 localhost nrpe[14911]: Allowing connections from: 127.0.0.1,192.168.0.13
Log messages like the ones above mean NRPE started successfully. Now test it:


# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.13
NRPE v2.12                     (the NRPE version number is displayed)

# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.13 -c check_disk
DISK OK - free space: / 242377 MB (87% inode=99%);| /=34099MB;233219;262371;0;291524

Output like this means everything works!
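
Plugin output such as the check_disk line above has two parts separated by a "|": the human-readable status text and the performance data. A minimal sketch for splitting the two follows; the function names are hypothetical.

```shell
# Print the human-readable part of a plugin output line (before the first "|").
plugin_status_text() {
    printf '%s\n' "${1%%|*}"
}

# Print the performance data (after the first "|") with leading spaces removed,
# or nothing at all if the line carries no performance data.
plugin_perfdata() {
    case "$1" in
        *'|'*)
            perf=${1#*|}
            # Strip leading spaces: remove the prefix made up of the spaces
            # that precede the first non-space character.
            perf=${perf#"${perf%%[! ]*}"}
            printf '%s\n' "$perf"
            ;;
    esac
}

# Example:
#   line=$(/usr/local/nagios/libexec/check_nrpe -H 192.168.0.13 -c check_disk)
#   plugin_status_text "$line"
#   plugin_perfdata "$line"
```

This is handy when the status text goes into an alert and the performance data into a log or graphing tool.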
 
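III. Installing and configuring the monitored hosts: 192.168.0.61, 192.168.0.62, ...

1. Create the nagios user (repeat the same setup on every host; extra steps can be added later if you want to monitor other services)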

# useradd nagios

2. Install nagios-plugins

# tar zxvf nagios-plugins-1.4.13.tar.gz
# cd nagios-plugins-1.4.13
# ./configure --prefix=/usr/local/nagios/
# make
# make install

# chown -R nagios:nagios /usr/local/nagios

3. Install NRPE, using the same version as on the monitoring server

# tar zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure
# make all
# make install-plugin
# make install-daemon
# make install-daemon-config

# ll /usr/local/nagios/
total 16
drwxrwxr-x 2 nagios nagios 4096 Jul 21 11:30 bin
drwxrwxr-x 2 nagios nagios 4096 Jul 22 13:40 etc
drwxr-xr-x 2 nagios nagios 4096 Jul 21 11:20 libexec
drwxr-xr-x 3 root   root   4096 Jul 21 11:19 share

4. Edit the configuration file nrpe.cfg
You can copy this file over from the monitoring server, since it is already configured there; I use identical settings on both.

# scp 192.168.0.13:/usr/local/nagios/etc/nrpe.cfg /usr/local/nagios/etc/nrpe.cfg

#cat /usr/local/nagios/etc/nrpe.cfg | grep allowed_hosts
allowed_hosts=127.0.0.1,192.168.0.13      # IP address of the monitoring server

5. Start NRPE on the client

# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
# tail -f /var/log/messages
Jul 22 16:41:16 localhost nrpe[14911]: Starting up daemon
Jul 22 16:41:16 localhost nrpe[14911]: Listening for connections on port 5666
Jul 22 16:41:16 localhost nrpe[14911]: Allowing connections from: 127.0.0.1,192.168.0.13
Log messages like the ones above mean NRPE started successfully. Now test it.

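Test from the monitoring server. Be sure to run this there and not on the client you just installed; I made that mistake before and kept getting SSL errors.
# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.61
NRPE v2.12                     (the NRPE version number is displayed)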

# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.61 -c check_load
OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;

Success again!
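
One frequently reported cause of SSL handshake errors from check_nrpe is that the calling host's address is missing from allowed_hosts on the target machine; whether that was the cause in the case mentioned above is an assumption. The sketch below checks an address against the allowed_hosts line of an nrpe.cfg-style file using exact string comparison only (no hostname resolution or network ranges). The function name host_is_allowed is hypothetical.

```shell
# Succeed if IP appears in the allowed_hosts line of an nrpe.cfg-style file.
# Usage: host_is_allowed NRPE_CFG IP
host_is_allowed() {
    cfg=$1
    ip=$2
    # Drop comments, keep the value after "allowed_hosts=", split it on
    # commas, remove blanks, then look for an exact whole-entry match.
    sed -n -e 's/#.*//' -e 's/^allowed_hosts=//p' "$cfg" \
        | tr ',' '\n' \
        | tr -d ' \t' \
        | grep -qxF "$ip"
}

# Example, run on a monitored host with the paths used in this article:
#   host_is_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.0.13 \
#       || echo "monitoring server is not in allowed_hosts"
```

After changing allowed_hosts, restart the NRPE daemon so the new list takes effect.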

IV. On the monitoring server, configure hosts.cfg, hostgroups.cfg and services.cfg to monitor the whole server group

On 192.168.0.13:

1. Configure hosts.cfg

# cat hosts.cfg   (add the new machines)

define host {
host_name               nagios-server
alias                   nagios server
address                 192.168.0.13
contact_groups          sagroup
check_command           check-host-alive
max_check_attempts      5
notification_interval   10
notification_period     24x7
notification_options    d,u,r
}

define host {
host_name               mysql-server-61
alias                   mysql server 61
address                 192.168.0.61
contact_groups          sagroup
check_command           check-host-alive
max_check_attempts      5
notification_interval   10
notification_period     24x7
notification_options    d,u,r
}

define host {
host_name               mysql-server-62
alias                   mysql server 62
address                 192.168.0.62
contact_groups          sagroup
check_command           check-host-alive
max_check_attempts      5
notification_interval   10
notification_period     24x7
notification_options    d,u,r
}

2. Configure hostgroups.cfg

# cat hostgroups.cfg

define hostgroup {
hostgroup_name  sa-servers
alias           sa servers
members         nagios-server,mysql-server-61,mysql-server-62  ; as mentioned earlier, add the new hosts to the member list here
}

3. Configure services.cfg

[root@localhost etc]# cat services.cfg

##### nagios-server:services.cfg ######

define service {
host_name               nagios-server
service_description     check-host-alive
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check-host-alive
}


define service {
host_name               nagios-server
service_description     check_tcp 80
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_tcp!80
}

 

define service {
host_name               nagios-server
service_description     check_local_disk
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
#check_command          check_local_disk!20%!10%!/
check_command           check_nrpe!check_disk
}

 

define service {
host_name               nagios-server
service_description     check_load
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_load
}

define service {
host_name               nagios-server
service_description     check_total_procs
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_total_procs
}

define service {
host_name               nagios-server
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}

#### mysql-server-61:services.cfg ######

define service {
host_name               mysql-server-61
service_description     check_total_procs
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_total_procs
}


define service {
host_name               mysql-server-61
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}

define service {
host_name               mysql-server-61
service_description     check_disk_/dev/sda3
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_disk
}

define service {
host_name               mysql-server-61
service_description     check_load
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_load
}

define service {
host_name               mysql-server-61
service_description     check_swap
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_swap
}

#### mysql-server-62:services.cfg #####

define service {
host_name               mysql-server-62
service_description     check-host-alive
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check-host-alive
}


define service {
host_name               mysql-server-62
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}

define service {
host_name               mysql-server-62
service_description     check_disk_/dev/sda3
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_disk
}

4. The configuration is complete; verify it.

#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

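Output like this means the configuration files contain no errors and Nagios can be started.

#service nagios restart      (restarted successfully)

5. Open the web monitoring interface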

http://192.168.0.13/nagios/

All done!
 

 

Source: ITPUB Blog, link: http://blog.itpub.net/16978544/viewspace-698023/. If you repost this article, please credit the source; otherwise legal action may be taken.
