Building a Nagios Monitoring System
1 Environment Setup
1.1 Host Information
Hostname | Version | IP | Installed software | Role |
linuxserver | RHEL 6.5 | 192.168.230.136 | nagios-4.0.8, nagios-plugins-2.0.3, nrpe-2.15 | Monitoring server |
linuxclient | RHEL 6.5 | 192.168.230.137 | nagios-plugins-2.0.3, nrpe-2.15 | Monitored client |
Configure YUM as described in http://blog.itpub.net/28536251/viewspace-1444918/.
Server network configuration:
[root@linuxserver ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:0c:29:68:ab:26
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
IPADDR=192.168.230.136
NETMASK=255.255.255.0
GATEWAY=192.168.230.2
DNS1=192.168.230.2
[root@linuxserver ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=linuxserver
[root@linuxserver ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
192.168.230.136 linuxserver
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Client network configuration:
[root@linuxclient ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:0c:29:ba:4f:56
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
IPADDR=192.168.230.137
NETMASK=255.255.255.0
GATEWAY=192.168.230.2
DNS1=192.168.230.2
[root@linuxclient ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=linuxclient
[root@linuxclient ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
192.168.230.137 linuxclient
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
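1.2 Firewall Settings
To simplify testing, stop the firewall on both the server and the client for now. Once the system is built, re-enable it and configure the firewall policy.
Stopping the firewall on the server: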
[root@linuxserver ~]# /etc/init.d/iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
[root@linuxserver ~]# /etc/init.d/iptables status
iptables: Firewall is not running.
[root@linuxserver ~]# chkconfig iptables off
[root@linuxserver ~]# chkconfig --list iptables
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
Stopping the firewall on the client:
[root@linuxclient ~]# /etc/init.d/iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
[root@linuxclient ~]# /etc/init.d/iptables status
iptables: Firewall is not running.
[root@linuxclient ~]# chkconfig iptables off
[root@linuxclient ~]# chkconfig --list iptables
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
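When the firewall is turned back on after testing, only two ports from this article need to be reachable: 80 on the server for the web interface and 5666 on the client for NRPE. The following is a minimal sketch, not a complete firewall policy. The function names nrpe_rule and http_rule are made up for illustration, and the functions only print the iptables commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch: print (not apply) the iptables rules this setup would need once
# the firewall is re-enabled. Ports come from this article (80 for httpd,
# 5666 for NRPE); the function names are hypothetical.

# Rule for the monitored client: accept NRPE only from the monitoring server.
nrpe_rule() {
    server_ip=$1
    printf 'iptables -I INPUT -p tcp -s %s --dport 5666 -j ACCEPT\n' "$server_ip"
}

# Rule for the monitoring server: accept HTTP for the Nagios web interface.
http_rule() {
    printf 'iptables -I INPUT -p tcp --dport 80 -j ACCEPT\n'
}

nrpe_rule 192.168.230.136
http_rule
```

After running the printed commands on the matching host, the rules can be made persistent on RHEL 6 with /etc/init.d/iptables save.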
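1.3 SELinux Settings
To simplify testing, disable SELinux on both the server and the client for now. Once the system is built, re-enable it and configure the SELinux policy.
Disabling SELinux on the server:
Edit /etc/selinux/config with vim, change SELINUX=enforcing to SELINUX=disabled, then reboot.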
[root@linuxserver ~]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
[root@linuxserver ~]# init 6
Disabling SELinux on the client:
[root@linuxclient ~]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
[root@linuxclient ~]# init 6
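The manual edit of /etc/selinux/config can also be scripted, which helps when the same change has to be made on several hosts. The sketch below is one possible way to do it. The helper name set_selinux_mode is an assumption, and the demonstration runs against a scratch file rather than the real configuration.

```shell
#!/bin/sh
# set_selinux_mode FILE MODE: rewrite the SELINUX= line of FILE in place.
# Hypothetical helper; MODE must be one of the three values listed in the file.
set_selinux_mode() {
    file=$1
    mode=$2
    case $mode in
        enforcing|permissive|disabled) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    # ^SELINUX= does not match the SELINUXTYPE= line, so only one line changes.
    sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$file"
}

# Demonstration on a scratch copy, so nothing on the system is changed.
demo=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo"
set_selinux_mode "$demo" disabled
cat "$demo"
rm -f "$demo"
```

On a real host the call would be set_selinux_mode /etc/selinux/config disabled, followed by the reboot shown above. If a reboot is inconvenient, setenforce 0 puts a running system into permissive mode until the next boot.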
2 Installing and Configuring Nagios to Monitor linuxserver
2.1 Installing Base Packages
[root@linuxserver ~]# yum install -y gcc glibc glibc-common gd gd-devel xinetd openssl-devel
2.2 Installing httpd and PHP
[root@linuxserver ~]# yum install -y httpd php
[root@linuxserver ~]# /etc/init.d/httpd start
Starting httpd: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.230.136 for ServerName
[ OK ]
Following the hint above, set ServerName in the httpd configuration file. The result:
[root@linuxserver ~]# grep ServerName /etc/httpd/conf/httpd.conf | grep 80
ServerName linuxserver:80
After a restart, httpd starts without the warning.
[root@linuxserver ~]# /etc/init.d/httpd restart
Stopping httpd: [ OK ]
Starting httpd: [ OK ]
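Open http://192.168.230.136/ in a browser; if the Apache test page is displayed, httpd is installed correctly.
Run the following command to create a PHP test page, then open http://192.168.230.136/phpinfo.php in a browser; if the PHP information page is displayed, PHP is installed correctly.
[root@linuxserver ~]# echo "<?php phpinfo(); ?>" > /var/www/html/phpinfo.php
2.3 User and Directory Setup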
[root@linuxserver ~]# useradd -s /sbin/nologin nagios
[root@linuxserver ~]# mkdir /usr/local/nagios
[root@linuxserver ~]# chown -R nagios:nagios /usr/local/nagios
[root@linuxserver ~]# ll -d /usr/local/nagios
drwxr-xr-x 2 nagios nagios 4096 Mar 14 11:17 /usr/local/nagios
2.4 Installing Nagios
The installation packages that Nagios needs can be downloaded from the respective project websites.
[root@linuxserver ~]# tar -xvzf nagios-4.0.8.tar.gz
[root@linuxserver ~]# cd nagios-4.0.8
[root@linuxserver nagios-4.0.8]# ./configure --prefix=/usr/local/nagios/
[root@linuxserver nagios-4.0.8]# make all
[root@linuxserver nagios-4.0.8]# make install
# This installs the main program, CGIs, and HTML files
[root@linuxserver nagios-4.0.8]# make install-init
# This installs the init script in /etc/rc.d/init.d
[root@linuxserver nagios-4.0.8]# make install-commandmode
#This installs and configures permissions on the directory for holding the external command file
[root@linuxserver nagios-4.0.8]# make install-config
# This installs *SAMPLE* config files in /usr/local/nagios/etc
[root@linuxserver nagios-4.0.8]# make install-webconf
# This installs the Apache config file for the Nagios web interface
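Change to the installation directory. If the six directories listed in the table below are present, the installation succeeded.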
[root@linuxserver nagios-4.0.8]# cd /usr/local/nagios/
[root@linuxserver nagios]# ls
bin etc libexec sbin share var
No. | Directory | Purpose |
1 | bin | Executable programs |
2 | etc | Configuration files |
3 | libexec | External plugins |
4 | sbin | CGI files |
5 | share | Web page files |
6 | var | Log files |
Register the nagios service:
[root@linuxserver ~]# chkconfig --add nagios
[root@linuxserver ~]# chkconfig nagios on
[root@linuxserver ~]# chkconfig --list nagios
nagios 0:off 1:off 2:on 3:on 4:on 5:on 6:off
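Because /etc/httpd/conf.d/nagios.conf defines an authentication file, that file must be created with a user name and password so that access to Nagios through the web is authenticated. Using nagiosadmin as the user name is recommended; the reason is given in section 2.5.2.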
[root@linuxserver ~]# cat /etc/httpd/conf.d/nagios.conf
# SAMPLE CONFIG SNIPPETS FOR APACHE WEB SERVER
#
# This file contains examples of entries that need
# to be incorporated into your Apache web server
# configuration file. Customize the paths, etc. as
# needed to fit your system.
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
# SSLRequireSSL
   Options ExecCGI
   AllowOverride None
   Order allow,deny
   Allow from all
# Order deny,allow
# Deny from all
# Allow from 127.0.0.1
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/local/nagios/etc/htpasswd.users
   Require valid-user
</Directory>
Alias /nagios "/usr/local/nagios/share"
<Directory "/usr/local/nagios/share">
# SSLRequireSSL
   Options None
   AllowOverride None
   Order allow,deny
   Allow from all
# Order deny,allow
# Deny from all
# Allow from 127.0.0.1
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/local/nagios/etc/htpasswd.users
   Require valid-user
</Directory>
[root@linuxserver ~]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin
Start nagios:
[root@linuxserver ~]# /etc/init.d/nagios start
Starting nagios: done.
Restart httpd:
[root@linuxserver ~]# /etc/init.d/httpd restart
Stopping httpd: [ OK ]
Starting httpd: [ OK ]
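Open http://192.168.230.136/nagios/ in a browser and enter the user name and password; if the Nagios home page is displayed, Nagios is installed correctly.
2.5 Configuring Nagios
The Nagios configuration files are located in /usr/local/nagios/etc/. The purpose of each file is listed in the table below: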
[root@linuxserver ~]# tree /usr/local/nagios/etc/
/usr/local/nagios/etc/
├── cgi.cfg
├── htpasswd.users
├── nagios.cfg
├── objects
│ ├── commands.cfg
│ ├── contacts.cfg
│ ├── localhost.cfg
│ ├── printer.cfg
│ ├── switch.cfg
│ ├── templates.cfg
│ ├── timeperiods.cfg
│ └── windows.cfg
└── resource.cfg
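No. | File | Purpose |
1 | cgi.cfg | Controls access to the CGIs |
2 | nagios.cfg | Main configuration file |
3 | resource.cfg | Defines variables, such as $USER1$, that other configuration files can reference |
4 | commands.cfg | Defines commands that other configuration files can reference |
5 | contacts.cfg | Defines contacts and contact groups |
6 | localhost.cfg | Configuration for monitoring the local host |
7 | printer.cfg | Template configuration for monitoring printers; not enabled by default |
8 | switch.cfg | Template configuration for monitoring switches and routers; not enabled by default |
9 | templates.cfg | Host and service templates that other configuration files can reference |
10 | timeperiods.cfg | Defines the time periods Nagios uses for monitoring |
11 | windows.cfg | Template configuration for monitoring Windows hosts; not enabled by default |
The most important configuration files are described below.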
2.5.1 nagios.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/nagios.cfg | grep -v '^$'
log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
object_cache_file=/usr/local/nagios/var/objects.cache
precached_object_file=/usr/local/nagios/var/objects.precache
resource_file=/usr/local/nagios/etc/resource.cfg
status_file=/usr/local/nagios/var/status.dat
status_update_interval=10
nagios_user=nagios
nagios_group=nagios
(some parameters omitted)
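nagios.cfg is the core Nagios configuration file. Each cfg_file directive pulls in one object configuration file; any additional object configuration file must be added here before it takes effect.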
2.5.2 cgi.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/cgi.cfg | grep -v '^$'
main_config_file=/usr/local/nagios/etc/nagios.cfg
physical_html_path=/usr/local/nagios/share
url_html_path=/nagios
show_context_help=0
use_pending_states=1
use_authentication=1
use_ssl_authentication=0
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin
authorized_for_system_commands=nagiosadmin
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin
(some parameters omitted)
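All authorized_for_* parameters in this file default to nagiosadmin, which is why the password file was created for nagiosadmin earlier. To use a different user name, append it after nagiosadmin here, separated by a comma. See the comments in the configuration file for the meaning of each parameter.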
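Appending a second user to every authorized_for_* line by hand is repetitive, so it can be done with one sed expression. This is a minimal sketch under stated assumptions: grant_cgi_user is a hypothetical helper name, ops is a hypothetical user, and the function is not idempotent (calling it twice appends the user twice).

```shell
#!/bin/sh
# grant_cgi_user FILE USER: append USER to every active authorized_for_* line.
# Hypothetical helper; commented-out lines are left untouched.
grant_cgi_user() {
    cgi_cfg=$1
    user=$2
    # On matching lines, replace the end of line with ",USER".
    sed -i "/^authorized_for_[a-z_]*=/ s/\$/,$user/" "$cgi_cfg"
}

# Demonstration on a scratch file with two of the parameters shown above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
use_authentication=1
authorized_for_system_information=nagiosadmin
authorized_for_all_hosts=nagiosadmin
EOF
grant_cgi_user "$demo" ops
cat "$demo"
rm -f "$demo"
```

The new user also needs an entry in the password file, created with htpasswd /usr/local/nagios/etc/htpasswd.users followed by the user name. The -c flag must be left out this time, because it recreates the file.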
2.5.3 resource.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/resource.cfg | grep -v '^$'
$USER1$=/usr/local/nagios/libexec
The $USER1$ variable in this file specifies the path where the Nagios plugins are installed.
2.5.4 localhost.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/objects/localhost.cfg | grep -v '^$'
define host{
use linux-server
host_name localhost
alias localhost
address 127.0.0.1
}
define hostgroup{
hostgroup_name linux-servers
alias Linux Servers
members localhost
}
define service{
use local-service
host_name localhost
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}
define service{
use local-service
host_name localhost
service_description Root Partition
check_command check_local_disk!20%!10%!/
}
define service{
use local-service
host_name localhost
service_description Current Users
check_command check_local_users!20!50
}
define service{
use local-service
host_name localhost
service_description Total Processes
check_command check_local_procs!250!400!RSZDT
}
define service{
use local-service
host_name localhost
service_description Current Load
check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}
define service{
use local-service
host_name localhost
service_description Swap Usage
check_command check_local_swap!20!10
}
define service{
use local-service
host_name localhost
service_description SSH
check_command check_ssh
notifications_enabled 0
}
define service{
use local-service
host_name localhost
service_description HTTP
check_command check_http
notifications_enabled 0
}
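This file defines the host and the services monitored on the local machine. "linux-server" is a host template defined in templates.cfg, and "local-service" is a service template defined in templates.cfg.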
2.5.5 templates.cfg
[root@linuxserver ~]# sed -i 's/;.*$//g' /usr/local/nagios/etc/objects/templates.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/objects/templates.cfg | grep -v '^$'
define contact{
name generic-contact
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r,f,s
host_notification_options d,u,r,f,s
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
register 0
}
define host{
name generic-host
notifications_enabled 1
event_handler_enabled 1
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
notification_period 24x7
register 0
}
define host{
name linux-server
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period workhours
notification_interval 120
notification_options d,u,r
contact_groups admins
register 0
}
define host{
name windows-server
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period 24x7
notification_interval 30
notification_options d,r
contact_groups admins
hostgroups windows-servers
register 0
}
define host{
name generic-printer
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period workhours
notification_interval 30
notification_options d,r
contact_groups admins
register 0
}
define host{
name generic-switch
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period 24x7
notification_interval 30
notification_options d,r
contact_groups admins
register 0
}
define service{
name generic-service
active_checks_enabled 1
passive_checks_enabled 1
parallelize_check 1
obsess_over_service 1
check_freshness 0
notifications_enabled 1
event_handler_enabled 1
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 10
retry_check_interval 2
contact_groups admins
notification_options w,u,c,r
notification_interval 60
notification_period 24x7
register 0
}
define service{
name local-service
use generic-service
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
register 0
}
This file defines the contact (notification), host and service templates.
2.5.6 commands.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/objects/commands.cfg | grep -v '^$'
define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}
define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
define command{
command_name check_local_load
command_line $USER1$/check_load -w $ARG1$ -c $ARG2$
}
define command{
command_name check_local_procs
command_line $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
}
define command{
command_name check_local_users
command_line $USER1$/check_users -w $ARG1$ -c $ARG2$
}
define command{
command_name check_local_swap
command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$
}
define command{
command_name check_local_mrtgtraf
command_line $USER1$/check_mrtgtraf -F $ARG1$ -a $ARG2$ -w $ARG3$ -c $ARG4$ -e $ARG5$
}
define command{
command_name check_ftp
command_line $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_hpjd
command_line $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_snmp
command_line $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_http
command_line $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_ssh
command_line $USER1$/check_ssh $ARG1$ $HOSTADDRESS$
}
define command{
command_name check_dhcp
command_line $USER1$/check_dhcp $ARG1$
}
define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
define command{
command_name check_pop
command_line $USER1$/check_pop -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_imap
command_line $USER1$/check_imap -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_smtp
command_line $USER1$/check_smtp -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_tcp
command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
}
define command{
command_name check_udp
command_line $USER1$/check_udp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
}
define command{
command_name check_nt
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
}
define command{
command_name process-host-perfdata
command_line /usr/bin/printf "%b" "$LASTHOSTCHECK$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTATTEMPT$\t$HOSTSTATETYPE$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$\n" >> /usr/local/nagios/var/host-perfdata.out
}
define command{
command_name process-service-perfdata
command_line /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /usr/local/nagios/var/service-perfdata.out
}
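This file defines the command names and command lines used to monitor services. The command lines reference the $USER1$ variable defined in resource.cfg, and localhost.cfg uses several of these commands in its check_command entries.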
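The link between a check_command entry and a command_line is positional: the text after each "!" in check_command fills $ARG1$, $ARG2$ and so on. The sketch below imitates that substitution so the resulting plugin invocation can be read off. It is only an illustration of the mapping, not how Nagios itself is implemented. The function name expand_check_command is an assumption, and $USER1$ and $HOSTADDRESS$ are deliberately left unexpanded.

```shell
#!/bin/sh
# expand_check_command SPEC TEMPLATE
#   SPEC:     a check_command value, e.g. 'check_ping!100.0,20%!500.0,60%'
#   TEMPLATE: the matching command_line from commands.cfg
# Prints TEMPLATE with $ARGn$ replaced by the n-th "!"-separated argument.
# Hypothetical helper for illustration only.
expand_check_command() {
    printf '%s\n' "$2" | awk -v spec="$1" '
        {
            n = split(spec, part, "!")   # part[1] is the command name
            line = $0
            for (i = 2; i <= n; i++) {
                macro = "$ARG" (i - 1) "$"
                # Replace every literal occurrence of the macro.
                while ((pos = index(line, macro)) > 0)
                    line = substr(line, 1, pos - 1) part[i] substr(line, pos + length(macro))
            }
            print line
        }'
}

# The PING service from localhost.cfg and the check_ping command from commands.cfg.
expand_check_command 'check_ping!100.0,20%!500.0,60%' \
    '$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5'
```

For the PING service this prints $USER1$/check_ping -H $HOSTADDRESS$ -w 100.0,20% -c 500.0,60% -p 5. Replacing $USER1$ with /usr/local/nagios/libexec and $HOSTADDRESS$ with 127.0.0.1 gives a command that can be run by hand to debug a check.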
2.5.7 contacts.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/objects/contacts.cfg | grep -v '^$'
define contact{
contact_name nagiosadmin
use generic-contact
alias Nagios Admin
email nagios@localhost
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members nagiosadmin
}
This file uses the generic-contact template defined in templates.cfg.
2.5.8 timeperiods.cfg
[root@linuxserver ~]# grep -v '^#' /usr/local/nagios/etc/objects/timeperiods.cfg | grep -v '^$'
define timeperiod{
timeperiod_name 24x7
alias 24 Hours A Day, 7 Days A Week
sunday 00:00-24:00
monday 00:00-24:00
tuesday 00:00-24:00
wednesday 00:00-24:00
thursday 00:00-24:00
friday 00:00-24:00
saturday 00:00-24:00
}
define timeperiod{
timeperiod_name workhours
alias Normal Work Hours
monday 09:00-17:00
tuesday 09:00-17:00
wednesday 09:00-17:00
thursday 09:00-17:00
friday 09:00-17:00
}
define timeperiod{
timeperiod_name none
alias No Time Is A Good Time
}
define timeperiod{
name us-holidays
timeperiod_name us-holidays
alias U.S. Holidays
january 1 00:00-00:00
monday -1 may 00:00-00:00
july 4 00:00-00:00
monday 1 september 00:00-00:00
thursday 4 november 00:00-00:00
december 25 00:00-00:00
}
define timeperiod{
timeperiod_name 24x7_sans_holidays
alias 24x7 Sans Holidays
use us-holidays ; Get holiday exceptions from other timeperiod
sunday 00:00-24:00
monday 00:00-24:00
tuesday 00:00-24:00
wednesday 00:00-24:00
thursday 00:00-24:00
friday 00:00-24:00
saturday 00:00-24:00
}
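This file defines the monitoring time periods. This setup relies mainly on the first one, 24x7; the linux-server template shown earlier also uses workhours as its notification_period.
2.6 Installing the Nagios Plugins
With Nagios installed and configured, the web interface shows that none of the services on the monitoring server are being checked. This is because no plugins have been installed under /usr/local/nagios/libexec/ yet. The next step is to install the plugin package, nagios-plugins.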
[root@linuxserver ~]# ll /usr/local/nagios/libexec/
total 0
[root@linuxserver ~]# tar -xvzf nagios-plugins-2.0.3.tar.gz
[root@linuxserver ~]# cd nagios-plugins-2.0.3
[root@linuxserver nagios-plugins-2.0.3]# ./configure --prefix=/usr/local/nagios/
[root@linuxserver nagios-plugins-2.0.3]# make && make install
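Listing /usr/local/nagios/libexec/ again shows that many plugins have been added.
Restart nagios and refresh the page; the status of the services on the monitoring server is now displayed.
If the HTTP service reports "HTTP WARNING: HTTP/1.1 403 Forbidden", the cause is that the HTTP check requests the site root, which is served from index.html under /var/www/html/. When that file does not exist, Apache answers with 403. Creating the file fixes it.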
[root@linuxserver ~]# touch /var/www/html/index.html
[root@linuxserver ~]# /etc/init.d/httpd restart
Stopping httpd: [ OK ]
Starting httpd: [ OK ]
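3 Installing and Configuring Nagios to Monitor linuxclient
3.1 How It Works
The monitoring server monitors clients through an add-on called NRPE.
NRPE consists of two parts:
· the check_nrpe plugin, which resides on the monitoring host
· the NRPE daemon, which runs on the remote Linux host (usually the monitored machine)
The monitoring process works as follows.
When Nagios needs to check a service or resource on a remote Linux host:
1. Nagios runs the check_nrpe plugin and tells it what to check;
2. check_nrpe connects to the remote NRPE daemon over SSL;
3. the NRPE daemon runs the appropriate Nagios plugin to perform the check;
4. the NRPE daemon returns the result to check_nrpe, which passes it on to Nagios for processing.
Note: the NRPE daemon requires the Nagios plugins to be installed on the remote Linux host; without them the daemon cannot perform any checks.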
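The result handed back in step 4 has two parts: a line of text and the plugin's exit code. Under the Nagios plugin convention, exit code 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN, and check_nrpe passes on the code it received from the remote plugin. The sketch below shows how that code maps to a state. The helper names nagios_state_name and run_check are assumptions, and a stand-in command replaces a real plugin so the example runs anywhere.

```shell
#!/bin/sh
# nagios_state_name CODE: translate a plugin exit code into its state name.
nagios_state_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "UNKNOWN (unexpected exit code $1)" ;;
    esac
}

# run_check CMD [ARGS...]: run a plugin and print its output together with
# the state that its exit code stands for. Hypothetical helper.
run_check() {
    output=$("$@") && code=0 || code=$?
    printf '%s -> %s\n' "$output" "$(nagios_state_name "$code")"
}

# Stand-in for a real plugin: prints one status line and exits with 2.
run_check sh -c 'echo "DISK CRITICAL - free space: / 5%"; exit 2'
```

On the monitoring server the same wrapper could be pointed at a real check, for example run_check /usr/local/nagios/libexec/check_nrpe -H 192.168.230.137 -c check_load, once both sides are installed as described in the following sections.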
3.2 Installing Base Packages on the Client
[root@linuxclient ~]# yum install -y gcc glibc glibc-common gd gd-devel xinetd openssl-devel
3.3 Client User Setup
[root@linuxclient ~]# useradd nagios
[root@linuxclient ~]# passwd nagios
Changing password for user nagios.
New password:
BAD PASSWORD: it is too simplistic/systematic
BAD PASSWORD: is too simple
Retype new password:
passwd: all authentication tokens updated successfully.
3.4 Installing the Nagios Plugins on the Client
[root@linuxclient ~]# tar -xvzf nagios-plugins-2.0.3.tar.gz
[root@linuxclient ~]# cd nagios-plugins-2.0.3
[root@linuxclient nagios-plugins-2.0.3]# ./configure --prefix=/usr/local/nagios
[root@linuxclient nagios-plugins-2.0.3]# make && make install
[root@linuxclient nagios-plugins-2.0.3]# chown nagios:nagios /usr/local/nagios/
[root@linuxclient nagios-plugins-2.0.3]# chown -R nagios:nagios /usr/local/nagios/libexec/
3.5 Installing and Configuring NRPE on the Client
[root@linuxclient nagios-plugins-2.0.3]# cd
[root@linuxclient ~]# tar -xvzf nrpe-2.15.tar.gz
[root@linuxclient ~]# cd nrpe-2.15
[root@linuxclient nrpe-2.15]# ./configure
[root@linuxclient nrpe-2.15]# make all
[root@linuxclient nrpe-2.15]# make install-plugin
[root@linuxclient nrpe-2.15]# make install-daemon
[root@linuxclient nrpe-2.15]# make install-daemon-config
[root@linuxclient nrpe-2.15]# make install-xinetd
Edit /etc/xinetd.d/nrpe and add the monitoring server's address to the only_from parameter.
[root@linuxclient nrpe-2.15]# cat /etc/xinetd.d/nrpe
# default: on
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
flags = REUSE
socket_type = stream
port = 5666
wait = no
user = nagios
group = nagios
server = /usr/local/nagios/bin/nrpe
server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
log_on_failure += USERID
disable = no
only_from = 127.0.0.1 192.168.230.136
}
Edit /etc/services and add the NRPE service.
[root@linuxclient nrpe-2.15]# tail -1 /etc/services
nrpe 5666/tcp # nrpe
Restart the xinetd service.
[root@linuxclient nrpe-2.15]# /etc/init.d/xinetd restart
Stopping xinetd: [FAILED]
Starting xinetd: [ OK ]
Check whether nrpe started successfully.
[root@linuxclient nrpe-2.15]# netstat -tunlp | grep 5666
tcp 0 0 :::5666 :::* LISTEN 42522/xinetd
The nrpe service is listening, but on IPv6, and a test reports the following error:
[root@linuxclient nrpe-2.15]# /usr/local/nagios/libexec/check_nrpe -H localhost
CHECK_NRPE: Error - Could not complete SSL handshake.
Add the following two lines to /etc/modprobe.d/dist.conf to disable IPv6, then reboot. After that the test succeeds.
[root@linuxclient nrpe-2.15]# tail -2 /etc/modprobe.d/dist.conf
alias net-pf-10 off
options ipv6 disable=1
[root@linuxclient nrpe-2.15]# init 6
[root@linuxclient ~]# netstat -tunlp | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 1729/xinetd
[root@linuxclient ~]# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.15
3.6 NRPE Configuration File on the Client
The NRPE configuration file is nrpe.cfg. After adjusting it to the actual environment, its contents are:
[root@linuxclient ~]# grep -v '^#' /usr/local/nagios/etc/nrpe.cfg | grep -v '^$'
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=127.0.0.1
dont_blame_nrpe=0
allow_bash_command_substitution=0
debug=0
command_timeout=60
connection_timeout=300
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda3]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda3
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
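A swap check (check_swap) has been added here. The client is now fully configured; the next step is to add the checks for this client on the server.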
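Each command[name] entry in nrpe.cfg is a name the server may later pass to check_nrpe with -c, so it is handy to list the names a client actually offers. The sketch below does this with sed. The helper name list_nrpe_commands is an assumption, and the demonstration uses a scratch file with the same layout as the listing above (the commented-out check_example entry is made up).

```shell
#!/bin/sh
# list_nrpe_commands FILE: print the name of every active command[...] entry.
# Hypothetical helper; commented-out entries are skipped because the pattern
# is anchored at the start of the line.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demonstration on a scratch file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
allowed_hosts=127.0.0.1
#command[check_example]=/usr/local/nagios/libexec/check_dummy 0
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
EOF
list_nrpe_commands "$demo"
rm -f "$demo"
```

On the client the call would be list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg. Because NRPE runs under xinetd in this setup, access control comes from only_from in /etc/xinetd.d/nrpe; the comments in the sample nrpe.cfg state that allowed_hosts is ignored in that mode.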
3.7 Installing NRPE on the Server
[root@linuxserver ~]# tar -xvzf nrpe-2.15.tar.gz
[root@linuxserver ~]# cd nrpe-2.15
[root@linuxserver nrpe-2.15]# ./configure
[root@linuxserver nrpe-2.15]# make all
[root@linuxserver nrpe-2.15]# make install-plugin
Test the communication between check_nrpe on the server and the NRPE daemon on the client.
[root@linuxserver nrpe-2.15]# /usr/local/nagios/libexec/check_nrpe -H 192.168.230.137
NRPE v2.15
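The version information is returned, so communication works.
3.8 Modifying the Server Configuration Files
3.8.1 commands.cfg
Add the check_nrpe command. Its usage can be displayed with check_nrpe -h.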
[root@linuxserver etc]# tail -5 objects/commands.cfg
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
3.8.2 services.cfg
Create a new file, services.cfg, and define in it the services to monitor on linuxclient.
[root@linuxserver etc]# cat objects/services.cfg
define service{
use local-service
host_name linuxclient
service_description check-host-alive
check_command check-host-alive
}
define service{
use local-service
host_name linuxclient
service_description Current Load
check_command check_nrpe!check_load
}
define service{
use local-service
host_name linuxclient
service_description Check Disk sda3
check_command check_nrpe!check_sda3
}
define service{
use local-service
host_name linuxclient
service_description Total Processes
check_command check_nrpe!check_total_procs
}
define service{
use local-service
host_name linuxclient
service_description Current Users
check_command check_nrpe!check_users
}
define service{
use local-service
host_name linuxclient
service_description Check Zombie Procs
check_command check_nrpe!check_zombie_procs
}
define service{
use local-service
host_name linuxclient
service_description Check Swap
check_command check_nrpe!check_swap
}
3.8.3 hosts.cfg
Create a new file, hosts.cfg, that defines the address and related attributes of the monitored client.
[root@linuxserver etc]# cat /usr/local/nagios/etc/objects/hosts.cfg
define host{
use linux-server
host_name linuxclient
alias linuxclient
address 192.168.230.137
}
define hostgroup{
hostgroup_name bsmart-servers
alias bsmart servers
members linuxclient
}
3.8.4 nagios.cfg
Add entries for services.cfg and hosts.cfg to nagios.cfg.
[root@linuxserver etc]# grep 'hosts.cfg' nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
[root@linuxserver etc]# grep 'services.cfg' nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg
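The two cfg_file entries can be added with an editor, or with a small helper that appends a line only when it is missing, so running it again does not create duplicates. This is a minimal sketch; the function name add_cfg_file is an assumption, and the demonstration works on a scratch file instead of the real nagios.cfg.

```shell
#!/bin/sh
# add_cfg_file MAIN_CFG OBJECT_CFG: make sure MAIN_CFG contains the line
# cfg_file=OBJECT_CFG, appending it only if it is not already present.
# Hypothetical helper.
add_cfg_file() {
    main_cfg=$1
    line="cfg_file=$2"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if ! grep -qxF "$line" "$main_cfg"; then
        printf '%s\n' "$line" >> "$main_cfg"
    fi
}

# Demonstration on a scratch file; the second call for hosts.cfg is a no-op.
demo=$(mktemp)
printf 'cfg_file=/usr/local/nagios/etc/objects/localhost.cfg\n' > "$demo"
add_cfg_file "$demo" /usr/local/nagios/etc/objects/hosts.cfg
add_cfg_file "$demo" /usr/local/nagios/etc/objects/services.cfg
add_cfg_file "$demo" /usr/local/nagios/etc/objects/hosts.cfg
cat "$demo"
rm -f "$demo"
```

After changing the real file, /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg checks the whole configuration for errors before the service is restarted.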
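3.9 Configuration File Relationships
The final relationships between the configuration files are as follows: nagios.cfg pulls in every object file through cfg_file; hosts.cfg and services.cfg inherit from the templates in templates.cfg; services.cfg refers to hosts by host_name and to the commands defined in commands.cfg; commands.cfg resolves $USER1$ through resource.cfg; and the templates reference the time periods in timeperiods.cfg and the contact group in contacts.cfg.
3.10 Restarting the Service on the Server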
[root@linuxserver etc]# /etc/init.d/nagios restart
Running configuration check...
Stopping nagios: done.
Starting nagios: done.
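Shortly after the restart, the client's status appears in the web interface.
This article drew on http://www.cnblogs.com/mchina/archive/2013/02/20/2883404.html. Thanks!
From the ITPUB blog, link: http://blog.itpub.net/28536251/viewspace-1460822/. Please credit the source when reposting; otherwise legal liability will be pursued.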