1. Introduction
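Keepalived has two main use cases. One is load balancing, where keepalived is configured together with ipvs (LVS+Keepalived). The other is high availability (two-node hot standby), where its own health checks and resource takeover provide failover.
This post is based on a Keepalived+MySQL dual-master hot-standby setup and focuses on keepalived's state-transition notification feature, which can noticeably strengthen monitoring of a MySQL database. Deploying Keepalived+MySQL dual-master is not covered again here; if you need it, see the earlier post: http://blog.jobbole.com/94643/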
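2. What keepalived does
keepalived uses VRRP (Virtual Router Redundancy Protocol) to implement server hot standby in software. Typically two Linux servers form a hot-standby group (master-backup). At any moment only one server in the group, the master, provides service. The master also brings up a shared virtual IP address (VIP), which exists only on the master and is the address clients connect to. If keepalived detects that the master is down or its service has failed, the backup server automatically takes over the VIP and becomes master, and the failed master is removed from the group. When the old master recovers, it rejoins the group automatically and, by default, preempts to become master again. This is how failover is achieved.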
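Since the VIP lives only on the current master, a quick way to see which node is master is to look for the VIP among the interface addresses. The following is a minimal sketch, not part of keepalived itself: the function name `holds_vip` is hypothetical, and it assumes the one-line-per-address output of `ip -o -4 addr show` from iproute2.

```shell
#!/bin/bash
# holds_vip VIP
# Reads `ip -o -4 addr show` output on stdin and exits 0 if VIP is
# assigned to any interface, 1 otherwise. (Hypothetical helper.)
holds_vip() {
    local vip="$1"
    # A typical input line looks like:
    #   3: eth1    inet 192.168.1.200/32 scope global eth1
    # so field 4 is ADDRESS/PREFIX.
    awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Usage on a live node (not executed here):
#   ip -o -4 addr show | holds_vip 192.168.1.200 && echo "this node holds the VIP"
```

Reading from stdin keeps the function easy to try out with captured output before pointing it at a real host.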
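3. How it works at layers 3, 4, and 7
Layer3: At layer 3, keepalived periodically sends an ICMP packet to the servers in the hot-standby group to determine whether a server has failed. A failed server is removed from the group.
Layer4: At layer 4, keepalived judges server health by the state of a TCP port, for example MySQL's port 3306. A failed server is removed from the group.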
Example:

    ! Configuration File for keepalived
    global_defs {
        notification_email {
            example@163.com
        }
        notification_email_from example@example.com
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id MYSQL_HA
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth1
        virtual_router_id 50
        nopreempt                   # when the master goes down the backup takes over; a recovered master does not take the VIP back
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 123
        }
        virtual_ipaddress {
            192.168.1.200           # virtual IP address
        }
    }
    virtual_server 192.168.1.200 3306 {
        delay_loop 6
        # lb_algo rr
        # lb_kind NAT
        persistence_timeout 50
        protocol TCP
        real_server 192.168.1.201 3306 {    # monitor port 3306 on this host
            weight 1
            notify_down /etc/keepalived/kill_keepalived.sh    # run this script when port 3306 is detected down (the VIP only moves once keepalived stops)
            TCP_CHECK {             # health check method; adjust to your service (HTTP_GET|SSL_GET|TCP_CHECK|SMTP_CHECK|MISC_CHECK)
                connect_timeout 3
                nb_get_retry 3
                delay_before_retry 3
            }
        }
    }
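To get a feel for what a layer-4 check decides, the same question ("can a TCP connection be opened to this port?") can be asked by hand. This is only a rough sketch under stated assumptions: `tcp_check` is a hypothetical helper, it relies on bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command, and it is not how keepalived implements `TCP_CHECK` internally.

```shell
#!/bin/bash
# tcp_check HOST PORT [TIMEOUT_SECONDS]
# Exits 0 if a TCP connection to HOST:PORT can be opened within the
# timeout (default 3 seconds), non-zero otherwise. (Hypothetical helper.)
tcp_check() {
    local host="$1" port="$2" wait="${3:-3}"
    # Opening /dev/tcp/HOST/PORT makes bash attempt a TCP connect;
    # `timeout` bounds the attempt, much like connect_timeout does.
    timeout "$wait" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Usage on a database node (not executed here):
#   tcp_check 192.168.1.201 3306 3 || echo "MySQL port is not reachable"
```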
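Layer7: At layer 7, keepalived decides whether the program on a server is running normally according to a user-defined policy. A failed server is removed from the group.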
Example:

    ! Configuration File for keepalived
    global_defs {
        notification_email {
            example@163.com
        }
        notification_email_from example@example.com
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id MYSQL_HA
    }
    vrrp_script check_nginx {
        script /etc/keepalived/check_nginx.sh    # check script
        interval 2                               # seconds between runs
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth1
        virtual_router_id 50
        nopreempt                   # when the master goes down the backup takes over; a recovered master does not take the VIP back
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 123
        }
        virtual_ipaddress {
            192.168.1.200           # virtual IP address
        }
        track_script {              # reference the script from the instance
            check_nginx
        }
    }
The script is as follows:

    # cat /etc/keepalived/check_nginx.sh
    #!/bin/bash
    # Count nginx sockets; if there are none, try to start nginx once.
    Count1=$(netstat -antp | grep -v grep | grep nginx | wc -l)
    if [ "$Count1" -eq 0 ]; then
        /usr/local/nginx/sbin/nginx
        sleep 2
        Count2=$(netstat -antp | grep -v grep | grep nginx | wc -l)
        if [ "$Count2" -eq 0 ]; then
            # nginx still not running: stop keepalived so the VIP moves to the peer
            service keepalived stop
        else
            exit 0
        fi
    else
        exit 0
    fi
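The script above stops keepalived itself when nginx cannot be restarted. An alternative is to let the script only report health through its exit status, since keepalived treats a zero exit status from a `vrrp_script` as success and a non-zero status as failure. The sketch below illustrates that style; `listener_count` and `check_service` are hypothetical helper names, and the parsing assumes the usual column layout of `netstat -antp` (state in column 6, PID/program in column 7).

```shell
#!/bin/bash
# listener_count NAME
# Reads `netstat -antp` output on stdin and prints how many sockets in
# LISTEN state belong to a program whose name contains NAME.
listener_count() {
    awk -v name="$1" '$6 == "LISTEN" && index($7, name) { n++ } END { print n + 0 }'
}

# check_service NAME
# Exits 0 if at least one listening socket belongs to NAME, 1 otherwise.
check_service() {
    [ "$(listener_count "$1")" -gt 0 ]
}

# Usage as the body of a check script (not executed here):
#   netstat -antp 2>/dev/null | check_service nginx
#   exit $?
```

Counting only sockets in LISTEN state avoids mistaking leftover client connections for a running server.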
4. Health check methods
4.1 HTTP service check

    HTTP_GET|SSL_GET {
        url {
            path /index.html                           # URL to check; several url blocks may be listed
            digest 24326582a86bee478bac72d5af25089e    # expected digest of the page
            # how to obtain the digest: genhash -s IP -p 80 -u http://IP/index.html
            status_code 200                            # expected HTTP status code
        }
        connect_port 80             # port to connect to
        connect_timeout 3           # connection timeout
        nb_get_retry 3              # number of retries
        delay_before_retry 2        # delay between retries
    }
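The idea behind this check (fetch a page, then compare its status code and a digest of its content) can be imitated by hand when debugging. The sketch below is an approximation, not keepalived's implementation: `body_digest` and `http_check` are hypothetical names, and it assumes the digest is an MD5 over the response body, computed here with `md5sum`, and that `curl` is installed.

```shell
#!/bin/bash
# body_digest
# Prints the MD5 hex digest of whatever arrives on stdin.
body_digest() {
    md5sum | awk '{ print $1 }'
}

# http_check URL EXPECTED_DIGEST
# Exits 0 only if URL answers with status 200 and the body digest matches.
http_check() {
    local url="$1" expected="$2" body status digest
    body=$(mktemp) || return 1
    # -w '%{http_code}' prints the status code; the body goes to the temp file.
    status=$(curl -s -o "$body" -w '%{http_code}' --connect-timeout 3 "$url")
    digest=$(body_digest < "$body")
    rm -f "$body"
    [ "$status" = "200" ] && [ "$digest" = "$expected" ]
}

# Usage against a web server (not executed here):
#   http_check http://192.168.1.201/index.html 24326582a86bee478bac72d5af25089e
```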
4.2 TCP port check (works for essentially any service that listens on a TCP port)

    TCP_CHECK {
        connect_port 80             # port to check; defaults to the port given after real_server
        connect_timeout 5
        nb_get_retry 3
        delay_before_retry 3
    }
4.3 SMTP mail server check

    SMTP_CHECK {                    # health check for an SMTP mail server
        host {
            connect_ip
            connect_port
        }
        connect_timeout 5
        retry 2
        delay_before_retry 3
        helo_name "mail.domain.com"
    }
4.4 Custom script check of real_server status

    MISC_CHECK {
        misc_path /script.sh        # external program or script to run
        misc_timeout 3              # script execution timeout
        !misc_dynamic               # disabled here: weight is not adjusted dynamically; if enabled, the script's exit status adjusts the real_server weight
    }
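When `misc_dynamic` is enabled, the keepalived documentation describes the exit status of the script as follows: 0 means the check passed and the weight is unchanged, 1 means the check failed, and 2 to 255 mean the check passed and the weight becomes the exit status minus 2. The sketch below shows how a script might choose its exit status under that convention; `exit_code_for` is a hypothetical helper, and how the desired weight is derived (load, replication lag, and so on) is left to you.

```shell
#!/bin/bash
# exit_code_for HEALTHY [WEIGHT]
# HEALTHY is 1 when the service check passed, anything else when it failed.
# WEIGHT is the desired real_server weight (optional).
# Prints the exit status a misc_dynamic script should return.
exit_code_for() {
    local healthy="$1" weight="${2:-}"
    if [ "$healthy" != "1" ]; then
        echo 1                      # 1: check failed
    elif [ -z "$weight" ]; then
        echo 0                      # 0: healthy, leave the weight unchanged
    else
        # 2..255: healthy, weight becomes (status - 2), so cap at 253
        if [ "$weight" -gt 253 ]; then
            weight=253
        fi
        echo $(( weight + 2 ))
    fi
}

# Usage at the end of a check script (not executed here):
#   exit "$(exit_code_for 1 3)"    # healthy, ask for weight 3
```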
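5. State-transition notifications
The e-mail notification in keepalived's main configuration, by default, only sends mail when a real_server goes down or recovers. Often we would rather know whether, after the keepalived master fails over, the VIP moved cleanly to the backup server and whether the MySQL server is healthy. You could write a monitoring script for that, but there is no need: keepalived already provides state-change hooks, so we can use them directly.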
The default e-mail notification template in the main configuration is:

    global_defs                     # Block id
    {
        notification_email          # To:
        {
            admin@example1.com
            ...
        }
        # From: from address that will be in header
        notification_email_from admin@example.com
        smtp_server 127.0.0.1       # IP
        smtp_connect_timeout 30     # integer, seconds
        router_id my_hostname       # string identifying the machine,
                                    # (doesn't have to be hostname).
        enable_traps                # enable SNMP traps
    }
5.1 Instance state notifications
a) notify_master: runs when the node becomes master
b) notify_backup: runs when the node becomes backup
c) notify_fault: runs when the node enters the fault state
5.2 Virtual server check notifications
a) notify_up: runs when the health check finds the server up
b) notify_down: runs when the health check finds the server down
Example:

    ! Configuration File for keepalived
    global_defs {
        notification_email {
            example@163.com
        }
        notification_email_from example@example.com
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id MYSQL_HA
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth1
        virtual_router_id 50
        nopreempt                   # when the master goes down the backup takes over; a recovered master does not take the VIP back
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 123
        }
        virtual_ipaddress {
            192.168.1.200
        }
        notify_master /etc/keepalived/to_master.sh
        notify_backup /etc/keepalived/to_backup.sh
        notify_fault /etc/keepalived/to_fault.sh
    }
    virtual_server 192.168.1.200 3306 {
        delay_loop 6
        persistence_timeout 50
        protocol TCP
        real_server 192.168.1.201 3306 {
            weight 1
            notify_up /etc/keepalived/mysql_up.sh
            notify_down /etc/keepalived/mysql_down.sh
            TCP_CHECK {
                connect_timeout 3
                nb_get_retry 3
                delay_before_retry 3
            }
        }
    }
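Each notify parameter can be followed by either a bash command or a shell script; write the content to suit your needs. The state scripts used in the example above are as follows:
1) Runs when the server becomes master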

    # cat to_master.sh
    #!/bin/bash
    # Mail a timestamped notice that this node is now master.
    Date=$(date +%F" "%T)
    IP=$(ifconfig eth0 | grep "inet addr" | cut -d":" -f2 | awk '{print $1}')
    Mail="baojingtongzhi@163.com"
    echo "$Date $IP change to master." | mail -s "Master-Backup Change Status" "$Mail"
2) Runs when the server becomes backup

    # cat to_backup.sh
    #!/bin/bash
    # Mail a timestamped notice that this node is now backup.
    Date=$(date +%F" "%T)
    IP=$(ifconfig eth0 | grep "inet addr" | cut -d":" -f2 | awk '{print $1}')
    Mail="baojingtongzhi@163.com"
    echo "$Date $IP change to backup." | mail -s "Master-Backup Change Status" "$Mail"
3) Runs when the server enters the fault state

    # cat to_fault.sh
    #!/bin/bash
    # Mail a timestamped notice that this node is in the fault state.
    Date=$(date +%F" "%T)
    IP=$(ifconfig eth0 | grep "inet addr" | cut -d":" -f2 | awk '{print $1}')
    Mail="baojingtongzhi@163.com"
    echo "$Date $IP change to fault." | mail -s "Master-Backup Change Status" "$Mail"
4) Runs when TCP port 3306 is detected as unavailable; it kills keepalived to force the switchover

    # cat mysql_down.sh
    #!/bin/bash
    # Kill keepalived so the VIP moves to the peer, then mail a notice.
    Date=$(date +%F" "%T)
    IP=$(ifconfig eth0 | grep "inet addr" | cut -d":" -f2 | awk '{print $1}')
    Mail="baojingtongzhi@163.com"
    pkill keepalived
    echo "$Date $IP The mysql service failure,kill keepalived." | mail -s "Master-Backup MySQL Monitor" "$Mail"
5) Runs when TCP port 3306 is detected as available

    # cat mysql_up.sh
    #!/bin/bash
    # Mail a timestamped notice that the MySQL port is reachable again.
    Date=$(date +%F" "%T)
    IP=$(ifconfig eth0 | grep "inet addr" | cut -d":" -f2 | awk '{print $1}')
    Mail="baojingtongzhi@163.com"
    echo "$Date $IP The mysql service has recovered." | mail -s "Master-Backup MySQL Monitor" "$Mail"
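The three instance-state scripts differ only in the state name, so they could be folded into one. keepalived also offers a generic `notify` directive that, per its documentation, calls the script with the arguments type (GROUP or INSTANCE), name, new state (MASTER, BACKUP, or FAULT) and priority. The sketch below builds the message text from those arguments; the function name `build_message` and the file name `notify.sh` are hypothetical, and the mail address is the one used in the scripts above.

```shell
#!/bin/bash
# build_message TYPE NAME STATE [PRIORITY]
# Prints a one-line status message for a VRRP state change, or fails
# with exit status 1 if STATE is not one of MASTER, BACKUP, FAULT.
build_message() {
    local type="$1" name="$2" state="$3" prio="${4:-unknown}"
    case "$state" in
        MASTER|BACKUP|FAULT)
            printf '%s %s changed to %s (priority %s)\n' "$type" "$name" "$state" "$prio"
            ;;
        *)
            printf 'unknown state: %s\n' "$state" >&2
            return 1
            ;;
    esac
}

# Usage inside /etc/keepalived/notify.sh (not executed here), with
# `notify /etc/keepalived/notify.sh` in the vrrp_instance block:
#   build_message "$@" | mail -s "Master-Backup Change Status" "baojingtongzhi@163.com"
```

Rejecting unknown states keeps a typo or an unexpected argument order from silently producing a misleading alert.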