Keepalived Cluster Software: Advanced Usage (How It Works and State Notifications)

Published 2015-11-26

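1. Introduction

Keepalived has two main use cases. One is load balancing, achieved by configuring keepalived together with ipvs (LVS+Keepalived). The other is high availability (two-node hot standby), which uses keepalived's own health checks and resource takeover to provide failover.

The rest of this article is based on a two-node hot standby built from Keepalived and MySQL dual-master replication. It focuses on keepalived's state transition notification feature, which can noticeably strengthen monitoring of a MySQL database. The deployment of Keepalived+MySQL dual-master is not repeated here; readers who need it can refer to an earlier post: http://blog.jobbole.com/94643/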

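2. What keepalived does

keepalived uses VRRP (Virtual Router Redundancy Protocol) to implement server hot standby in software. Typically two Linux servers form a hot-standby group (master-backup). At any moment only one server in the group, the master, provides service. The master also holds a shared virtual IP address (VIP), which exists only on the master and is the address clients connect to. If keepalived detects that the master is down or that its service has failed, the backup server automatically takes over the VIP and becomes master, and the failed master is removed from the group. When the old master recovers, it automatically rejoins the group and, by default, preempts to become master again. This is how failover is achieved.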
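Because the VIP lives only on the current master, a quick way to see which node is master is to look for the VIP in the node's address list. The following is a minimal sketch, not part of the original setup: the function name `has_vip` is hypothetical, and it assumes the one-line output format of `ip -4 -o addr show`, where the fourth field is ADDRESS/PREFIX.

```shell
#!/bin/bash
# has_vip VIP: read `ip -4 -o addr show` style output on stdin and succeed
# if VIP is configured on any interface. The function name is illustrative.
has_vip() {
    local vip="$1"
    # In one-line (-o) output the 4th field is ADDRESS/PREFIX, e.g. 192.168.1.200/32
    awk -v vip="$vip" '{ split($4, a, "/"); if (a[1] == vip) found = 1 } END { exit !found }'
}

# Typical use on a live node (192.168.1.200 is the VIP used in this article):
#   ip -4 -o addr show | has_vip 192.168.1.200 && echo "this node holds the VIP"
```

Reading from stdin keeps the parsing separate from the `ip` call, so the function can be exercised with captured output.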

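3. How it works at Layer 3, Layer 4, and Layer 7

Layer 3: keepalived periodically sends an ICMP packet to the servers in the hot-standby group to determine whether a server has failed. A failed server is removed from the group.

Layer 4: keepalived judges server health by the state of a TCP port, for example MySQL port 3306. A failed server is removed from the group.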

Example:
! Configuration File for keepalived
global_defs {
   notification_email {
     example@163.com
   }
   notification_email_from  example@example.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id MYSQL_HA
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 50
    nopreempt                   # when the master goes down the backup takes over; the recovered master does not take the VIP back
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123
    }
    virtual_ipaddress {
        192.168.1.200          # virtual IP address
    }
}
virtual_server 192.168.1.200 3306 {
    delay_loop 6
#    lb_algo rr
#    lb_kind NAT
    persistence_timeout 50
    protocol TCP
    real_server 192.168.1.201 3306 {       # monitor port 3306 on this machine
        weight 1
        notify_down /etc/keepalived/kill_keepalived.sh   # run this script when port 3306 is detected as down (the VIP only moves once keepalived is stopped)
        TCP_CHECK {         # health check method; choose one to suit the service (HTTP_GET|SSL_GET|TCP_CHECK|SMTP_CHECK|MISC_CHECK)
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
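To reproduce by hand what the TCP_CHECK above decides, a plain TCP connection attempt with a timeout is enough. The sketch below is an illustration only: the function name `tcp_port_open` is an assumption, and it relies on bash's built-in `/dev/tcp` redirection and the coreutils `timeout` command rather than on anything keepalived provides.

```shell
#!/bin/bash
# tcp_port_open HOST PORT [TIMEOUT]: succeed if a TCP connection to HOST:PORT
# can be opened within TIMEOUT seconds (default 3, like connect_timeout above).
# The function name is illustrative.
tcp_port_open() {
    local host="$1" port="$2" wait="${3:-3}"
    # bash treats /dev/tcp/HOST/PORT as a socket; opening it performs the connect
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example against the real_server from the configuration above:
#   tcp_port_open 192.168.1.201 3306 3 || echo "MySQL port 3306 is not reachable"
```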

Layer 7: keepalived judges whether a program on the server is running properly according to a user-defined policy. A failed server is removed from the group.

Example:
! Configuration File for keepalived
global_defs {
   notification_email {
     example@163.com
   }
   notification_email_from  example@example.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id MYSQL_HA
}
vrrp_script check_nginx {
    script /etc/keepalived/check_nginx.sh    # check script
    interval 2   # interval between runs, in seconds
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 50
    nopreempt                   # when the master goes down the backup takes over; the recovered master does not take the VIP back
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123
    }
    virtual_ipaddress {
        192.168.1.200          # virtual IP address
    }
    track_script {          # reference the script from the instance
        check_nginx
    }
}

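The script is as follows:
# cat /etc/keepalived/check_nginx.sh
#!/bin/bash
# Count nginx sockets; zero means nginx is not running
Count1=`netstat -antp |grep -v grep |grep nginx |wc -l`
if [ $Count1 -eq 0 ]; then
    # Try to start nginx once, then check again
    /usr/local/nginx/sbin/nginx
    sleep 2
    Count2=`netstat -antp |grep -v grep |grep nginx |wc -l`
    if [ $Count2 -eq 0 ]; then
        # Restart failed: stop keepalived so the VIP moves to the backup
        service keepalived stop
    else
        exit 0
    fi
else
    exit 0
fi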
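The script above forces the switchover by stopping keepalived. An alternative is to let the script report health only through its exit status and leave the reaction to keepalived. As far as I understand, a tracked vrrp_script that exits non-zero, with no weight configured, puts the instance into the fault state. A minimal sketch of that variant follows; the function names and the use of `pgrep` are assumptions, not part of the original script.

```shell
#!/bin/bash
# Sketch of a check that reports health through its exit status only.
# proc_running and check_service are illustrative names.
proc_running() {
    # pgrep -x matches the exact process name; status 0 means at least one match
    pgrep -x "$1" >/dev/null 2>&1
}

check_service() {
    local proc="${1:-nginx}"
    if proc_running "$proc"; then
        return 0      # healthy: keepalived keeps the current state
    fi
    return 1          # unhealthy: keepalived sees the check as failed
}

# As a standalone check script the last lines would be:
#   check_service nginx
#   exit $?
```

This version does not try to restart nginx, which keeps the check fast and leaves recovery to a separate process supervisor.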

4. Health check methods

4.1 HTTP service checks

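  HTTP_GET {    # or SSL_GET
      url {
          path /index.html        # URL to check; more than one url block may be listed
          digest  24326582a86bee478bac72d5af25089e    # expected digest (checksum) of the page
          # how to obtain the digest: genhash -s IP -p 80 -u http://IP/index.html
          status_code 200         # expected HTTP status code
      }
      connect_port 80 # port to connect to
      connect_timeout 3  # connection timeout, in seconds
      nb_get_retry 3  # number of retries
      delay_before_retry 2 # delay between retries, in seconds
   }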
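If genhash is not at hand, the digest can be approximated with standard tools. The sketch below assumes that the digest is an MD5 hash of the response body, which is my understanding of genhash's default behaviour; verify the result against genhash before relying on it. The function name `page_digest` is hypothetical.

```shell
#!/bin/bash
# page_digest: print the MD5 hex digest of whatever arrives on stdin.
# Assumption: keepalived's digest is an MD5 over the response body.
page_digest() {
    md5sum | awk '{ print $1 }'
}

# Example (IP is a placeholder, as in the genhash comment above):
#   curl -s http://IP/index.html | page_digest
```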

4.2 TCP port checks (suitable for almost any TCP-based service)

  TCP_CHECK {
      connect_port 80     # port to check; defaults to the port given after real_server
      connect_timeout 5
      nb_get_retry 3
      delay_before_retry 3
  }

4.3 SMTP mail server checks

  SMTP_CHECK {            # health check for an SMTP mail server
      host {
          connect_ip
          connect_port
      }
      connect_timeout 5
      retry 2
      delay_before_retry 3
      hello_name "mail.domain.com"
  }

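4.4 Checking real_server status with a custom script

  MISC_CHECK {
      misc_path /script.sh    # path to the external program or script
      misc_timeout 3      # timeout for running the script, in seconds
      !misc_dynamic       # commented out: do not adjust the server weight dynamically; if enabled, the script's exit status adjusts the real_server weight
  }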
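When misc_dynamic is enabled, the exit status of the script carries the new weight. The convention assumed in this sketch, taken from my reading of the keepalived documentation, is: 0 means healthy with the weight unchanged, 1 means failed, and 2 to 255 mean healthy with the weight set to the status minus 2. The helper name `exit_code_for_weight` is hypothetical.

```shell
#!/bin/bash
# exit_code_for_weight WEIGHT: print the exit status a MISC_CHECK script
# should return so that keepalived, with misc_dynamic enabled, applies WEIGHT.
# Assumed convention: 0 = healthy, weight unchanged; 1 = failed;
# 2..255 = healthy, weight becomes (status - 2).
exit_code_for_weight() {
    local weight="$1"
    if [ "$weight" -lt 0 ] || [ "$weight" -gt 253 ]; then
        echo "weight must be between 0 and 253" >&2
        return 1
    fi
    echo $(( weight + 2 ))
}

# In a check script, after deciding on a weight of 3 for a healthy server:
#   exit "$(exit_code_for_weight 3)"
```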

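5. State transition notifications

keepalived's main configuration includes email notification, but by default mail is sent only when a real_server goes down or recovers. Often we would rather know whether, after the keepalived master fails over, the VIP moved smoothly to the backup server and whether the MySQL server is healthy. A separate monitoring script would work, but it is unnecessary: keepalived already provides state notification hooks, so we can use them directly.

The default email notification template in the main configuration is as follows: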
global_defs           # Block id
    {
    notification_email    # To:
        {
        admin@example1.com
        ...
         }
    # From: from address that will be in header
    notification_email_from admin@example.com
    smtp_server 127.0.0.1   # IP
    smtp_connect_timeout 30 # integer, seconds
    router_id my_hostname   # string identifying the machine,
                            # (doesn't have to be hostname).
    enable_traps            # enable SNMP traps
        }

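5.1 Instance state notifications

a) notify_master: runs when the node becomes master

b) notify_backup: runs when the node becomes backup

c) notify_fault: runs when the node enters the fault state

5.2 Virtual server check notifications

a) notify_up: runs when a real_server of the virtual server is detected as up

b) notify_down: runs when a real_server of the virtual server is detected as down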

Example:
    ! Configuration File for keepalived
    global_defs {
       notification_email {
         example@163.com
       }
       notification_email_from example@example.com
       smtp_server 127.0.0.1
       smtp_connect_timeout 30
       router_id MYSQL_HA
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface eth1
        virtual_router_id 50
        nopreempt           # when the master goes down the backup takes over; the recovered master does not take the VIP back
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 123
        }
        virtual_ipaddress {
            192.168.1.200
        }
        notify_master /etc/keepalived/to_master.sh
        notify_backup /etc/keepalived/to_backup.sh
        notify_fault /etc/keepalived/to_fault.sh
    }
    virtual_server 192.168.1.200 3306 {
        delay_loop 6
        persistence_timeout 50
        protocol TCP
        real_server 192.168.1.201 3306 {
            weight 1
            notify_up /etc/keepalived/mysql_up.sh
            notify_down /etc/keepalived/mysql_down.sh
            TCP_CHECK {
                connect_timeout 3
                nb_get_retry 3
                delay_before_retry 3
            }
        }
    }

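Each notify parameter can be followed by either a bash command or a shell script, with content defined to suit your needs. The state scripts referenced in the example above are as follows:

1) Script run when the server becomes master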

# cat to_master.sh 
#!/bin/bash
Date=$(date +%F" "%T)
IP=$(ifconfig eth0 |grep "inet addr" |cut -d":" -f2 |awk '{print $1}')
Mail="baojingtongzhi@163.com"
echo "$Date $IP change to master." |mail -s "Master-Backup Change Status" $Mail

2) Script run when the server becomes backup

# cat to_backup.sh
#!/bin/bash
Date=$(date +%F" "%T)
IP=$(ifconfig eth0 |grep "inet addr" |cut -d":" -f2 |awk '{print $1}')
Mail="baojingtongzhi@163.com"
echo "$Date $IP change to backup." |mail -s "Master-Backup Change Status" $Mail

3) Script run when the server enters the fault state

# cat to_fault.sh
#!/bin/bash
Date=$(date +%F" "%T)
IP=$(ifconfig eth0 |grep "inet addr" |cut -d":" -f2 |awk '{print $1}')
Mail="baojingtongzhi@163.com"
echo "$Date $IP change to fault." |mail -s "Master-Backup Change Status" $Mail

4) Script run when TCP port 3306 is detected as unavailable; it kills keepalived to trigger the switchover

# cat mysql_down.sh
#!/bin/bash
Date=$(date +%F" "%T)
IP=$(ifconfig eth0 |grep "inet addr" |cut -d":" -f2 |awk '{print $1}')
Mail="baojingtongzhi@163.com"
pkill keepalived
echo "$Date $IP The mysql service failure,kill keepalived." |mail -s "Master-Backup MySQL Monitor" $Mail

5) Script run when TCP port 3306 is detected as available

# cat mysql_up.sh
#!/bin/bash
Date=$(date +%F" "%T)
IP=$(ifconfig eth0 |grep "inet addr" |cut -d":" -f2 |awk '{print $1}')
Mail="baojingtongzhi@163.com"
echo "$Date $IP The mysql service is recovery." |mail -s "Master-Backup MySQL Monitor" $Mail
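The to_master.sh, to_backup.sh and to_fault.sh scripts differ only in the state name, so the message can be built by one shared function. The sketch below is an illustration under assumptions: the function name `build_message` is hypothetical, and the wiring in the comments relies on keepalived's generic `notify` directive, which, as I understand it, calls the script with the arguments TYPE, NAME and STATE.

```shell
#!/bin/bash
# build_message DATE IP STATE: format the notification body used by the
# to_master.sh / to_backup.sh / to_fault.sh scripts. The name is illustrative.
build_message() {
    local date="$1" ip="$2" state
    # The state is expected in upper case (MASTER, BACKUP, FAULT); lower-case it
    state=$(printf '%s' "$3" | tr '[:upper:]' '[:lower:]')
    echo "$date $ip change to $state."
}

# Possible wiring in a single notify script, where $3 is the new state and
# IP and Mail are set as in the scripts above:
#   build_message "$(date '+%F %T')" "$IP" "$3" | mail -s "Master-Backup Change Status" "$Mail"
```

Keeping the formatting in one place means a change to the message text or the recipient only has to be made once.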

