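MySQL Cluster Setup (6): Dual Master + keepalived High Availability

Posted by 程淇銘 on 2019-01-28

Dual master + keepalived is a fairly simple MySQL high-availability architecture, suitable for small and medium-sized MySQL clusters. This article explains how to use keepalived to make MySQL highly available.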

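1 Overview

1.1 Introduction to keepalived

Put simply, keepalived provides machine-level high availability by managing a VIP (virtual IP). With keepalived in place, only one server serves traffic at a time (the one holding the VIP); when the Master host goes down, the VIP automatically floats to the other server.

keepalived works in a Master/Slave model (MASTER/BACKUP in keepalived's own terms): the VIP defined in the configuration file is bound on the Master, and when the Master goes down, the VIP automatically floats to another keepalived server.

keepalived can provide high availability for many kinds of software. It continuously checks server state; if a server goes down or malfunctions, keepalived detects it, removes the faulty server from the pool, and lets another server take over its work. Once the server is healthy again, keepalived automatically adds it back to the pool.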

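1.2 keepalived with dual master

With its default configuration keepalived only provides host-level high availability. For MySQL high availability, at least the following capabilities must be added

  • Detect the state of the MySQL service
  • Keep read_only=0 on the master node and read_only=1 on the standby node
  • On switchover, have the standby node wait until it has finished applying the master's changes

So keepalived needs custom scripts as an extension to provide MySQL high availability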
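
The second requirement above is essentially a mapping from VRRP state to a read_only value. A minimal sketch of that mapping follows; the desired_read_only helper is hypothetical and is not one of the scripts used later in this article. It assumes the script is wired to keepalived's generic notify hook, which passes the type, the instance name and the new state as arguments, and it only prints the SQL statement instead of executing it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the article's scripts): map a VRRP state
# to the read_only value the local MySQL instance should have.
desired_read_only() {
    case "$1" in
        MASTER) echo 0 ;;   # the node holding the VIP accepts writes
        *)      echo 1 ;;   # BACKUP, FAULT or anything unknown stays read-only
    esac
}

# keepalived's generic notify hook passes: $1 type, $2 name, $3 new state.
# Only act when a state was actually supplied.
if [ -n "${3:-}" ]; then
    echo "set global read_only = $(desired_read_only "$3");"
fi
```

Defaulting every state other than MASTER to read-only is a deliberately conservative choice: an unexpected state can never open a node for writes.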

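2 Environment preparation

2.1 Database environment

A master-master database pair was set up beforehand; for the procedure, see MySQL Cluster Setup (2): Master-Master-Slave Mode

Node information

IP          OS          Port  MySQL version  Node     Read/write                           Description
10.0.0.247  CentOS 6.5  3306  5.7.9          Master   Read-write                           Primary node
10.0.0.248  CentOS 6.5  3306  5.7.9          Standby  Read-only, switchable to read-write  Standby master node

VIP information

Name    VIP         Type
RW-VIP  10.0.0.237  Read-write VIP

Master reference configuration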

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2473306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 1
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Slave reference configuration

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2483306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 2
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
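
Both reference configurations set auto_increment_increment = 2 but use different auto_increment_offset values (1 on the Master, 2 on the Slave), so the two nodes never generate the same auto-increment value even if both accept writes for a moment. As a rough illustration of that arithmetic, the hypothetical auto_inc_values helper below mimics the formula offset + n × increment; it does not talk to MySQL.

```shell
#!/bin/sh
# Print the first COUNT auto-increment values for a given offset/increment:
# value_n = offset + n * increment, for n = 0, 1, 2, ...
auto_inc_values() {
    offset=$1; increment=$2; count=$3
    n=0; out=""
    while [ "$n" -lt "$count" ]; do
        out="$out $((offset + n * increment))"
        n=$((n + 1))
    done
    echo "${out# }"
}

auto_inc_values 1 2 5   # Master:  1 3 5 7 9
auto_inc_values 2 2 5   # Slave:   2 4 6 8 10
```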

2.2 Create a monitoring account

-- This is a test environment, so the account and password are deliberately simple
create user 'monitor'@'localhost' identified by 'monitor';
grant all on *.* to 'monitor'@'localhost';
flush privileges;

2.3 Install keepalived

Deploy keepalived on both the Master and the Slave

1). Install with yum

If a suitable yum repository is available, install it directly

yum install -y keepalived

2). Install from source

Download the source package from the keepalived website; version 1.2.24 is used as the example here

# Install dependencies
yum install -y gcc popt-devel openssl openssl-devel libnl-devel libnfnetlink-devel

# Download the package
wget http://www.keepalived.org/software/keepalived-1.2.24.tar.gz

# Extract and install
tar -xvz -f  keepalived-1.2.24.tar.gz
cd keepalived-1.2.24
./configure --prefix=/usr/local/keepalived
make && make install

cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
mkdir /etc/keepalived/
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/

3 Configure high availability

3.1 keepalived configuration

Open /etc/keepalived/keepalived.conf and add the following configuration, adjusted to your environment

global_defs {
   router_id MYSQL_MM  # identifier
   vrrp_skip_check_adv_addr
   vrrp_strict        # enforce strict compliance with the VRRP protocol
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_script check_mysql {
    script "/bin/sh /etc/keepalived/keepalived_mysql_check.sh"  # check script
    interval 10  # check interval in seconds
}

vrrp_instance MYSQL_MM {
    state BACKUP            # set BACKUP on both nodes to avoid preemption after startup
    interface eth0          # network interface name; adjust to your environment
    virtual_router_id 243   # ID that distinguishes this VRRP group, range 1-255
    priority 100
    advert_int 1
    nopreempt               # non-preemptive mode
    authentication {
        auth_type PASS
        auth_pass 1111
    }

    # On the Master node the next line can be commented out so the script does not run when keepalived starts
    notify_master "/bin/sh /etc/keepalived/keepalived_mysql_start.sh"  # run on transition to MASTER

    virtual_ipaddress {
        10.0.0.237
    }

    # On the Slave node the check script below can be commented out; the Slave does not need to be checked continuously
    track_script {
        check_mysql
    }
}

3.2 Configure the check script

Open /etc/keepalived/keepalived_mysql_check.sh and write the check script

#!/bin/bash
# @Author: chengqm
# MySQL health-check script (uses bash syntax such as `function` and `for ((...))`)
MyPath=$(cd "$(dirname "$0")"; pwd)
cd "$MyPath"

# Timestamp without spaces, so it is safe to use in a file name
ThisTime=$(date '+%F_%H%M%S')
log_file='/var/log/keepalived_mysql.log'

# MySQL connection settings; adjust to your environment
export MYSQL_PWD='monitor'
MYSQL_USER='monitor'
MYSQL_SOCKET="/data/mysql_db/test_db/mysql.sock"
mysql_connect="mysql -u${MYSQL_USER} -S${MYSQL_SOCKET} "

# Formatted log output
function techo() {
    message=$1
    message_level=$2
    if [ -z "$message_level" ];then
        message_level='info'
    fi
    echo "$(date '+%F %T') - [${message_level}] $message" >> "$log_file"
}

# Check function; returns 0 when MySQL is healthy
function check {
    ret=$($mysql_connect -N -e 'select 1 as value')
    if [ $? -ne 0 ] || [ "$ret" != '1' ];then
        return 1
    else
        return 0
    fi
}

# Set read_only; returns the exit status of the mysql client
function read_only {
    param=$1
    $mysql_connect -e "set global read_only = ${param}"
    rc=$?
    techo "Setting read-only: read_only ${param}"
    return $rc
}

# Failover
function failover {
    techo "Starting failover"
    # 1. Stop keepalived
    killall keepalived

    # 2. If MySQL can still execute statements, set it to read_only
    read_only 1

    if [ $? -eq 0 ];then
        # 3. If it can still execute statements, kill all connections
        $mysql_connect -e "select concat('KILL ',id,';') from information_schema.processlist where user!='root' AND db is not null into outfile '/tmp/kill.txt.${ThisTime}';"
        if [ $? -eq 0 ];then
            $mysql_connect -e "source /tmp/kill.txt.${ThisTime};"
        fi
    fi

    # 4. Other actions, such as an automatic shutdown

    techo "Failover completed; access to this database is closed"
}

# On failure, retry up to 4 times
for ((i=1; i<=4; i++))
do
    check
    if [ $? -eq 0 ];then
        techo "MySQL is ok"
        # Exit the script normally
        exit 0
    else
        techo "Connection failed $i time(s)"
        sleep 1
    fi
done

techo "Cannot connect to the local database"

# Failover
failover

Note: the script has not been rigorously tested; adjust it to your environment before use
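
The retry loop at the end of the check script follows a common pattern: try a command a fixed number of times with a pause in between, and only escalate when every attempt fails. A minimal, generic sketch of that pattern is shown below; the retry helper is hypothetical and is not part of the original script.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds or
# ATTEMPTS runs out; returns 0 on the first success, 1 if every attempt failed.
retry() {
    attempts=$1; delay=$2; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# In the check script this could replace the for loop, for example:
#   retry 4 1 check || failover
```

Unlike the loop in the script, this sketch skips the pause after the last failed attempt, so failover would start about one second earlier.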

3.3 Configure the script that runs on promotion to Master

Open /etc/keepalived/keepalived_mysql_start.sh and write the script

#!/bin/bash
# @Author: chengqm
# Runs when keepalived transitions to MASTER
MyPath=$(cd "$(dirname "$0")"; pwd)
cd "$MyPath"

ThisTime=$(date '+%F %T')
log_file='/var/log/keepalived_mysql.log'

# MySQL connection settings; adjust to your environment
export MYSQL_PWD='monitor'
MYSQL_USER='monitor'
MYSQL_SOCKET="/data/mysql_db/test_db/mysql.sock"
mysql_connect="mysql -u${MYSQL_USER} -S${MYSQL_SOCKET} "

# Formatted log output
function techo() {
    message=$1
    message_level=$2
    if [ -z "$message_level" ];then
        message_level='info'
    fi
    echo "$(date '+%F %T') - [${message_level}] $message" >> "$log_file"
}

# Check function; returns 0 when MySQL is healthy
function check {
    ret=$($mysql_connect -N -e 'select 1 as value')
    if [ $? -ne 0 ] || [ "$ret" != '1' ];then
        return 1
    else
        return 0
    fi
}

# Collect fields from `show slave status`
function slave_info() {
    tmp_file=/tmp/slave_info.tmp
    $mysql_connect -e 'show slave status\G' > "$tmp_file"
    slave_sql=$(grep 'Slave_SQL_Running:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z"  | awk -F":" '{print $2}')
    seconds_behind_master=$(grep 'Seconds_Behind_Master:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z"  | awk -F":" '{print $2}')

    master_log_file=$(grep 'Master_Log_File:' $tmp_file | head -1 | sed 's/\s*//g' | tr "A-Z" "a-z"  | awk -F":" '{print $2}')
    master_log_pos=$(grep 'Read_Master_Log_Pos:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z"  | awk -F":" '{print $2}')

    relay_master_log_file=$(grep 'Relay_Master_Log_File:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z"  | awk -F":" '{print $2}')
    exec_master_log_pos=$(grep 'Exec_Master_Log_Pos:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z"  | awk -F":" '{print $2}')

}

# Set read_only on or off
function read_only {
    param=$1
    $mysql_connect -e "set global read_only = ${param}"
    techo "Setting read-only: read_only ${param}"
}

# Handle data synchronization
function sync_master_log() {
    # If data consistency has priority, wait for replication to catch up. If availability has priority, comment out the code below
    slave_info
    if [ "$slave_sql" = "yes" ];then
        techo "Current replication position: Master ${master_log_file} ${master_log_pos}"
        techo "Waiting to sync to Master ${master_log_file} ${master_log_pos}"
        $mysql_connect -e "select master_pos_wait('$master_log_file', $master_log_pos);" > /dev/null
        techo "Sync complete"
    fi
}

techo "Promoting this database to master"

check
if [ $? -ne 0 ];then
    techo "Cannot connect to the local database"
    exit 1
fi

# Wait for replication to catch up
sync_master_log

# Make the database writable
read_only 0

Note: the script has not been rigorously tested; adjust it to your environment before use
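
The slave_info function above extracts each field with a grep/sed/tr/awk pipeline. As an alternative sketch, a single awk pass that matches the field name exactly can do the same job, so that Master_Log_File is never confused with Relay_Master_Log_File. The slave_field helper is hypothetical, and the sample text is an abridged, illustrative excerpt that reuses the positions from this article's test run rather than real captured output.

```shell
#!/bin/sh
# slave_field NAME: read `show slave status\G` text on stdin and print the
# value of the field whose name matches NAME exactly.
slave_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # strip the left padding
            pos = index(line, ":")              # first colon ends the field name
            if (pos && substr(line, 1, pos - 1) == key) {
                value = substr(line, pos + 1)
                sub(/^[ \t]+/, "", value)       # strip the space after the colon
                print value
                exit
            }
        }'
}

# Abridged, illustrative sample of `show slave status\G` output.
sample='        Relay_Master_Log_File: mysql-bin.000015
              Master_Log_File: mysql-bin.000015
          Read_Master_Log_Pos: 32338
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 32338'

printf '%s\n' "$sample" | slave_field Master_Log_File       # prints mysql-bin.000015
printf '%s\n' "$sample" | slave_field Exec_Master_Log_Pos   # prints 32338
```

On a live node the input would come from the mysql client instead of the sample variable.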

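3.4 Start keepalived

Because both nodes are configured in BACKUP state, whichever keepalived starts first becomes MASTER. Run the following command on the master node first, then on the standby node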

/etc/init.d/keepalived start

Check the /var/log/messages log and confirm that keepalived reports no errors

Check the IP addresses on the Master and confirm that the VIP is bound

[root@cluster01 shell]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.237/32 scope global eth0
    inet6 fe80::f816:3eff:fede:8033/64 scope link 
       valid_lft forever preferred_lft forever
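
The manual check above can also be scripted. The sketch below is a hypothetical helper, not part of the article's scripts: it reads `ip addr` output on standard input and succeeds only when the given VIP is bound. It compares the whole address, so 10.0.0.23 does not match 10.0.0.237.

```shell
#!/bin/sh
# has_vip VIP: read `ip addr` output on stdin; exit status 0 if VIP is bound.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" {
            split($2, parts, "/")        # "10.0.0.237/32" -> "10.0.0.237"
            if (parts[1] == vip) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Typical use on a live host (interface name and VIP taken from this article):
#   ip addr show eth0 | has_vip 10.0.0.237 && echo "this node holds the VIP"
```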

Check the output of the MySQL check script and confirm it is running normally

[root@cluster01 ~]# tail -f /var/log/keepalived_mysql.log 
...
2019-01-28 15:04:18 - [info] MySQL is ok
2019-01-28 15:04:28 - [info] MySQL is ok

4 Failover test

Create a test table nowdate in the mytest database with only two columns, id and ctime, then insert one row per second

[root@cluster03 ~]# while true; do date;mysql -h10.0.0.237 -P3306 -umytest -e 'use mytest;insert into nowdate values (null, now());'; sleep 1;done
Mon Jan 28 15:04:26 CST 2019
Mon Jan 28 15:04:27 CST 2019
...

Kill the MySQL process on the Master

killall mysqld

Check the log on the old Master

2019-01-28 15:04:48 - [info] MySQL is ok
2019-01-28 15:04:58 - [info] Connection failed 1 time(s)
2019-01-28 15:04:59 - [info] Connection failed 2 time(s)
2019-01-28 15:05:00 - [info] Connection failed 3 time(s)
2019-01-28 15:05:01 - [info] Connection failed 4 time(s)
2019-01-28 15:05:02 - [info] Cannot connect to the local database
2019-01-28 15:05:02 - [info] Starting failover
2019-01-28 15:05:02 - [info] Setting read-only: read_only 1
2019-01-28 15:05:02 - [info] Failover completed; access to this database is closed

Check the log on the new Master

2019-01-28 15:05:04 - [info] Promoting this database to master
2019-01-28 15:05:04 - [info] Current replication position: Master mysql-bin.000015 32338
2019-01-28 15:05:04 - [info] Waiting to sync to Master mysql-bin.000015 32338
2019-01-28 15:05:04 - [info] Sync complete
2019-01-28 15:05:04 - [info] Setting read-only: read_only 0
2019-01-28 15:05:05 - [info] MySQL is ok

Check the IP addresses on the new Master and confirm that the VIP has floated over

[root@cluster02 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:66:7e:e8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.248/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.237/32 scope global eth0
    inet6 fe80::f816:3eff:fe66:7ee8/64 scope link 
       valid_lft forever preferred_lft forever

Check the output of the insert loop: the service was unavailable for about 12 seconds

Mon Jan 28 15:04:51 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:52 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:53 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:54 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:55 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:56 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:57 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:58 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:00 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:01 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:02 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:03 CST 2019

Failover succeeded
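
The roughly 12 seconds of downtime seen in this test can be related to the settings used in this article: the check script runs every 10 seconds (interval 10) and makes up to 4 attempts with a 1-second sleep after each failure. The sketch below is a rough back-of-the-envelope estimate under those assumptions; the detection_window helper is hypothetical, and it ignores the extra time the standby needs to take over the VIP and run the promotion script (about 2 seconds in the logs above).

```shell
#!/bin/sh
# detection_window INTERVAL ATTEMPTS SLEEP: print "best worst", the bounds in
# seconds between MySQL dying and the check script starting failover.
detection_window() {
    interval=$1; attempts=$2; pause=$3
    retry_time=$((attempts * pause))      # the script sleeps after every failed attempt
    # Best case: MySQL dies right before a check starts.
    # Worst case: MySQL dies right after a check has passed.
    echo "$retry_time $((interval + retry_time))"
}

detection_window 10 4 1   # prints "4 14"
```

The observed outage falls inside that 4 to 14 second range once the takeover time is added. Under the same assumptions, a shorter interval in vrrp_script narrows the window at the cost of more frequent checks.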

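5 Summary

The advantage of dual master + keepalived is that it is simple to deploy; with dual master plus semi-synchronous replication, no data should be lost in theory, which makes it suitable for small and medium-sized MySQL clusters. The drawbacks are also obvious: when additional slave nodes are added, they do not switch their replication source automatically on failover, and the scripts have to be written yourself, which carries some risk.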
