[DB寶 19] MySQL High Availability with MHA in Docker

Posted by ^_^小麥苗^_^ on 2021-02-19

Original article: https://www.cnblogs.com/lhrbest/p/14414442.html

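1. MHA Overview and Architecture

1.1 Introduction to MHA

MHA (Master High Availability Manager and tools for MySQL) is a relatively mature solution for MySQL high availability. It is a set of management scripts written in Perl by Yoshinori Matsunobu. MHA handles failover and slave-to-master promotion in a MySQL high-availability environment. It applies only to MySQL Replication setups, and its goal is to keep the master available. When the master fails, MHA can complete the failover automatically within roughly 0 to 30 seconds, and during the failover it preserves data consistency as far as possible, which is what makes the setup highly available in practice.

MHA mainly supports a one-master, multiple-slave topology. To build MHA, a replication cluster needs at least three database servers, one master and two slaves: one acts as the master, one as the standby (candidate) master, and one as a plain slave.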

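1.2 Components of the MHA Toolkit

MHA consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can run on a dedicated machine and manage several master-slave clusters, or it can run on one of the slaves. MHA Node runs on every MySQL server. MHA Manager probes the cluster's master at regular intervals. When the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master. The whole failover is transparent to applications. MHA Node speeds up the failover by providing the scripts that parse and purge logs.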

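The Manager toolkit contains:

masterha_check_ssh: checks MHA's SSH configuration.

masterha_check_repl: checks the state of MySQL replication.

masterha_manager: starts MHA.

masterha_check_status: reports the current running state of MHA.

masterha_master_monitor: detects whether the master is down.

masterha_master_switch: controls failover (automatic or manual).

masterha_conf_host: adds or removes server entries in the configuration.

The Node toolkit (normally triggered by MHA Manager scripts, with no manual action needed) contains:

save_binary_logs: saves and copies the master's binary logs.

apply_diff_relay_logs: identifies differential relay log events and applies them to the other slaves.

filter_mysqlbinlog: strips unnecessary ROLLBACK events (deprecated).

purge_relay_logs: purges relay logs (without blocking the SQL thread).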
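
purge_relay_logs is the one Node tool that is commonly scheduled by hand on slaves. The sketch below only builds a crontab line for it; the schedule, the /tmp work directory, the log path and the placeholder password are assumptions for illustration and are not part of the setup in this article.

```shell
# Build a crontab line that runs purge_relay_logs once a day on a slave.
# Assumptions (not from the article): the schedule, the /tmp work directory,
# the log path, and the placeholder password.
purge_cron_line() {
  hour="$1"     # hour of day (0-23) at which to purge
  user="$2"     # MySQL account allowed to purge relay logs
  printf '0 %s * * * purge_relay_logs --user=%s --password=CHANGE_ME --disable_relay_log_purge --workdir=/tmp >> /var/log/purge_relay_logs.log 2>&1\n' "$hour" "$user"
}

# Example: print a line that purges at 04:00 as root.
purge_cron_line 4 root
```

Staggering the hour per slave keeps the slaves from purging at the same moment.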

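1.3 MHA Architecture

The MHA layout used in this article is planned as follows:

IP	Hostname	Role	Server ID	Port	Type	Notes
192.168.68.131	MHA-LHR-Master1-ip131	master node	573306131	3306	Write	Serves writes
192.168.68.132	MHA-LHR-Slave1-ip132	slave node1 (Candidate Master)	573306132	3306	Read	Candidate master; serves reads
192.168.68.133	MHA-LHR-Slave2-ip133	slave node2	573306133	3306	Read	Serves reads
192.168.68.134	MHA-LHR-Monitor-ip134	Monitor host				Monitors the other machines; if the master goes down, it promotes the candidate master to be the new master and points the remaining slave at it
192.168.68.135		VIP				Floats between 131 and 132
MySQL version: MySQL 5.7.30. Every MySQL node listens on port 3306, and each node has a distinct server_id.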
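
The server_id values in the table look like the MySQL version digits, the port, and the last IP octet joined together. That pattern is only inferred from the table and is not stated by the author; the helper name below is hypothetical. A minimal sketch of the pattern:

```shell
# Compose a server_id as <version digits><port><last IP octet>.
# This naming pattern is inferred from the table, not stated by the author.
make_server_id() {
  version="$1"; port="$2"; ip="$3"
  printf '%s%s%s\n' "$version" "$port" "${ip##*.}"
}

make_server_id 57 3306 192.168.68.132   # prints 573306132
```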

Architecture before and after an MHA switchover:

2. Preparing the MHA Environment

2.1 Downloading the MHA Images

小麥苗's Docker Hub page: https://hub.docker.com/u/lhrbest

# Pull the images
docker pull registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-master1-ip131
docker pull registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-slave1-ip132
docker pull registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-slave2-ip133
docker pull registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-monitor-ip134

# Retag the images
docker tag registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-master1-ip131 lhrbest/mha-lhr-master1-ip131
docker tag registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-slave1-ip132 lhrbest/mha-lhr-slave1-ip132
docker tag registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-slave2-ip133 lhrbest/mha-lhr-slave2-ip133
docker tag registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-monitor-ip134 lhrbest/mha-lhr-monitor-ip134
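
The eight commands above follow one pattern, so they can also be written as a loop. This is just a convenience sketch: the function name and the DRY_RUN switch are made up here, while the registry prefix and image names are the ones used above.

```shell
# Registry prefix and image names are taken from the commands above.
REGISTRY="registry.cn-hangzhou.aliyuncs.com/lhrbest"
MHA_IMAGES="mha-lhr-master1-ip131 mha-lhr-slave1-ip132 mha-lhr-slave2-ip133 mha-lhr-monitor-ip134"

# Pull every image and give it the short lhrbest/<name> tag.
# With DRY_RUN=1 the docker commands are printed instead of executed.
pull_and_tag_all() {
  for name in $MHA_IMAGES; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker pull $REGISTRY/$name"
      echo "docker tag $REGISTRY/$name lhrbest/$name"
    else
      docker pull "$REGISTRY/$name" || return 1
      docker tag "$REGISTRY/$name" "lhrbest/$name" || return 1
    fi
  done
}
```

Run `DRY_RUN=1 pull_and_tag_all` first to see the commands, then `pull_and_tag_all` to execute them.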

There are four images in total, three MHA Nodes and one MHA Manager, and the compressed download is about 3 GB. After the download completes:

[root@lhrdocker ~]# docker images | grep mha
registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-monitor-ip134          latest              7d29597dc997        14 hours ago        1.53GB
registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-slave2-ip133           latest              d3717794e93a        40 hours ago        4.56GB
registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-slave1-ip132           latest              f62ee813e487        40 hours ago        4.56GB
registry.cn-hangzhou.aliyuncs.com/lhrbest/mha-lhr-master1-ip131          latest              ae7be48d83dc        40 hours ago        4.56GB

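2.2 Writing the yml File for the MHA Containers

Write a yml file and use docker-compose to create the MHA containers. Pay attention to the format of docker-compose.yml: YAML is strict about spaces, indentation, and alignment.

# Create the directory that holds the yml file
mkdir -p /root/mha

# Write /root/mha/docker-compose.yml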
cat > /root/mha/docker-compose.yml <<"EOF"
version: '3.8'

services:
  MHA-LHR-Master1-ip131:
    container_name: "MHA-LHR-Master1-ip131"
    restart: "always"
    hostname: MHA-LHR-Master1-ip131
    privileged: true
    image: lhrbest/mha-lhr-master1-ip131
    ports:
      - "33061:3306"
      - "2201:22"
    networks:
      mhalhr:
        ipv4_address: 192.168.68.131

  MHA-LHR-Slave1-ip132:
    container_name: "MHA-LHR-Slave1-ip132"
    restart: "always"
    hostname: MHA-LHR-Slave1-ip132
    privileged: true
    image: lhrbest/mha-lhr-slave1-ip132
    ports:
      - "33062:3306"
      - "2202:22"
    networks:
      mhalhr:
        ipv4_address: 192.168.68.132

  MHA-LHR-Slave2-ip133:
    container_name: "MHA-LHR-Slave2-ip133"
    restart: "always"
    hostname: MHA-LHR-Slave2-ip133
    privileged: true
    image: lhrbest/mha-lhr-slave2-ip133
    ports:
      - "33063:3306"
      - "2203:22"
    networks:
      mhalhr:
        ipv4_address: 192.168.68.133

  MHA-LHR-Monitor-ip134:
    container_name: "MHA-LHR-Monitor-ip134"
    restart: "always"
    hostname: MHA-LHR-Monitor-ip134
    privileged: true
    image: lhrbest/mha-lhr-monitor-ip134
    ports:
      - "33064:3306"
      - "2204:22"
    networks:
      mhalhr:
        ipv4_address: 192.168.68.134

networks:
  mhalhr:
    name: mhalhr
    ipam:
      config:
         - subnet: "192.168.0.0/16"

EOF
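
In a file with four near-identical service blocks, a copy-paste slip in hostname is easy to miss. The check below is a small sketch, not a YAML parser: it assumes each service lists container_name before hostname, as in the file above, and the function name is made up for this example.

```shell
# Report services whose hostname differs from their container_name.
# Assumes each service lists container_name before hostname, as in the file above.
check_hostnames() {
  awk '
    $1 == "container_name:" { gsub(/"/, "", $2); name = $2 }
    $1 == "hostname:" {
      gsub(/"/, "", $2)
      if ($2 != name) { print name ": hostname is " $2; bad = 1 }
    }
    END { exit bad + 0 }
  ' "$1"
}
```

Usage: `check_hostnames /root/mha/docker-compose.yml && echo "hostnames consistent"`. The function prints one line per mismatch and returns non-zero if it finds any.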

2.3 Installing docker-compose (skip if already installed)

Docker Compose installation docs: https://docs.docker.com/compose/
docker-compose.yml file reference: https://docs.docker.com/compose/compose-file/

[root@lhrdocker ~]# curl --insecure -L https://github.com/docker/compose/releases/download/1.26.2/docker-compose-Linux-x86_64 -o /usr/local/bin/docker-compose
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   638  100   638    0     0    530      0  0:00:01  0:00:01 --:--:--   531
100 11.6M  100 11.6M    0     0  1994k      0  0:00:06  0:00:06 --:--:-- 2943k
[root@lhrdocker ~]# chmod +x /usr/local/bin/docker-compose
[root@lhrdocker ~]# docker-compose -v
docker-compose version 1.26.2, build eefe0d31

2.4 Creating the MHA Containers

# Start the MHA containers; be sure to cd into /root/mha/ first
[root@lhrdocker ~]# cd /root/mha/
[root@lhrdocker mha]#
[root@lhrdocker mha]# docker-compose up -d
Creating network "mhalhr" with the default driver
Creating MHA-LHR-Monitor-ip134 ... done
Creating MHA-LHR-Slave2-ip133  ... done
Creating MHA-LHR-Master1-ip131 ... done
Creating MHA-LHR-Slave1-ip132  ... done
[root@lhrdocker mha]# docker ps
CONTAINER ID        IMAGE                           COMMAND             CREATED             STATUS              PORTS                                                            NAMES
d5b1af2ca979        lhrbest/mha-lhr-slave1-ip132    "/usr/sbin/init"    12 seconds ago      Up 9 seconds        16500-16599/tcp, 0.0.0.0:2202->22/tcp, 0.0.0.0:33062->3306/tcp   MHA-LHR-Slave1-ip132
8fa79f476aaa        lhrbest/mha-lhr-master1-ip131   "/usr/sbin/init"    12 seconds ago      Up 10 seconds       16500-16599/tcp, 0.0.0.0:2201->22/tcp, 0.0.0.0:33061->3306/tcp   MHA-LHR-Master1-ip131
74407b9df567        lhrbest/mha-lhr-slave2-ip133    "/usr/sbin/init"    12 seconds ago      Up 10 seconds       16500-16599/tcp, 0.0.0.0:2203->22/tcp, 0.0.0.0:33063->3306/tcp   MHA-LHR-Slave2-ip133
83f1cab03c9b        lhrbest/mha-lhr-monitor-ip134   "/usr/sbin/init"    12 seconds ago      Up 10 seconds       0.0.0.0:2204->22/tcp, 0.0.0.0:33064->3306/tcp                    MHA-LHR-Monitor-ip134
[root@lhrdocker mha]#
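
Instead of reading the docker ps table by eye, you can compare the running container names against the four expected ones. A minimal sketch; the function name is made up here, and the container names come from the compose file.

```shell
EXPECTED="MHA-LHR-Master1-ip131 MHA-LHR-Slave1-ip132 MHA-LHR-Slave2-ip133 MHA-LHR-Monitor-ip134"

# Print the expected containers that are not currently running.
# Returns 0 only when all four are up.
missing_containers() {
  running=$(docker ps --format '{{.Names}}')
  missing=0
  for name in $EXPECTED; do
    if ! printf '%s\n' "$running" | grep -qxF "$name"; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `missing_containers && echo "all MHA containers are up"`.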

2.5 Initializing the MHA Environment

2.5.1 Adding a Network Interface

# Attach the default bridge network to each MHA container
docker network connect bridge MHA-LHR-Master1-ip131
docker network connect bridge MHA-LHR-Slave1-ip132
docker network connect bridge MHA-LHR-Slave2-ip133
docker network connect bridge MHA-LHR-Monitor-ip134

Note: make sure eth0 on all four nodes is on the 192.168.68.0 network; otherwise later MHA switchovers may fail. If a node is different, fix it with the following commands:
# Detach the networks
docker network disconnect bridge MHA-LHR-Master1-ip131
docker network disconnect mhalhr MHA-LHR-Master1-ip131

# Restart the container
docker restart MHA-LHR-Master1-ip131

# Reattach the networks
docker network connect --ip 192.168.68.131 mhalhr MHA-LHR-Master1-ip131
docker network connect bridge MHA-LHR-Master1-ip131
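
To see which node needs that fix, you can read eth0's address from each container and test whether it starts with 192.168.68. This is a sketch under two assumptions: ifconfig exists in the images (the output in this article suggests it does), and the function names are made up for this example.

```shell
# Succeed if the given IPv4 address lies in 192.168.68.x.
in_mha_segment() {
  case "$1" in
    192.168.68.*) return 0 ;;
    *)            return 1 ;;
  esac
}

# Print the IPv4 address of eth0 inside a container.
# Assumes ifconfig is available in the image, as the article's output suggests.
container_eth0_ip() {
  docker exec "$1" ifconfig eth0 | awk '$1 == "inet" { print $2; exit }'
}
```

Usage: `in_mha_segment "$(container_eth0_ip MHA-LHR-Master1-ip131)" || echo "eth0 is on the wrong network"`, repeated for each of the four containers.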

2.5.2 Editing the hosts File on the Manager Node

# Enter manager node 134
docker exec -it MHA-LHR-Monitor-ip134 bash

# Append to /etc/hosts
cat >> /etc/hosts << EOF
192.168.68.131  MHA-LHR-Master1-ip131
192.168.68.132  MHA-LHR-Slave1-ip132
192.168.68.133  MHA-LHR-Slave2-ip133
192.168.68.134  MHA-LHR-Monitor-ip134
EOF
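
Running the `cat >>` block twice leaves duplicate lines in /etc/hosts. If you expect to repeat this step, an append-if-missing helper is safer. The helper below is a sketch and its name is made up; it only checks whether the hostname already appears as a name on some line.

```shell
# Append "ip name" to a hosts file only if that name is not already mapped.
add_host_entry() {
  ip="$1"; name="$2"; file="${3:-/etc/hosts}"
  if ! awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
    printf '%s  %s\n' "$ip" "$name" >> "$file"
  fi
}
```

Usage: `add_host_entry 192.168.68.131 MHA-LHR-Master1-ip131`, and likewise for 132, 133 and 134.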

2.5.3 Adding the VIP on Master 131

# Enter master 131
docker exec -it MHA-LHR-Master1-ip131 bash

# Add VIP 135
/sbin/ifconfig eth0:1 192.168.68.135/24
ifconfig

After the VIP is added:

[root@MHA-LHR-Master1-ip131 /]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.68.131  netmask 255.255.0.0  broadcast 192.168.255.255
        ether 02:42:c0:a8:44:83  txqueuelen 0  (Ethernet)
        RX packets 220  bytes 15883 (15.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 189  bytes 17524 (17.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.68.135  netmask 255.255.255.0  broadcast 192.168.68.255
        ether 02:42:c0:a8:44:83  txqueuelen 0  (Ethernet)

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.2  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:ac:11:00:02  txqueuelen 0  (Ethernet)
        RX packets 31  bytes 2697 (2.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 14  bytes 3317 (3.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 5  bytes 400 (400.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5  bytes 400 (400.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        
# The manager node can now ping the VIP
[root@MHA-LHR-Monitor-ip134 /]# ping 192.168.68.135
PING 192.168.68.135 (192.168.68.135) 56(84) bytes of data.
64 bytes from 192.168.68.135: icmp_seq=1 ttl=64 time=0.172 ms
64 bytes from 192.168.68.135: icmp_seq=2 ttl=64 time=0.076 ms
^C
--- 192.168.68.135 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.076/0.124/0.172/0.048 ms
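
The VIP can also be managed with iproute2 instead of ifconfig. The sketch below prints or runs the add and delete commands. Using `ip addr add ... label eth0:1` in place of ifconfig is this sketch's choice, not the article's; the delete command matches the one that the failover script prints in its log output later in this article. The function name and the DRY_RUN switch are made up.

```shell
VIP="192.168.68.135/24"
DEV="eth0"

# Print (dry run) or execute the command that adds or removes the VIP.
# "add" uses iproute2 with a label so the address still shows up as eth0:1;
# the choice of iproute2 instead of ifconfig is this sketch's, not the article's.
vip_ctl() {
  case "$1" in
    add) cmd="ip addr add $VIP dev $DEV label $DEV:1" ;;
    del) cmd="ip addr del $VIP dev $DEV" ;;
    *)   echo "usage: vip_ctl add|del" >&2; return 2 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}
```

Usage inside the container: `DRY_RUN=1 vip_ctl add` to preview, `vip_ctl add` to apply.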

2.5.4 Starting Replication on 132 and 133

-- Node 132 (via host port 33062)
mysql -h192.168.59.220 -uroot -plhr -P33062
reset slave;
start slave;
show slave status \G

-- Node 133 (via host port 33063)
mysql -h192.168.59.220 -uroot -plhr -P33063
reset slave;
start slave;
show slave status \G

Result:

C:\Users\lhrxxt>mysql -h192.168.59.220 -uroot -plhr -P33062
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.30-log MySQL Community Server (GPL)

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> start slave;
ERROR 1872 (HY000): Slave failed to initialize relay log info structure from the repository

MySQL [(none)]> reset slave;
Query OK, 0 rows affected (0.02 sec)

MySQL [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)

MySQL [(none)]> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.68.131
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: MHA-LHR-Master1-ip131-bin.000011
          Read_Master_Log_Pos: 234
               Relay_Log_File: MHA-LHR-Master1-ip131-relay-bin.000003
                Relay_Log_Pos: 399
        Relay_Master_Log_File: MHA-LHR-Master1-ip131-bin.000011
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB: information_schema,performance_schema,mysql,sys
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 234
              Relay_Log_Space: 799
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 573306131
                  Master_UUID: c8ca4f1d-aec3-11ea-942b-0242c0a84483
             Master_Info_File: /usr/local/mysql-5.7.30-linux-glibc2.12-x86_64/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: c8ca4f1d-aec3-11ea-942b-0242c0a84483:1-11,
d24a77d1-aec3-11ea-9399-0242c0a84484:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)
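
In that long output, the two fields that matter most are Slave_IO_Running and Slave_SQL_Running. A small filter can check both at once. This is a sketch; the function name is made up, and it reads the `\G` output on stdin.

```shell
# Read "SHOW SLAVE STATUS\G" output on stdin and succeed only if both
# replication threads report Yes.
replication_ok() {
  awk -F': *' '
    $1 ~ /Slave_IO_Running$/  { io  = $2 }
    $1 ~ /Slave_SQL_Running$/ { sql = $2 }
    END { exit !(io == "Yes" && sql == "Yes") }
  '
}
```

Usage: `mysql -h192.168.59.220 -uroot -plhr -P33062 -e 'show slave status\G' | replication_ok && echo "132 is replicating"`.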

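The MHA environment is now ready. Next, we test MHA's features.

3. Testing MHA Features

Before the actual tests, make sure the MHA environment is configured correctly and the MHA manager process is running.

3.1 Checking the MHA Configuration

On the Manager node, check SSH, replication, and the MHA status.

3.1.1 Checking SSH: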

[root@MHA-LHR-Monitor-ip134 /]# masterha_check_ssh --conf=/etc/mha/mha.cnf
Sat Aug  8 09:57:42 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug  8 09:57:42 2020 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Sat Aug  8 09:57:42 2020 - [info] Reading server configuration from /etc/mha/mha.cnf..
Sat Aug  8 09:57:42 2020 - [info] Starting SSH connection tests..
Sat Aug  8 09:57:43 2020 - [debug] 
Sat Aug  8 09:57:42 2020 - [debug]  Connecting via SSH from root@192.168.68.131(192.168.68.131:22) to root@192.168.68.132(192.168.68.132:22)..
Sat Aug  8 09:57:42 2020 - [debug]   ok.
Sat Aug  8 09:57:42 2020 - [debug]  Connecting via SSH from root@192.168.68.131(192.168.68.131:22) to root@192.168.68.133(192.168.68.133:22)..
Sat Aug  8 09:57:42 2020 - [debug]   ok.
Sat Aug  8 09:57:43 2020 - [debug] 
Sat Aug  8 09:57:42 2020 - [debug]  Connecting via SSH from root@192.168.68.132(192.168.68.132:22) to root@192.168.68.131(192.168.68.131:22)..
Sat Aug  8 09:57:42 2020 - [debug]   ok.
Sat Aug  8 09:57:42 2020 - [debug]  Connecting via SSH from root@192.168.68.132(192.168.68.132:22) to root@192.168.68.133(192.168.68.133:22)..
Sat Aug  8 09:57:43 2020 - [debug]   ok.
Sat Aug  8 09:57:44 2020 - [debug] 
Sat Aug  8 09:57:43 2020 - [debug]  Connecting via SSH from root@192.168.68.133(192.168.68.133:22) to root@192.168.68.131(192.168.68.131:22)..
Sat Aug  8 09:57:43 2020 - [debug]   ok.
Sat Aug  8 09:57:43 2020 - [debug]  Connecting via SSH from root@192.168.68.133(192.168.68.133:22) to root@192.168.68.132(192.168.68.132:22)..
Sat Aug  8 09:57:43 2020 - [debug]   ok.
Sat Aug  8 09:57:44 2020 - [info] All SSH connection tests passed successfully.

The line "All SSH connection tests passed successfully." means SSH between the three MHA data nodes works.
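
The log shows that the check covers every ordered pair of data nodes (131 to 132, 131 to 133, and so on). If a pair fails, you can repeat the same mesh by hand. The sketch below is not how masterha_check_ssh is implemented. It assumes the manager can already log in to each node as root with a key, and the function names are made up.

```shell
NODES="192.168.68.131 192.168.68.132 192.168.68.133"

# List every ordered (source, destination) pair among the data nodes.
ssh_pairs() {
  for src in $NODES; do
    for dst in $NODES; do
      [ "$src" = "$dst" ] || echo "$src $dst"
    done
  done
}

# From the manager, hop to each source and try a key-only login to each destination.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_ssh_mesh() {
  ssh_pairs | while read -r src dst; do
    ssh -n -o BatchMode=yes "root@$src" "ssh -o BatchMode=yes -o StrictHostKeyChecking=no root@$dst exit" \
      || echo "FAILED: $src -> $dst"
  done
}
```

`check_ssh_mesh` prints nothing when all six hops work.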

3.1.2 Checking replication:

[root@MHA-LHR-Monitor-ip134 /]# masterha_check_repl --conf=/etc/mha/mha.cnf
Sat Aug  8 09:59:31 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug  8 09:59:31 2020 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Sat Aug  8 09:59:31 2020 - [info] Reading server configuration from /etc/mha/mha.cnf..
Sat Aug  8 09:59:31 2020 - [info] MHA::MasterMonitor version 0.58.
Sat Aug  8 09:59:33 2020 - [info] GTID failover mode = 1
Sat Aug  8 09:59:33 2020 - [info] Dead Servers:
Sat Aug  8 09:59:33 2020 - [info] Alive Servers:
Sat Aug  8 09:59:33 2020 - [info]   192.168.68.131(192.168.68.131:3306)
Sat Aug  8 09:59:33 2020 - [info]   192.168.68.132(192.168.68.132:3306)
Sat Aug  8 09:59:33 2020 - [info]   192.168.68.133(192.168.68.133:3306)
Sat Aug  8 09:59:33 2020 - [info] Alive Slaves:
Sat Aug  8 09:59:33 2020 - [info]   192.168.68.132(192.168.68.132:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 09:59:33 2020 - [info]     GTID ON
Sat Aug  8 09:59:33 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 09:59:33 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug  8 09:59:33 2020 - [info]   192.168.68.133(192.168.68.133:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 09:59:33 2020 - [info]     GTID ON
Sat Aug  8 09:59:33 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 09:59:33 2020 - [info] Current Alive Master: 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 09:59:33 2020 - [info] Checking slave configurations..
Sat Aug  8 09:59:33 2020 - [info]  read_only=1 is not set on slave 192.168.68.132(192.168.68.132:3306).
Sat Aug  8 09:59:33 2020 - [info]  read_only=1 is not set on slave 192.168.68.133(192.168.68.133:3306).
Sat Aug  8 09:59:33 2020 - [info] Checking replication filtering settings..
Sat Aug  8 09:59:33 2020 - [info]  binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
Sat Aug  8 09:59:33 2020 - [info]  Replication filtering check ok.
Sat Aug  8 09:59:33 2020 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Aug  8 09:59:33 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sat Aug  8 09:59:33 2020 - [info] HealthCheck: SSH to 192.168.68.131 is reachable.
Sat Aug  8 09:59:33 2020 - [info] 
192.168.68.131(192.168.68.131:3306) (current master)
 +--192.168.68.132(192.168.68.132:3306)
 +--192.168.68.133(192.168.68.133:3306)

Sat Aug  8 09:59:33 2020 - [info] Checking replication health on 192.168.68.132..
Sat Aug  8 09:59:33 2020 - [info]  ok.
Sat Aug  8 09:59:33 2020 - [info] Checking replication health on 192.168.68.133..
Sat Aug  8 09:59:33 2020 - [info]  ok.
Sat Aug  8 09:59:33 2020 - [info] Checking master_ip_failover_script status:
Sat Aug  8 09:59:33 2020 - [info]   /usr/local/mha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.68.131 --orig_master_ip=192.168.68.131 --orig_master_port=3306 


IN SCRIPT TEST====/sbin/ip addr del 192.168.68.135/24 dev eth0==/sbin/ifconfig eth0:1 192.168.68.135/24===

Checking the Status of the script.. OK 
Sat Aug  8 09:59:33 2020 - [info]  OK.
Sat Aug  8 09:59:33 2020 - [warning] shutdown_script is not defined.
Sat Aug  8 09:59:33 2020 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

"MySQL Replication Health is OK." means the one-master, two-slave topology is currently healthy.
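
The file /etc/mha/mha.cnf ships inside the monitor image and is never printed in this article. As rough orientation only, the sketch below reconstructs what such a file could look like. The mha/lhr account, the repl user, ping_type=SELECT, the two script paths, the manager log path and candidate_master on 132 all appear in log output in this article. manager_workdir, ping_interval, check_repl_delay and the replication password placeholder are assumptions. It writes to a scratch path so the real configuration is not touched.

```shell
# Write an illustrative MHA configuration to a scratch path (not /etc/mha/mha.cnf).
# This is a reconstruction from the article's log output, not the image's real file.
EXAMPLE_CNF="${TMPDIR:-/tmp}/mha.cnf.example"
cat > "$EXAMPLE_CNF" <<'EOF'
[server default]
manager_workdir=/usr/local/mha
manager_log=/usr/local/mha/manager_running.log
user=mha
password=lhr
ssh_user=root
repl_user=repl
repl_password=CHANGE_ME
ping_type=SELECT
ping_interval=1
master_ip_failover_script=/usr/local/mha/scripts/master_ip_failover
secondary_check_script=/usr/local/bin/masterha_secondary_check -s MHA-LHR-Slave1-ip132 -s MHA-LHR-Slave2-ip133 --user=root --master_host=MHA-LHR-Master1-ip131 --master_ip=192.168.68.131 --master_port=3306

[server1]
hostname=192.168.68.131
port=3306

[server2]
hostname=192.168.68.132
port=3306
candidate_master=1
check_repl_delay=0

[server3]
hostname=192.168.68.133
port=3306
EOF
```

Compare it with the real file in the container (`cat /etc/mha/mha.cnf` on node 134) before relying on any value.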

3.1.3 Checking the MHA status:

[root@MHA-LHR-Monitor-ip134 /]# masterha_check_status --conf=/etc/mha/mha.cnf
mha is stopped(2:NOT_RUNNING).

Note: when the manager is running, the status shows "PING_OK"; otherwise it shows "NOT_RUNNING", which means MHA monitoring has not been started.

3.1.4 Starting MHA Manager

[root@MHA-LHR-Monitor-ip134 /]# nohup masterha_manager --conf=/etc/mha/mha.cnf  --ignore_last_failover < /dev/null > /usr/local/mha/manager_start.log 2>&1 &
[1] 216
[root@MHA-LHR-Monitor-ip134 /]# masterha_check_status --conf=/etc/mha/mha.cnf                                                                               
mha (pid:216) is running(0:PING_OK), master:192.168.68.131

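The check now shows "PING_OK", which means the MHA monitor is running and the master is 192.168.68.131.

Startup parameters:

--remove_dead_master_conf: after a master failover, the old master's entry is removed from the configuration file.

--manager_log: where the log is written.

--ignore_last_failover: by default, if MHA detects consecutive outages less than 8 hours apart, it does not fail over again; the restriction exists to avoid a ping-pong effect. After a switchover MHA writes a mha.failover.complete file to the manager's working directory (manager_workdir), and the next switchover is refused while that file exists unless you delete it manually after the first switchover. This option tells MHA to ignore that file, and it is set here for convenience.

Note: once an automatic failover has happened, MHA Manager stops monitoring. Start it again manually if you need it.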
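
Because the manager exits after a failover, a small wrapper that restarts it only when masterha_check_status does not report PING_OK can save some typing. This is a sketch with made-up function names; it reuses the same command line and paths as above.

```shell
CONF=/etc/mha/mha.cnf

# Start the manager in the background with the same options used above.
start_manager() {
  nohup masterha_manager --conf="$CONF" --ignore_last_failover \
    < /dev/null > /usr/local/mha/manager_start.log 2>&1 &
}

# Restart monitoring only if masterha_check_status does not report PING_OK.
ensure_manager_running() {
  if masterha_check_status --conf="$CONF" 2>/dev/null | grep -q 'PING_OK'; then
    echo "manager already running"
  else
    echo "starting manager"
    start_manager
  fi
}
```

After a real failover, run masterha_check_repl first and restart the manager only once the replication check passes.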

3.1.5 Stopping MHA Manager

masterha_stop --conf=/etc/mha/mha.cnf

We do not want to stop it at this point, so do not run this command now.

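3.2 Test Scenario 1: Automatic Failover with Email Alerts

The architecture after automatic failover is shown in the figure below:

Test with the following steps:

3.2.1 Connect a client to VIP 135, which is actually served by master 131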

[root@lhrdocker ~]# mysql -uroot -plhr -h192.168.68.135 -P3306
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.7.30-log MySQL Community Server (GPL)

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select @@hostname;
+-----------------------+
| @@hostname            |
+-----------------------+
| MHA-LHR-Master1-ip131 |
+-----------------------+
1 row in set (0.00 sec)

mysql> show slave hosts;
+-----------+----------------+------+-----------+--------------------------------------+
| Server_id | Host           | Port | Master_id | Slave_UUID                           |
+-----------+----------------+------+-----------+--------------------------------------+
| 573306133 | 192.168.68.133 | 3306 | 573306131 | d391ce7e-aec3-11ea-94cd-0242c0a84485 |
| 573306132 | 192.168.68.132 | 3306 | 573306131 | d24a77d1-aec3-11ea-9399-0242c0a84484 |
+-----------+----------------+------+-----------+--------------------------------------+
2 rows in set (0.00 sec)

3.2.2 Simulate a crash of master 131 by stopping its container, which also stops MySQL

docker stop MHA-LHR-Master1-ip131
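
To time the switchover instead of retyping queries, you can poll the VIP from the Docker host until a different server answers. This is a sketch; the function names, the 2-second connect timeout and the default of 60 tries are assumptions, while the VIP and credentials are the ones used in this article.

```shell
# Ask the server behind the VIP for its hostname; prints nothing on failure.
# Credentials and VIP are the ones used in this article.
current_master() {
  mysql -N -s -uroot -plhr -h192.168.68.135 -P3306 \
    --connect-timeout=2 -e 'select @@hostname' 2>/dev/null
}

# Poll until the VIP is served by a host other than $1, or give up after $2 tries.
wait_for_failover() {
  old="$1"; tries="${2:-60}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    now=$(current_master)
    if [ -n "$now" ] && [ "$now" != "$old" ]; then
      echo "$now"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Usage: run `time wait_for_failover MHA-LHR-Master1-ip131` in a second terminal right after the docker stop command.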

3.2.3 Observe the following:

① VIP 135 moves to 132 automatically

[root@MHA-LHR-Slave1-ip132 /]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.68.132  netmask 255.255.0.0  broadcast 192.168.255.255
        ether 02:42:c0:a8:44:84  txqueuelen 0  (Ethernet)
        RX packets 411  bytes 58030 (56.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 343  bytes 108902 (106.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.68.135  netmask 255.255.255.0  broadcast 192.168.68.255
        ether 02:42:c0:a8:44:84  txqueuelen 0  (Ethernet)

② The master automatically becomes 132; check with: show slave hosts;

mysql> select @@hostname;
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql> select @@hostname;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    15
Current database: *** NONE ***

+----------------------+
| @@hostname           |
+----------------------+
| MHA-LHR-Slave1-ip132 |
+----------------------+
1 row in set (0.00 sec)

mysql> show slave hosts;
+-----------+----------------+------+-----------+--------------------------------------+
| Server_id | Host           | Port | Master_id | Slave_UUID                           |
+-----------+----------------+------+-----------+--------------------------------------+
| 573306133 | 192.168.68.133 | 3306 | 573306132 | d391ce7e-aec3-11ea-94cd-0242c0a84485 |
+-----------+----------------+------+-----------+--------------------------------------+
1 row in set (0.00 sec)

③ The MHA process stops automatically

[1]+  Done                    nohup masterha_manager --conf=/etc/mha/mha.cnf --ignore_last_failover < /dev/null > /usr/local/mha/manager_start.log 2>&1
[root@MHA-LHR-Monitor-ip134 /]# 
[root@MHA-LHR-Monitor-ip134 /]# 
[root@MHA-LHR-Monitor-ip134 /]# ps -ef|grep mha
root        486    120  0 11:03 pts/0    00:00:00 grep --color=auto mha

④ MHA switchover log:

[root@MHA-LHR-Monitor-ip134 /]# tailf /usr/local/mha/manager_running.log

Sat Aug  8 11:01:23 2020 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
Sat Aug  8 11:01:23 2020 - [info] Executing secondary network check script: /usr/local/bin/masterha_secondary_check -s MHA-LHR-Slave1-ip132 -s MHA-LHR-Slave2-ip133 --user=root --master_host=MHA-LHR-Master1-ip131 --master_ip=192.168.68.131 --master_port=3306  --user=root  --master_host=192.168.68.131  --master_ip=192.168.68.131  --master_port=3306 --master_user=mha --master_password=lhr --ping_type=SELECT
Sat Aug  8 11:01:23 2020 - [info] Executing SSH check script: exit 0
Sat Aug  8 11:01:23 2020 - [warning] HealthCheck: SSH to 192.168.68.131 is NOT reachable.
Monitoring server MHA-LHR-Slave1-ip132 is reachable, Master is not reachable from MHA-LHR-Slave1-ip132. OK.
Monitoring server MHA-LHR-Slave2-ip133 is reachable, Master is not reachable from MHA-LHR-Slave2-ip133. OK.
Sat Aug  8 11:01:23 2020 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Sat Aug  8 11:01:24 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.68.131' (4))
Sat Aug  8 11:01:24 2020 - [warning] Connection failed 2 time(s)..
Sat Aug  8 11:01:25 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.68.131' (4))
Sat Aug  8 11:01:25 2020 - [warning] Connection failed 3 time(s)..
Sat Aug  8 11:01:26 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.68.131' (4))
Sat Aug  8 11:01:26 2020 - [warning] Connection failed 4 time(s)..
Sat Aug  8 11:01:26 2020 - [warning] Master is not reachable from health checker!
Sat Aug  8 11:01:26 2020 - [warning] Master 192.168.68.131(192.168.68.131:3306) is not reachable!
Sat Aug  8 11:01:26 2020 - [warning] SSH is NOT reachable.
Sat Aug  8 11:01:26 2020 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/mha.cnf again, and trying to connect to all servers to check server status..
Sat Aug  8 11:01:26 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug  8 11:01:26 2020 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Sat Aug  8 11:01:26 2020 - [info] Reading server configuration from /etc/mha/mha.cnf..
Sat Aug  8 11:01:27 2020 - [info] GTID failover mode = 1
Sat Aug  8 11:01:27 2020 - [info] Dead Servers:
Sat Aug  8 11:01:27 2020 - [info]   192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:27 2020 - [info] Alive Servers:
Sat Aug  8 11:01:27 2020 - [info]   192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:01:27 2020 - [info]   192.168.68.133(192.168.68.133:3306)
Sat Aug  8 11:01:27 2020 - [info] Alive Slaves:
Sat Aug  8 11:01:27 2020 - [info]   192.168.68.132(192.168.68.132:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:27 2020 - [info]     GTID ON
Sat Aug  8 11:01:27 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:27 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug  8 11:01:27 2020 - [info]   192.168.68.133(192.168.68.133:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:27 2020 - [info]     GTID ON
Sat Aug  8 11:01:27 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:27 2020 - [info] Checking slave configurations..
Sat Aug  8 11:01:27 2020 - [info]  read_only=1 is not set on slave 192.168.68.132(192.168.68.132:3306).
Sat Aug  8 11:01:27 2020 - [info]  read_only=1 is not set on slave 192.168.68.133(192.168.68.133:3306).
Sat Aug  8 11:01:27 2020 - [info] Checking replication filtering settings..
Sat Aug  8 11:01:27 2020 - [info]  Replication filtering check ok.
Sat Aug  8 11:01:27 2020 - [info] Master is down!
Sat Aug  8 11:01:27 2020 - [info] Terminating monitoring script.
Sat Aug  8 11:01:27 2020 - [info] Got exit code 20 (Master dead).
Sat Aug  8 11:01:27 2020 - [info] MHA::MasterFailover version 0.58.
Sat Aug  8 11:01:27 2020 - [info] Starting master failover.
Sat Aug  8 11:01:27 2020 - [info] 
Sat Aug  8 11:01:27 2020 - [info] * Phase 1: Configuration Check Phase..
Sat Aug  8 11:01:27 2020 - [info] 
Sat Aug  8 11:01:29 2020 - [info] GTID failover mode = 1
Sat Aug  8 11:01:29 2020 - [info] Dead Servers:
Sat Aug  8 11:01:29 2020 - [info]   192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:29 2020 - [info] Checking master reachability via MySQL(double check)...
Sat Aug  8 11:01:30 2020 - [info]  ok.
Sat Aug  8 11:01:30 2020 - [info] Alive Servers:
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.133(192.168.68.133:3306)
Sat Aug  8 11:01:30 2020 - [info] Alive Slaves:
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.132(192.168.68.132:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:30 2020 - [info]     GTID ON
Sat Aug  8 11:01:30 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:30 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.133(192.168.68.133:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:30 2020 - [info]     GTID ON
Sat Aug  8 11:01:30 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:30 2020 - [info] Starting GTID based failover.
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] * Phase 2: Dead Master Shutdown Phase..
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] Forcing shutdown so that applications never connect to the current master..
Sat Aug  8 11:01:30 2020 - [info] Executing master IP deactivation script:
Sat Aug  8 11:01:30 2020 - [info]   /usr/local/mha/scripts/master_ip_failover --orig_master_host=192.168.68.131 --orig_master_ip=192.168.68.131 --orig_master_port=3306 --command=stop 


IN SCRIPT TEST====/sbin/ip addr del 192.168.68.135/24 dev eth0==/sbin/ifconfig eth0:1 192.168.68.135/24===

Disabling the VIP on old master: 192.168.68.131 
Sat Aug  8 11:01:30 2020 - [info]  done.
Sat Aug  8 11:01:30 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sat Aug  8 11:01:30 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] * Phase 3: Master Recovery Phase..
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] The latest binary log file/position on all slaves is MHA-LHR-Master1-ip131-bin.000011:234
Sat Aug  8 11:01:30 2020 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.132(192.168.68.132:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:30 2020 - [info]     GTID ON
Sat Aug  8 11:01:30 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:30 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.133(192.168.68.133:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:30 2020 - [info]     GTID ON
Sat Aug  8 11:01:30 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:30 2020 - [info] The oldest binary log file/position on all slaves is MHA-LHR-Master1-ip131-bin.000011:234
Sat Aug  8 11:01:30 2020 - [info] Oldest slaves:
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.132(192.168.68.132:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:30 2020 - [info]     GTID ON
Sat Aug  8 11:01:30 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:30 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.133(192.168.68.133:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:30 2020 - [info]     GTID ON
Sat Aug  8 11:01:30 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] * Phase 3.3: Determining New Master Phase..
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] Searching new master from slaves..
Sat Aug  8 11:01:30 2020 - [info]  Candidate masters from the configuration file:
Sat Aug  8 11:01:30 2020 - [info]   192.168.68.132(192.168.68.132:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:01:30 2020 - [info]     GTID ON
Sat Aug  8 11:01:30 2020 - [info]     Replicating from 192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:01:30 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug  8 11:01:30 2020 - [info]  Non-candidate masters:
Sat Aug  8 11:01:30 2020 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Sat Aug  8 11:01:30 2020 - [info] New master is 192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:01:30 2020 - [info] Starting master failover..
Sat Aug  8 11:01:30 2020 - [info] 
From:
192.168.68.131(192.168.68.131:3306) (current master)
 +--192.168.68.132(192.168.68.132:3306)
 +--192.168.68.133(192.168.68.133:3306)

To:
192.168.68.132(192.168.68.132:3306) (new master)
 +--192.168.68.133(192.168.68.133:3306)
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] * Phase 3.3: New Master Recovery Phase..
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info]  Waiting all logs to be applied.. 
Sat Aug  8 11:01:30 2020 - [info]   done.
Sat Aug  8 11:01:30 2020 - [info] Getting new master's binlog name and position..
Sat Aug  8 11:01:30 2020 - [info]  MHA-LHR-Slave1-ip132-bin.000008:234
Sat Aug  8 11:01:30 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.68.132', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sat Aug  8 11:01:30 2020 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: MHA-LHR-Slave1-ip132-bin.000008, 234, c8ca4f1d-aec3-11ea-942b-0242c0a84483:1-11,
d24a77d1-aec3-11ea-9399-0242c0a84484:1-3
Sat Aug  8 11:01:30 2020 - [info] Executing master IP activate script:
Sat Aug  8 11:01:30 2020 - [info]   /usr/local/mha/scripts/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.68.131 --orig_master_ip=192.168.68.131 --orig_master_port=3306 --new_master_host=192.168.68.132 --new_master_ip=192.168.68.132 --new_master_port=3306 --new_master_user='mha'   --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password


IN SCRIPT TEST====/sbin/ip addr del 192.168.68.135/24 dev eth0==/sbin/ifconfig eth0:1 192.168.68.135/24===

Enabling the VIP - 192.168.68.135/24 on the new master - 192.168.68.132 
Sat Aug  8 11:01:30 2020 - [info]  OK.
Sat Aug  8 11:01:30 2020 - [info] ** Finished master recovery successfully.
Sat Aug  8 11:01:30 2020 - [info] * Phase 3: Master Recovery Phase completed.
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] * Phase 4: Slaves Recovery Phase..
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] * Phase 4.1: Starting Slaves in parallel..
Sat Aug  8 11:01:30 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info] -- Slave recovery on host 192.168.68.133(192.168.68.133:3306) started, pid: 474. Check tmp log /usr/local/mha/192.168.68.133_3306_20200808110127.log if it takes time..
Sat Aug  8 11:01:32 2020 - [info] 
Sat Aug  8 11:01:32 2020 - [info] Log messages from 192.168.68.133 ...
Sat Aug  8 11:01:32 2020 - [info] 
Sat Aug  8 11:01:30 2020 - [info]  Resetting slave 192.168.68.133(192.168.68.133:3306) and starting replication from the new master 192.168.68.132(192.168.68.132:3306)..
Sat Aug  8 11:01:30 2020 - [info]  Executed CHANGE MASTER.
Sat Aug  8 11:01:31 2020 - [info]  Slave started.
Sat Aug  8 11:01:31 2020 - [info]  gtid_wait(c8ca4f1d-aec3-11ea-942b-0242c0a84483:1-11,
d24a77d1-aec3-11ea-9399-0242c0a84484:1-3) completed on 192.168.68.133(192.168.68.133:3306). Executed 0 events.
Sat Aug  8 11:01:32 2020 - [info] End of log messages from 192.168.68.133.
Sat Aug  8 11:01:32 2020 - [info] -- Slave on host 192.168.68.133(192.168.68.133:3306) started.
Sat Aug  8 11:01:32 2020 - [info] All new slave servers recovered successfully.
Sat Aug  8 11:01:32 2020 - [info] 
Sat Aug  8 11:01:32 2020 - [info] * Phase 5: New master cleanup phase..
Sat Aug  8 11:01:32 2020 - [info] 
Sat Aug  8 11:01:32 2020 - [info] Resetting slave info on the new master..
Sat Aug  8 11:01:32 2020 - [info]  192.168.68.132: Resetting slave info succeeded.
Sat Aug  8 11:01:32 2020 - [info] Master failover to 192.168.68.132(192.168.68.132:3306) completed successfully.
Sat Aug  8 11:01:32 2020 - [info] 

----- Failover Report -----

mha: MySQL Master failover 192.168.68.131(192.168.68.131:3306) to 192.168.68.132(192.168.68.132:3306) succeeded

Master 192.168.68.131(192.168.68.131:3306) is down!

Check MHA Manager logs at MHA-LHR-Monitor-ip134:/usr/local/mha/manager_running.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.68.131(192.168.68.131:3306)
Selected 192.168.68.132(192.168.68.132:3306) as a new master.
192.168.68.132(192.168.68.132:3306): OK: Applying all logs succeeded.
192.168.68.132(192.168.68.132:3306): OK: Activated master IP address.
192.168.68.133(192.168.68.133:3306): OK: Slave started, replicating from 192.168.68.132(192.168.68.132:3306)
192.168.68.132(192.168.68.132:3306): Resetting slave info succeeded.
Master failover to 192.168.68.132(192.168.68.132:3306) completed successfully.
Sat Aug  8 11:01:32 2020 - [info] Sending mail..
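
To confirm which server was promoted without reading the whole log, the final success line can be parsed. This is a minimal sketch, assuming the manager log keeps the "Master failover to ... completed successfully." wording shown above; the function name mha_last_new_master is made up for this example and is not part of MHA.

```shell
#!/usr/bin/env bash
# Sketch only: mha_last_new_master is a hypothetical helper, not an MHA tool.
# Prints the host(ip:port) of the most recent successful failover found in an
# MHA manager log, or returns 1 when the log has no success line.
mha_last_new_master() {
    local log_file=$1 line
    line=$(grep -E 'Master failover to .+ completed successfully' "$log_file" | tail -n 1)
    [ -n "$line" ] || return 1
    # Keep only the text between "Master failover to " and " completed successfully".
    line=${line#*Master failover to }
    printf '%s\n' "${line% completed successfully*}"
}
```

On the monitor node it could be called as: mha_last_new_master /usr/local/mha/manager_running.log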

⑤ At the same time, an alert email is received

Notes:

1. First make sure the 134 node can reach the Internet:

[root@MHA-LHR-Monitor-ip134 /]# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=127 time=109 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=127 time=152 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=127 time=132 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 109.295/131.671/152.728/17.758 ms
[root@MHA-LHR-Monitor-ip134 /]#

2. To change the alert recipient, edit /usr/local/bin/send_report on management node 134 and replace lhrbest@qq.com with the recipient's address. The contents of /usr/local/bin/send_report are shown below:

[root@MHA-LHR-Monitor-ip134 /]# cat /usr/local/bin/send_report
#!/bin/bash
start_num=+`awk '/Failover Report/{print NR}' /usr/local/mha/manager_running.log | tail -n 1`
tail -n $start_num /usr/local/mha/manager_running.log | mail  -s  '【嚴重告警】'管理節點`hostname -s`'的MHA架構發生了自動切換' -a '/usr/local/mha/manager_running.log' lhrbest@qq.com
[root@MHA-LHR-Monitor-ip134 /]#
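
Instead of editing the address inside the script each time, the recipient could be read from an environment variable. The following is only a sketch of that idea, not the script shipped in the image: MHA_LOG, MHA_REPORT_TO and both function names are hypothetical, the mail subject is an illustrative English one, and the mail flags (-s, -a) are the ones the original script already uses.

```shell
#!/usr/bin/env bash
# Sketch only: MHA_LOG, MHA_REPORT_TO and both function names are hypothetical.
MHA_LOG=${MHA_LOG:-/usr/local/mha/manager_running.log}
MHA_REPORT_TO=${MHA_REPORT_TO:-lhrbest@qq.com}

# Print the log from the last "Failover Report" line to the end of the file.
failover_report_tail() {
    local log_file=$1 start
    start=$(awk '/Failover Report/ { n = NR } END { if (n) print n }' "$log_file")
    [ -n "$start" ] || return 1
    tail -n "+$start" "$log_file"
}

# Mail that excerpt and attach the full log, as the original script does.
send_failover_report() {
    local body
    body=$(failover_report_tail "$MHA_LOG") || return 1
    printf '%s\n' "$body" \
        | mail -s "MHA automatic failover on $(hostname -s)" -a "$MHA_LOG" "$MHA_REPORT_TO"
}
```

With this layout a different recipient needs no file edit, for example: MHA_REPORT_TO=dba@example.com send_failover_report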

3.2.4 Start 131 and restore it as a slave

# Start 131
docker start MHA-LHR-Master1-ip131

# On 134, find the recovery statement in the log file
grep "All other slaves should start replication from here" /usr/local/mha/manager_running.log

# Run the recovery on 131 (the log masks the password as 'xxx'; use the real one)
CHANGE MASTER TO MASTER_HOST='192.168.68.132', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='lhr';

start slave;
show slave status;

# Check on 134
masterha_check_repl --conf=/etc/mha/mha.cnf
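
The grep-and-edit step can be scripted. The sketch below is hedged: it assumes the manager log masks the password exactly as MASTER_PASSWORD='xxx', as it does in the output shown in this article, and the function name change_master_from_log is invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: change_master_from_log is a hypothetical helper.
# Prints the most recent CHANGE MASTER statement suggested in the manager log,
# with the masked password ('xxx') replaced by the real replication password.
change_master_from_log() {
    local log_file=$1 repl_password=$2
    local masked="MASTER_PASSWORD='xxx'" stmt
    stmt=$(grep 'All other slaves should start replication from here' "$log_file" \
        | tail -n 1 | sed -n 's/.*Statement should be: //p')
    [ -n "$stmt" ] || return 1
    case $stmt in
        *"$masked"*)
            # Rebuild the statement around the masked fragment.
            printf "%sMASTER_PASSWORD='%s'%s\n" \
                "${stmt%%"$masked"*}" "$repl_password" "${stmt#*"$masked"}" ;;
        *)  printf '%s\n' "$stmt" ;;
    esac
}
```

Taking the last match matters because the log accumulates one such line per failover, as the two grep hits in the session output show. The printed statement still has to be executed on 131, followed by start slave.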

Execution:

[root@MHA-LHR-Monitor-ip134 /]# grep "All other slaves should start replication from here" /usr/local/mha/manager_running.log
Mon Jun 15 14:16:31 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.68.132', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sat Aug  8 11:01:30 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.68.132', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
[root@MHA-LHR-Monitor-ip134 /]# 
[root@MHA-LHR-Monitor-ip134 /]# masterha_check_repl --conf=/etc/mha/mha.cnf
Sat Aug  8 11:23:30 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug  8 11:23:30 2020 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Sat Aug  8 11:23:30 2020 - [info] Reading server configuration from /etc/mha/mha.cnf..
Sat Aug  8 11:23:30 2020 - [info] MHA::MasterMonitor version 0.58.
Sat Aug  8 11:23:32 2020 - [info] GTID failover mode = 1
Sat Aug  8 11:23:32 2020 - [info] Dead Servers:
Sat Aug  8 11:23:32 2020 - [info] Alive Servers:
Sat Aug  8 11:23:32 2020 - [info]   192.168.68.131(192.168.68.131:3306)
Sat Aug  8 11:23:32 2020 - [info]   192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:23:32 2020 - [info]   192.168.68.133(192.168.68.133:3306)
Sat Aug  8 11:23:32 2020 - [info] Alive Slaves:
Sat Aug  8 11:23:32 2020 - [info]   192.168.68.131(192.168.68.131:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:23:32 2020 - [info]     GTID ON
Sat Aug  8 11:23:32 2020 - [info]     Replicating from 192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:23:32 2020 - [info]   192.168.68.133(192.168.68.133:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:23:32 2020 - [info]     GTID ON
Sat Aug  8 11:23:32 2020 - [info]     Replicating from 192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:23:32 2020 - [info] Current Alive Master: 192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:23:32 2020 - [info] Checking slave configurations..
Sat Aug  8 11:23:32 2020 - [info]  read_only=1 is not set on slave 192.168.68.131(192.168.68.131:3306).
Sat Aug  8 11:23:32 2020 - [info]  read_only=1 is not set on slave 192.168.68.133(192.168.68.133:3306).
Sat Aug  8 11:23:32 2020 - [info] Checking replication filtering settings..
Sat Aug  8 11:23:32 2020 - [info]  binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
Sat Aug  8 11:23:32 2020 - [info]  Replication filtering check ok.
Sat Aug  8 11:23:32 2020 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Aug  8 11:23:32 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sat Aug  8 11:23:32 2020 - [info] HealthCheck: SSH to 192.168.68.132 is reachable.
Sat Aug  8 11:23:32 2020 - [info] 
192.168.68.132(192.168.68.132:3306) (current master)
 +--192.168.68.131(192.168.68.131:3306)
 +--192.168.68.133(192.168.68.133:3306)

Sat Aug  8 11:23:32 2020 - [info] Checking replication health on 192.168.68.131..
Sat Aug  8 11:23:32 2020 - [info]  ok.
Sat Aug  8 11:23:32 2020 - [info] Checking replication health on 192.168.68.133..
Sat Aug  8 11:23:32 2020 - [info]  ok.
Sat Aug  8 11:23:32 2020 - [info] Checking master_ip_failover_script status:
Sat Aug  8 11:23:32 2020 - [info]   /usr/local/mha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.68.132 --orig_master_ip=192.168.68.132 --orig_master_port=3306 


IN SCRIPT TEST====/sbin/ip addr del 192.168.68.135/24 dev eth0==/sbin/ifconfig eth0:1 192.168.68.135/24===

Checking the Status of the script.. OK 
Sat Aug  8 11:23:32 2020 - [info]  OK.
Sat Aug  8 11:23:32 2020 - [warning] shutdown_script is not defined.
Sat Aug  8 11:23:32 2020 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

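3.2.5 Switchover: manually make 131 the master and 132 a slave

This is similar to a switchover in Oracle DG (Data Guard). In this scenario the master has not crashed: while it is still alive, the master is demoted to a slave, the candidate master is promoted to master, and the replication topology is reconfigured. The MHA Manager process must not be running while you do this.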

masterha_master_switch --conf=/etc/mha/mha.cnf  --master_state=alive \
--orig_master_is_new_slave --running_updates_limit=10000 --interactive=0 \
--new_master_host=192.168.68.131 --new_master_port=3306

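Parameter notes:

--interactive Whether to run interactively, i.e. whether you are prompted to type yes or no; 0 disables the prompts.

--running_updates_limit If it is not specified for the switch, the default is 1 second. A switch cannot succeed while the candidate master is lagging; this parameter sets how much delay (in seconds) is still acceptable for switching. How long the switch itself takes depends on the size of the relay logs that must be applied during recovery.

--orig_master_is_new_slave Demotes the original master to a slave and rejoins it to the replication topology.

--new_master_host The host name of the new master; using the IP address is recommended.

--new_master_port The port of the MySQL service on the new master.

After the switch completes, 131 is the master, 132 and 133 are slaves, and the VIP moves to 131 automatically, i.e. the cluster is back in its initial MHA state.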
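
Because the switchover requires that MHA Manager is not running, a small guard around the command can catch that mistake early. This is a sketch under one assumption that should be verified against your MHA version: masterha_check_status exits with 0 only while the manager is monitoring. The wrapper name safe_switchover is hypothetical; the masterha_master_switch options are the ones used above.

```shell
#!/usr/bin/env bash
# Sketch only: safe_switchover is a hypothetical wrapper.
# Assumption: masterha_check_status exits 0 only while MHA Manager is running.
safe_switchover() {
    local conf=$1 new_host=$2 new_port=${3:-3306}
    if masterha_check_status --conf="$conf" >/dev/null 2>&1; then
        echo "MHA Manager is still running; stop it first (e.g. masterha_stop --conf=$conf)" >&2
        return 1
    fi
    # Same options as the switchover command in this section.
    masterha_master_switch --conf="$conf" --master_state=alive \
        --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0 \
        --new_master_host="$new_host" --new_master_port="$new_port"
}
```

Example call on the monitor node: safe_switchover /etc/mha/mha.cnf 192.168.68.131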

[root@MHA-LHR-Monitor-ip134 /]# masterha_master_switch --conf=/etc/mha/mha.cnf  --master_state=alive \
> --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0 \
> --new_master_host=192.168.68.131 --new_master_port=3306
Sat Aug  8 11:26:36 2020 - [info] MHA::MasterRotate version 0.58.
Sat Aug  8 11:26:36 2020 - [info] Starting online master switch..
Sat Aug  8 11:26:36 2020 - [info] 
Sat Aug  8 11:26:36 2020 - [info] * Phase 1: Configuration Check Phase..
Sat Aug  8 11:26:36 2020 - [info] 
Sat Aug  8 11:26:36 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Aug  8 11:26:36 2020 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Sat Aug  8 11:26:36 2020 - [info] Reading server configuration from /etc/mha/mha.cnf..
Sat Aug  8 11:26:37 2020 - [info] GTID failover mode = 1
Sat Aug  8 11:26:37 2020 - [info] Current Alive Master: 192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:26:37 2020 - [info] Alive Slaves:
Sat Aug  8 11:26:37 2020 - [info]   192.168.68.131(192.168.68.131:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:26:37 2020 - [info]     GTID ON
Sat Aug  8 11:26:37 2020 - [info]     Replicating from 192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:26:37 2020 - [info]   192.168.68.133(192.168.68.133:3306)  Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sat Aug  8 11:26:37 2020 - [info]     GTID ON
Sat Aug  8 11:26:37 2020 - [info]     Replicating from 192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:26:37 2020 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Sat Aug  8 11:26:37 2020 - [info]  ok.
Sat Aug  8 11:26:37 2020 - [info] Checking MHA is not monitoring or doing failover..
Sat Aug  8 11:26:37 2020 - [info] Checking replication health on 192.168.68.131..
Sat Aug  8 11:26:37 2020 - [info]  ok.
Sat Aug  8 11:26:37 2020 - [info] Checking replication health on 192.168.68.133..
Sat Aug  8 11:26:37 2020 - [info]  ok.
Sat Aug  8 11:26:37 2020 - [info] 192.168.68.131 can be new master.
Sat Aug  8 11:26:37 2020 - [info] 
From:
192.168.68.132(192.168.68.132:3306) (current master)
 +--192.168.68.131(192.168.68.131:3306)
 +--192.168.68.133(192.168.68.133:3306)

To:
192.168.68.131(192.168.68.131:3306) (new master)
 +--192.168.68.133(192.168.68.133:3306)
 +--192.168.68.132(192.168.68.132:3306)
Sat Aug  8 11:26:37 2020 - [info] Checking whether 192.168.68.131(192.168.68.131:3306) is ok for the new master..
Sat Aug  8 11:26:37 2020 - [info]  ok.
Sat Aug  8 11:26:37 2020 - [info] 192.168.68.132(192.168.68.132:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Sat Aug  8 11:26:37 2020 - [info] 192.168.68.132(192.168.68.132:3306): Resetting slave pointing to the dummy host.
Sat Aug  8 11:26:37 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Sat Aug  8 11:26:37 2020 - [info] 
Sat Aug  8 11:26:37 2020 - [info] * Phase 2: Rejecting updates Phase..
Sat Aug  8 11:26:37 2020 - [info] 
Sat Aug  8 11:26:37 2020 - [info] Executing master ip online change script to disable write on the current master:
Sat Aug  8 11:26:37 2020 - [info]   /usr/local/mha/scripts/master_ip_online_change --command=stop --orig_master_host=192.168.68.132 --orig_master_ip=192.168.68.132 --orig_master_port=3306 --orig_master_user='mha' --new_master_host=192.168.68.131 --new_master_ip=192.168.68.131 --new_master_port=3306 --new_master_user='mha' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx
Sat Aug  8 11:26:37 2020 758267 Set read_only on the new master.. ok.
Sat Aug  8 11:26:37 2020 763087 Waiting all running 2 threads are disconnected.. (max 1500 milliseconds)
{'Time' => '1507','db' => undef,'Id' => '14','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.133:60218'}
{'Time' => '227','db' => undef,'Id' => '18','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.131:60292'}
Sat Aug  8 11:26:38 2020 267483 Waiting all running 2 threads are disconnected.. (max 1000 milliseconds)
{'Time' => '1508','db' => undef,'Id' => '14','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.133:60218'}
{'Time' => '228','db' => undef,'Id' => '18','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.131:60292'}
Sat Aug  8 11:26:38 2020 771131 Waiting all running 2 threads are disconnected.. (max 500 milliseconds)
{'Time' => '1508','db' => undef,'Id' => '14','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.133:60218'}
{'Time' => '228','db' => undef,'Id' => '18','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.131:60292'}
Sat Aug  8 11:26:39 2020 274787 Set read_only=1 on the orig master.. ok.
Sat Aug  8 11:26:39 2020 276192 Waiting all running 2 queries are disconnected.. (max 500 milliseconds)
{'Time' => '1509','db' => undef,'Id' => '14','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.133:60218'}
{'Time' => '229','db' => undef,'Id' => '18','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.68.131:60292'}
Sat Aug  8 11:26:39 2020 778101 Killing all application threads..
Sat Aug  8 11:26:39 2020 778804 done.
Disabling the VIP an old master: 192.168.68.132 
Warning: Executing wildcard deletion to stay compatible with old scripts.
         Explicitly specify the prefix length (192.168.68.135/32) to avoid this warning.
         This special behaviour is likely to disappear in further releases,
         fix your scripts!
Sat Aug  8 11:26:39 2020 - [info]  ok.
Sat Aug  8 11:26:39 2020 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Sat Aug  8 11:26:39 2020 - [info] Executing FLUSH TABLES WITH READ LOCK..
Sat Aug  8 11:26:39 2020 - [info]  ok.
Sat Aug  8 11:26:39 2020 - [info] Orig master binlog:pos is MHA-LHR-Slave1-ip132-bin.000008:234.
Sat Aug  8 11:26:39 2020 - [info]  Waiting to execute all relay logs on 192.168.68.131(192.168.68.131:3306)..
Sat Aug  8 11:26:39 2020 - [info]  master_pos_wait(MHA-LHR-Slave1-ip132-bin.000008:234) completed on 192.168.68.131(192.168.68.131:3306). Executed 0 events.
Sat Aug  8 11:26:39 2020 - [info]   done.
Sat Aug  8 11:26:39 2020 - [info] Getting new master's binlog name and position..
Sat Aug  8 11:26:39 2020 - [info]  MHA-LHR-Master1-ip131-bin.000013:234
Sat Aug  8 11:26:39 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.68.131', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sat Aug  8 11:26:39 2020 - [info] Executing master ip online change script to allow write on the new master:
Sat Aug  8 11:26:39 2020 - [info]   /usr/local/mha/scripts/master_ip_online_change --command=start --orig_master_host=192.168.68.132 --orig_master_ip=192.168.68.132 --orig_master_port=3306 --orig_master_user='mha' --new_master_host=192.168.68.131 --new_master_ip=192.168.68.131 --new_master_port=3306 --new_master_user='mha' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx
Sat Aug  8 11:26:40 2020 027564 Set read_only=0 on the new master.
Enabling the VIP 192.168.68.135 on the new master: 192.168.68.131 
Sat Aug  8 11:26:42 2020 - [info]  ok.
Sat Aug  8 11:26:42 2020 - [info] 
Sat Aug  8 11:26:42 2020 - [info] * Switching slaves in parallel..
Sat Aug  8 11:26:42 2020 - [info] 
Sat Aug  8 11:26:42 2020 - [info] -- Slave switch on host 192.168.68.133(192.168.68.133:3306) started, pid: 640
Sat Aug  8 11:26:42 2020 - [info] 
Sat Aug  8 11:26:44 2020 - [info] Log messages from 192.168.68.133 ...
Sat Aug  8 11:26:44 2020 - [info] 
Sat Aug  8 11:26:42 2020 - [info]  Waiting to execute all relay logs on 192.168.68.133(192.168.68.133:3306)..
Sat Aug  8 11:26:42 2020 - [info]  master_pos_wait(MHA-LHR-Slave1-ip132-bin.000008:234) completed on 192.168.68.133(192.168.68.133:3306). Executed 0 events.
Sat Aug  8 11:26:42 2020 - [info]   done.
Sat Aug  8 11:26:42 2020 - [info]  Resetting slave 192.168.68.133(192.168.68.133:3306) and starting replication from the new master 192.168.68.131(192.168.68.131:3306)..
Sat Aug  8 11:26:42 2020 - [info]  Executed CHANGE MASTER.
Sat Aug  8 11:26:43 2020 - [info]  Slave started.
Sat Aug  8 11:26:44 2020 - [info] End of log messages from 192.168.68.133 ...
Sat Aug  8 11:26:44 2020 - [info] 
Sat Aug  8 11:26:44 2020 - [info] -- Slave switch on host 192.168.68.133(192.168.68.133:3306) succeeded.
Sat Aug  8 11:26:44 2020 - [info] Unlocking all tables on the orig master:
Sat Aug  8 11:26:44 2020 - [info] Executing UNLOCK TABLES..
Sat Aug  8 11:26:44 2020 - [info]  ok.
Sat Aug  8 11:26:44 2020 - [info] Starting orig master as a new slave..
Sat Aug  8 11:26:44 2020 - [info]  Resetting slave 192.168.68.132(192.168.68.132:3306) and starting replication from the new master 192.168.68.131(192.168.68.131:3306)..
Sat Aug  8 11:26:44 2020 - [info]  Executed CHANGE MASTER.
Sat Aug  8 11:26:44 2020 - [info]  Slave started.
Sat Aug  8 11:26:44 2020 - [info] All new slave servers switched successfully.
Sat Aug  8 11:26:44 2020 - [info] 
Sat Aug  8 11:26:44 2020 - [info] * Phase 5: New master cleanup phase..
Sat Aug  8 11:26:44 2020 - [info] 
Sat Aug  8 11:26:44 2020 - [info]  192.168.68.131: Resetting slave info succeeded.
Sat Aug  8 11:26:44 2020 - [info] Switching master to 192.168.68.131(192.168.68.131:3306) completed successfully.

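3.3 Test scenario 2: manual failover after a master failure

Test scenario 1 covered the case where MHA performs the failover automatically after the master fails.

Test scenario 2 covers the case where the master fails while the MHA Manager process is not running, so we switch by hand. In other words, the master in the MySQL replication topology has crashed but MHA monitoring was not started, so the failover has to be triggered manually. The log output is essentially the same as for an automatic failover. Note that if the master has not actually crashed, a manual failover cannot be run; the command reports an error.

# Stop the master
docker stop MHA-LHR-Master1-ip131

# Run the manual failover on 134
masterha_master_switch --conf=/etc/mha/mha.cnf --master_state=dead --ignore_last_failover --interactive=0 \
--dead_master_host=192.168.68.131 --dead_master_port=3306 \
--new_master_host=192.168.68.132 --new_master_port=3306

Next, the crashed master has to be restored manually, which is not demonstrated again here. Note that a manual failover also sends an alert email.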
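
Since a manual failover is rejected when the master is still alive, it can help to probe the server before issuing the command. The sketch below uses mysqladmin ping for that probe, which is my own choice and not part of the original procedure; the wrapper name manual_failover and the 3-second timeout are likewise illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: manual_failover is a hypothetical wrapper; the liveness probe
# via mysqladmin ping is an assumption, not something MHA itself requires.
manual_failover() {
    local conf=$1 dead_host=$2 new_host=$3 port=${4:-3306}
    # mysqladmin ping exits 0 whenever the server answers, even on "access denied".
    if mysqladmin --host="$dead_host" --port="$port" --connect-timeout=3 ping >/dev/null 2>&1; then
        echo "$dead_host:$port still answers; use a switchover (--master_state=alive) instead" >&2
        return 1
    fi
    masterha_master_switch --conf="$conf" --master_state=dead --ignore_last_failover --interactive=0 \
        --dead_master_host="$dead_host" --dead_master_port="$port" \
        --new_master_host="$new_host" --new_master_port="$port"
}
```

Example call on 134: manual_failover /etc/mha/mha.cnf 192.168.68.131 192.168.68.132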

3.4 The mysql-utilities package

# Install the mysql-utilities package. It depends on Python 2.7, and the connector version must match or the tools fail.
rpm -e mysql-connector-python-2.1.8-1.el7.x86_64 --nodeps

# CentOS 7
rpm -Uvh http://repo.mysql.com/yum/mysql-connectors-community/el/7/x86_64/mysql-connector-python-1.1.6-1.el7.noarch.rpm
# CentOS 6
rpm -Uvh http://repo.mysql.com/yum/mysql-connectors-community/el/6/x86_64/mysql-connector-python-1.1.6-1.el6.noarch.rpm

yum install -y mysql-utilities

My image already has this installed and configured, so the tools can be run directly. Results:

[root@MHA-LHR-Monitor-ip134 /]# mysqlrplshow --master=root:lhr@192.168.68.131:3306 --discover-slaves-login=root:lhr --verbose
# master on 192.168.68.131: ... connected.
# Finding slaves for master: 192.168.68.131:3306

# Replication Topology Graph
192.168.68.131:3306 (MASTER)
   |
   +--- 192.168.68.132:3306 [IO: Yes, SQL: Yes] - (SLAVE)
   |
   +--- 192.168.68.133:3306 [IO: Yes, SQL: Yes] - (SLAVE)

[root@MHA-LHR-Monitor-ip134 /]# mysqlrplcheck --master=root:lhr@192.168.68.131:3306 --slave=root:lhr@192.168.68.132:3306 -v   
# master on 192.168.68.131: ... connected.
# slave on 192.168.68.132: ... connected.
Test Description                                                     Status
---------------------------------------------------------------------------
Checking for binary logging on master                                [pass]
Are there binlog exceptions?                                         [WARN]

+---------+--------+--------------------------------------------------+
| server  | do_db  | ignore_db                                        |
+---------+--------+--------------------------------------------------+
| master  |        | mysql,information_schema,performance_schema,sys  |
| slave   |        | information_schema,performance_schema,mysql,sys  |
+---------+--------+--------------------------------------------------+

Replication user exists?                                             [pass]
Checking server_id values                                            [pass]

 master id = 573306131
  slave id = 573306132

Checking server_uuid values                                          [pass]

 master uuid = c8ca4f1d-aec3-11ea-942b-0242c0a84483
  slave uuid = d24a77d1-aec3-11ea-9399-0242c0a84484

Is slave connected to master?                                        [pass]
Check master information file                                        [WARN]

Cannot read master information file from a remote machine.

Checking InnoDB compatibility                                        [pass]
Checking storage engines compatibility                               [pass]
Checking lower_case_table_names settings                             [pass]

  Master lower_case_table_names: 0
   Slave lower_case_table_names: 0

Checking slave delay (seconds behind master)                         [pass]
# ...done.
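
When mysqlrplcheck is run from a script, only the checks that did not pass are usually of interest. The filter below is a rough sketch: it relies on the status column layout shown above, and it assumes that failed checks are printed as [FAIL], a status that does not appear in this article's output. The function name rplcheck_problems is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: rplcheck_problems is a hypothetical filter for mysqlrplcheck output.
# Reads the report on stdin, prints every check whose status is [WARN] or [FAIL],
# and returns 1 if at least one check is [FAIL] (assumed status label).
rplcheck_problems() {
    awk '
        BEGIN { failed = 0 }
        /\[(WARN|FAIL)\][ \t]*$/ { print; if ($NF == "[FAIL]") failed = 1 }
        END { exit failed }
    '
}
```

Usage would be to pipe the tool into it, for example: mysqlrplcheck --master=... --slave=... | rplcheck_problems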

That concludes the common MHA tests. For more detail, ask about 小麥苗's MySQL DBA course.

About Me

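● Author: 小麥苗. Some content was compiled from the Internet; if anything infringes your rights, contact 小麥苗 to have it removed
● This article is also published on the author's WeChat official account (DB寶)
● QQ groups: 230161599, 618766405; message privately for the WeChat group
● Personal QQ (646634621), WeChat ID (db_bao); please state your reason when adding
● Completed in Xi'an in February 2021
● Last modified: February 2021
● All rights reserved. Sharing is welcome; please keep the attribution when reposting

● 小麥苗's Weidian shop: https://weidian.com/?userid=793741433
● Database books published by 小麥苗: http://blog.itpub.net/26736162/viewspace-2142121/
● 小麥苗's OCP, OCM, high availability, and DBA classes (Oracle, MySQL, NoSQL): http://blog.itpub.net/26736162/viewspace-2148098/
● Database written-test and interview question bank with answers: https://mp.weixin.qq.com/s/Vm5PqNcDcITkOr9cQg6T7w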
