【MHA】mysql高可用之MHA
mysql高可用MHA的搭建與測試
規劃
192.168.186.141 MYSQL.COM |
主: node |
192.168.186.142 SLAVE1.COM |
從1: node |
192.168.186.146 SLAVE2.COM |
從2: node |
192.168.186.144 MANAGER.COM | 管理機:manager node |
一:搭建mysql 主從複製,配置主機間的ssh互信(免密碼登陸)
注意點:在主上建立複製使用者的時候在從上也建立一遍搭建半同步複製:原因:使用半同步複製可以解決當主伺服器當掉後,無法ssh到主伺服器上儲存尚未傳送的二進位制日誌事件。
二:安裝node與manager
安裝perl環境(每臺機器都需要做的)-
[root@SLAVE2 data]# yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles ncftp perl-Params-Validate perl-CPAN perl-Test-Mock-LWP.noarch perl-LWP-Authen-Negotiate.noarch perl-devel
- [root@SLAVE2 data]#yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker 這個百度文件沒有但是顯然是要裝的
安裝node(每臺機器都要做的)
-
[root@MANAGER src]# tar -xf mha4mysql-node-0.56.tar.gz
-
[root@MANAGER src]# cd mha4mysql-node-0.53
-
[root@MANAGER mha4mysql-node-0.53]# perl Makefile.PL
- [root@MANAGER mha4mysql-node-0.53]# make && make install
安裝manager(管理機器上要安裝的,就是管理機器上同時存在node與manager )
點選(此處)摺疊或開啟
-
[root@MANAGER src]# tar -xf mha4mysql-manager-0.56.tar.gz
-
[root@MANAGER src]# cd mha4mysql-manager-0.53
-
[root@MANAGER mha4mysql-manager-0.53]# perl Makefile.PL
-
[root@MANAGER mha4mysql-manager-0.53]# make && make install
-
根據提示輸入 如果中間有卡頓現象 直接crtl+c 然後繼續會出現下載的進度條,說明是正常的
-
[root@MANAGER src]# mkdir /etc/masterha
-
[root@MANAGER mha]# mkdir -p /master/app1
-
[root@MANAGERmha]# mkdir -p /scripts
-
[root@MANAGER mha]# cp samples/conf/* /etc/masterha/ --複製配置檔案
-
[root@MANAGERmha]# cp samples/scripts/* /scripts
- [root@MANAGER mha4mysql-manager-0.53]# cp samples/conf/* /etc/masterha/
三:配置
-
[root@MANAGER masterha]# vi app1.cnf
-
內容如下;
-
[server default]
-
manager_workdir=/masterha/app1
-
manager_log=/masterha/app1/manager.log
-
user=mha_mon 監控使用者,每臺機器上都要建的
-
password=123
-
ssh_user=root ssh無密碼的使用者
-
repl_user=slave 做主從的使用者 每臺都要授權的
-
repl_password=yunwei123 做主從的密碼
-
ping_interval=1
- shutdown_script=""
-
report_script="" ###切換時發郵件
-
####master_ip_failover_script=/usr/local/bin/master_ip_failover ###自動切換
####master_ip_online_change_script=/usr/local/bin/master_ip_online_change_script ###手工切換,瞭解這兩個指令碼可參考http://blog.csdn.net/largetalk/article/details/10006899
####secondary_check_script= /usr/local/bin/masterha_secondary_check -s node1 -s node2 --user=root --master_host=node2 --master_ip=192.168.6.115 --master_port=3306
####//一旦MHA到node2(主)的監控之間出現問題,MHA Manager將會嘗試從node1(從,主機名)登入到node2(主,主機名),檢查更詳細
-
[server1]
-
hostname=192.168.186.141
-
master_binlog_dir=/data/mysql/data ####binlog目錄
-
candidate_master=1
-
-
[server2]
-
hostname=192.168.186.142
-
master_binlog_dir=/data/mysql/data ####binlog目錄
-
candidate_master=1 ---備主,當master當掉之後會自動切換這臺為主機
-
-
[server3]
-
hostname=192.168.186.146
-
master_binlog_dir=/data/mysql/data ####binlog目錄
-
no_master=1 -----表示永不做主
-
-
儲存退出!
- [root@MANAGER masterha]# >masterha_default.cnf
-
登入每臺資料庫 mysql> grant all privileges on *.* to mha_mon@'%' identified by '123'; Query OK, 0 rows affected (1.00 sec) mysql> flush privileges; Query OK, 0 rows affected (0.01 sec)
四:測試ssh,與複製
測試ssh-
[root@MANAGER masterha]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.cnf
-
Wed Jul 9 02:26:57 2014 - [info] Reading default configuratoins from /etc/masterha/masterha_default.cnf..
-
Wed Jul 9 02:26:57 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
-
Wed Jul 9 02:26:57 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
-
Wed Jul 9 02:26:57 2014 - [info] Starting SSH connection tests..
-
Wed Jul 9 02:26:58 2014 - [debug]
-
Wed Jul 9 02:26:57 2014 - [debug] Connecting via SSH from root@192.168.186.141(192.168.186.141:22) to root@192.168.186.142(192.168.186.142:22)..
-
Wed Jul 9 02:26:57 2014 - [debug] ok.
-
Wed Jul 9 02:26:57 2014 - [debug] Connecting via SSH from root@192.168.186.141(192.168.186.141:22) to root@192.168.186.146(192.168.186.146:22)..
-
Wed Jul 9 02:26:57 2014 - [debug] ok.
-
Wed Jul 9 02:26:58 2014 - [debug]
-
Wed Jul 9 02:26:57 2014 - [debug] Connecting via SSH from root@192.168.186.142(192.168.186.142:22) to root@192.168.186.141(192.168.186.141:22)..
-
Wed Jul 9 02:26:57 2014 - [debug] ok.
-
Wed Jul 9 02:26:57 2014 - [debug] Connecting via SSH from root@192.168.186.142(192.168.186.142:22) to root@192.168.186.146(192.168.186.146:22)..
-
-
Wed Jul 9 02:26:58 2014 - [debug] ok.
-
Wed Jul 9 02:26:58 2014 - [debug]
-
Wed Jul 9 02:26:58 2014 - [debug] Connecting via SSH from root@192.168.186.146(192.168.186.146:22) to root@192.168.186.141(192.168.186.141:22)..
-
Wed Jul 9 02:26:58 2014 - [debug] ok.
-
Wed Jul 9 02:26:58 2014 - [debug] Connecting via SSH from root@192.168.186.146(192.168.186.146:22) to root@192.168.186.142(192.168.186.142:22)..
-
Wed Jul 9 02:26:58 2014 - [debug] ok.
- Wed Jul 9 02:26:58 2014 - [info] All SSH connection tests passed successfully.
注意
[root@SLAVE1 ~]# ln -s /usr/local/mysql/bin/* /usr/bin ---rpm安裝的mysql不需要,原始碼安裝的就這樣
在每臺 MYSQL 伺服器上做這件事情 極度重要哦
mysql>set global read_only=1; set global relay_log_purge=0; --read-only的意義是普通使用者沒有create,update,insert等修改的許可權,而具有super-privs超級許可權(all on *.*)的是不受這個引數約束的!
在備用節點和從節點的/etc/mysql/my.cnf中加入選項:
read_only=1 #這個設定待商榷,備選master設為read only之後,master轉移到備選master後不可寫(有super許可權的使用者還是可寫,切換到新的主庫上後,會在新的主庫上執行 SET GLOBAL read_only =0 ),,這個引數開放設定,根據環境來設定
relay_log_purge=0(這個值主從上都寫上,因為mha主從補齊資料會用到relay_log)
slave-skip-errors=1396
為什麼要跳過這個錯誤呢 因為啊在主裡面刪除使用者的時候 從會報錯說沒有這個使用者所以跳過這個錯誤吧
如果資料庫存在空的使用者 域名的使用者 一定要刪除否則 MHA 連線 MYSQL 會報錯連不上 一般只要在從上面刪除 如果直接沒刪除也 OK 那就 OK 吧如果報錯登入不了就刪除掉吧 或者跳過域名解析,或者你授權的時候記得也授權域名等等方法多種 我的方式是跳過域名解析的 skip-name-resolve
mysql> select user,host from mysql.user; +---------+---------------+ | user | host | +---------+---------------+ | root | 127.0.0.1 | | mha_mon | 192.168.186.% | | repl | 192.168.186.% | | slave | 192.168.186.% | | root | ::1 | | | SLAVE2.COM | | root | SLAVE2.COM | | root | localhost | +---------+---------------+ 8 rows in set (0.00 sec) mysql> drop user 'root'@SLAVE2.COM; Query OK, 0 rows affected (0.00 sec)
測試複製程式
點選(此處)摺疊或開啟
-
[root@MANAGER masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf
-
Wed Jul 9 04:23:16 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
-
Wed Jul 9 04:23:16 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
-
Wed Jul 9 04:23:16 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
-
Wed Jul 9 04:23:16 2014 - [info] MHA::MasterMonitor version 0.53.
-
Wed Jul 9 04:23:17 2014 - [info] Dead Servers:
-
Wed Jul 9 04:23:17 2014 - [info] Alive Servers:
-
Wed Jul 9 04:23:17 2014 - [info] 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 04:23:17 2014 - [info] 192.168.186.142(192.168.186.142:3306)
-
Wed Jul 9 04:23:17 2014 - [info] SLAVE2.COM(192.168.186.146:3306)
-
Wed Jul 9 04:23:17 2014 - [info] Alive Slaves:
-
Wed Jul 9 04:23:17 2014 - [info] 192.168.186.142(192.168.186.142:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 04:23:17 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 04:23:17 2014 - [info] Primary candidate for the new Master (candidate_master is set)
-
Wed Jul 9 04:23:17 2014 - [info] SLAVE2.COM(192.168.186.146:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 04:23:17 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 04:23:17 2014 - [info] Not candidate for the new Master (no_master is set)
-
Wed Jul 9 04:23:17 2014 - [info] Current Alive Master: 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 04:23:17 2014 - [info] Checking slave configurations..
-
Wed Jul 9 04:23:17 2014 - [info] Checking replication filtering settings..
-
Wed Jul 9 04:23:17 2014 - [info] binlog_do_db= , binlog_ignore_db=
-
Wed Jul 9 04:23:17 2014 - [info] Replication filtering check ok.
-
Wed Jul 9 04:23:17 2014 - [info] Starting SSH connection tests..
-
Wed Jul 9 04:23:18 2014 - [info] All SSH connection tests passed successfully.
-
Wed Jul 9 04:23:18 2014 - [info] Checking MHA Node version..
-
Wed Jul 9 04:23:19 2014 - [info] Version check ok.
-
Wed Jul 9 04:23:19 2014 - [info] Checking SSH publickey authentication settings on the current master..
-
Wed Jul 9 04:23:19 2014 - [info] HealthCheck: SSH to 192.168.186.141 is reachable.
-
Wed Jul 9 04:23:19 2014 - [info] Master MHA Node version is 0.53.
-
Wed Jul 9 04:23:19 2014 - [info] Checking recovery script configurations on the current master..
-
Wed Jul 9 04:23:19 2014 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.53 --start_file=mysql-bin.000001
-
Wed Jul 9 04:23:19 2014 - [info] Connecting to root@192.168.186.141(192.168.186.141)..
-
Creating /var/tmp if not exists.. ok.
-
Checking output directory is accessible or not..
-
ok.
-
Binlog found at /data/mysql/data, up to mysql-bin.000001
-
Wed Jul 9 04:23:20 2014 - [info] Master setting check done.
-
Wed Jul 9 04:23:20 2014 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
-
Wed Jul 9 04:23:20 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=mha_mon --slave_host=192.168.186.142 --slave_ip=192.168.186.142 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.10-log --manager_version=0.53 --relay_log_info=/data/mysql/data/relay-log.info --relay_dir=/data/mysql/data/ --slave_pass=xxx
-
Wed Jul 9 04:23:20 2014 - [info] Connecting to root@192.168.186.142(192.168.186.142:22)..
-
Checking slave recovery environment settings..
-
Opening /data/mysql/data/relay-log.info ... ok.
-
Relay log found at /data/mysql/data, up to SLAVE1-relay-bin.000002
-
Temporary relay log file is /data/mysql/data/SLAVE1-relay-bin.000002
-
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
-
done.
-
Testing mysqlbinlog output.. done.
-
Cleaning up test file(s).. done.
-
Wed Jul 9 04:23:20 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=mha_mon --slave_host=SLAVE2.COM --slave_ip=192.168.186.146 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.10-log --manager_version=0.53 --relay_log_info=/data/mysql/data/relay-log.info --relay_dir=/data/mysql/data/ --slave_pass=xxx
-
Wed Jul 9 04:23:20 2014 - [info] Connecting to root@192.168.186.146(SLAVE2.COM:22)..
-
Checking slave recovery environment settings..
-
Opening /data/mysql/data/relay-log.info ... ok.
-
Relay log found at /data/mysql/data, up to slave2-relay-bin.000002
-
Temporary relay log file is /data/mysql/data/slave2-relay-bin.000002
-
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
-
done.
-
Testing mysqlbinlog output.. done.
-
Cleaning up test file(s).. done.
-
Wed Jul 9 04:23:21 2014 - [info] Slaves settings check done.
-
Wed Jul 9 04:23:21 2014 - [info]
-
192.168.186.141 (current master)
-
+--192.168.186.142
-
+--SLAVE2.COM
-
-
Wed Jul 9 04:23:21 2014 - [info] Checking replication health on 192.168.186.142..
-
Wed Jul 9 04:23:21 2014 - [info] ok.
-
Wed Jul 9 04:23:21 2014 - [info] Checking replication health on SLAVE2.COM..
-
Wed Jul 9 04:23:21 2014 - [info] ok.
-
Wed Jul 9 04:23:21 2014 - [warning] master_ip_failover_script is not defined.
-
Wed Jul 9 04:23:21 2014 - [warning] shutdown_script is not defined.
-
Wed Jul 9 04:23:21 2014 - [info] Got exit code 0 (Not master dead).
-
-
MySQL Replication Health is OK.
-
- 至此說明你的MHA 已經配置好了
五:啟動/關閉mha
-
[root@dg55 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &
-
[1] 25516
-
[root@dg55 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf --檢查一下
- app1 (pid:25516) is running(0:PING_OK), master:192.168.123.13
-
-
[root@node4 app1]#masterha_stop --conf=/etc/app1.cnf --關閉
# 如果不能停止, 加 --abort選項
六測試故障轉移
將 MYSQL.COM 機器上的 MYSQL 服務關閉 ,注意觀察 manager.log 日誌會發現 切換到了 SLAVE1.COM 並且 SLAVE1.COM 變成了主 而 SLAVE2.COM 則變成了 SLAVE1.COM 的從
點選(此處)摺疊或開啟
-
root@MANAGER app1]# tail -f manager.log 這是啟動後還沒關閉主資料庫的日誌內容
-
192.168.186.141 (current master)
-
+--192.168.186.142
-
+--SLAVE2.COM
-
-
Wed Jul 9 18:52:32 2014 - [warning] master_ip_failover_script is not defined.
-
Wed Jul 9 18:52:32 2014 - [warning] shutdown_script is not defined.
-
Wed Jul 9 18:52:32 2014 - [info] Set master ping interval 1 seconds.
-
Wed Jul 9 18:52:32 2014 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
-
Wed Jul 9 18:52:32 2014 - [info] Starting ping health check on 192.168.186.141(192.168.186.141:3306)..
-
Wed Jul 9 18:52:32 2014 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
-
-
[root@MYSQL ~]# service mysqld stop
-
Shutting down MySQL..... SUCCESS!
-
-
-
[root@MANAGER app1]# tail -f manager.log 最要看最後幾行 就知道有沒有切換成功
-
192.168.186.141 (current master)
-
+--192.168.186.142
-
+--SLAVE2.COM
-
Wed Jul 9 18:56:47 2014 - [info] Dead Servers:
-
Wed Jul 9 18:56:47 2014 - [info] 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Alive Servers:
-
Wed Jul 9 18:56:47 2014 - [info] 192.168.186.142(192.168.186.142:3306)
-
Wed Jul 9 18:56:47 2014 - [info] SLAVE2.COM(192.168.186.146:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Alive Slaves:
-
Wed Jul 9 18:56:47 2014 - [info] 192.168.186.142(192.168.186.142:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:47 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Primary candidate for the new Master (candidate_master is set)
-
Wed Jul 9 18:56:47 2014 - [info] SLAVE2.COM(192.168.186.146:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:47 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Not candidate for the new Master (no_master is set)
-
Wed Jul 9 18:56:47 2014 - [info] Checking slave configurations..
-
Wed Jul 9 18:56:47 2014 - [info] Checking replication filtering settings..
-
Wed Jul 9 18:56:47 2014 - [info] Replication filtering check ok.
-
Wed Jul 9 18:56:47 2014 - [info] Master is down!
-
Wed Jul 9 18:56:47 2014 - [info] Terminating monitoring script.
-
Wed Jul 9 18:56:47 2014 - [info] Got exit code 20 (Master dead).
-
Wed Jul 9 18:56:47 2014 - [info] MHA::MasterFailover version 0.53.
-
Wed Jul 9 18:56:47 2014 - [info] Starting master failover.
-
Wed Jul 9 18:56:47 2014 - [info] * Phase 2: Dead Master Shutdown Phase completed.
-
Wed Jul 9 18:56:47 2014 - [info] 192.168.186.142(192.168.186.142:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:47 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Primary candidate for the new Master (candidate_master is set)
-
Wed Jul 9 18:56:47 2014 - [info] SLAVE2.COM(192.168.186.146:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:47 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Not candidate for the new Master (no_master is set)
-
Wed Jul 9 18:56:47 2014 - [info] The oldest binary log file/position on all slaves is mysql-bin.000001:214
-
Wed Jul 9 18:56:47 2014 - [info] Oldest slaves:
-
Wed Jul 9 18:56:47 2014 - [info] 192.168.186.142(192.168.186.142:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:47 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Primary candidate for the new Master (candidate_master is set)
-
Wed Jul 9 18:56:47 2014 - [info] SLAVE2.COM(192.168.186.146:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:47 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:47 2014 - [info] Not candidate for the new Master (no_master is set)
-
Wed Jul 9 18:56:47 2014 - [info]
-
Wed Jul 9 18:56:47 2014 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
-
Wed Jul 9 18:56:47 2014 - [info]
-
Wed Jul 9 18:56:48 2014 - [info] Fetching dead master's binary logs..
-
Wed Jul 9 18:56:48 2014 - [info] Executing command on the dead master 192.168.186.141(192.168.186.141:3306): save_binary_logs --command=save --start_file=mysql-bin.000001 --start_pos=214 --binlog_dir=/data/mysql/data --output_file=/var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53
-
Creating /var/tmp if not exists.. ok.
-
Concat binary/relay logs from mysql-bin.000001 pos 214 to mysql-bin.000001 EOF into /var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog ..
-
Dumping binlog format description event, from position 0 to 120.. ok.
-
Dumping effective binlog data from /data/mysql/data/mysql-bin.000001 position 214 to tail(237).. ok.
-
Concat succeeded.
-
Wed Jul 9 18:56:48 2014 - [info] scp from root@192.168.186.141:/var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog to local:/masterha/app1/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog succeeded.
-
Wed Jul 9 18:56:49 2014 - [info] HealthCheck: SSH to 192.168.186.142 is reachable.
-
Wed Jul 9 18:56:49 2014 - [info] HealthCheck: SSH to SLAVE2.COM is reachable.
-
Wed Jul 9 18:56:49 2014 - [info]
-
Wed Jul 9 18:56:49 2014 - [info] * Phase 3.3: Determining New Master Phase..
-
Wed Jul 9 18:56:49 2014 - [info]
-
Wed Jul 9 18:56:49 2014 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
-
Wed Jul 9 18:56:49 2014 - [info] All slaves received relay logs to the same position. No need to resync each other.
-
Wed Jul 9 18:56:49 2014 - [info] Searching new master from slaves..
-
Wed Jul 9 18:56:49 2014 - [info] Candidate masters from the configuration file:
-
Wed Jul 9 18:56:49 2014 - [info] 192.168.186.142(192.168.186.142:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:49 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:49 2014 - [info] Primary candidate for the new Master (candidate_master is set)
-
Wed Jul 9 18:56:49 2014 - [info] Non-candidate masters:
-
Wed Jul 9 18:56:49 2014 - [info] SLAVE2.COM(192.168.186.146:3306) Version=5.6.10-log (oldest major version between slaves) log-bin:enabled
-
Wed Jul 9 18:56:49 2014 - [info] Replicating from 192.168.186.141(192.168.186.141:3306)
-
Wed Jul 9 18:56:49 2014 - [info] Not candidate for the new Master (no_master is set)
-
Wed Jul 9 18:56:49 2014 - [info] Searching from candidate_master slaves which have received the latest relay log events..
-
Wed Jul 9 18:56:49 2014 - [info] New master is 192.168.186.142(192.168.186.142:3306)
-
Wed Jul 9 18:56:49 2014 - [info] Starting master failover..
-
Wed Jul 9 18:56:49 2014 - [info]
-
From:
-
192.168.186.141 (current master)
-
+--192.168.186.142
-
+--SLAVE2.COM
-
-
To:
-
192.168.186.142 (new master)
-
+--SLAVE2.COM
-
Wed Jul 9 18:56:49 2014 - [info]
-
Wed Jul 9 18:56:49 2014 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
-
Wed Jul 9 18:56:49 2014 - [info]
-
Wed Jul 9 18:56:49 2014 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
-
Wed Jul 9 18:56:49 2014 - [info] Sending binlog..
-
Wed Jul 9 18:56:50 2014 - [info] scp from local:/masterha/app1/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog to root@192.168.186.142:/var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog succeeded.
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] * Phase 3.4: Master Log Apply Phase..
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
-
Wed Jul 9 18:56:50 2014 - [info] Starting recovery on 192.168.186.142(192.168.186.142:3306)..
-
Wed Jul 9 18:56:50 2014 - [info] Generating diffs succeeded.
-
Wed Jul 9 18:56:50 2014 - [info] Waiting until all relay logs are applied.
-
Wed Jul 9 18:56:50 2014 - [info] done.
-
Wed Jul 9 18:56:50 2014 - [info] Getting slave status..
-
Wed Jul 9 18:56:50 2014 - [info] This slave(192.168.186.142)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000001:214). No need to recover from Exec_Master_Log_Pos.
-
Wed Jul 9 18:56:50 2014 - [info] Connecting to the target slave host 192.168.186.142, running recover script..
-
Wed Jul 9 18:56:50 2014 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=mha_mon --slave_host=192.168.186.142 --slave_ip=192.168.186.142 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog --workdir=/var/tmp --target_version=5.6.10-log --timestamp=20140709185647 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53 --slave_pass=xxx
-
Wed Jul 9 18:56:50 2014 - [info]
-
MySQL client version is 5.6.10. Using --binary-mode.
-
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog on 192.168.186.142:3306. This may take long time...
-
Applying log files succeeded.
-
Wed Jul 9 18:56:50 2014 - [info] All relay logs were successfully applied.
-
Wed Jul 9 18:56:50 2014 - [info] Getting new master's binlog name and position..
-
Wed Jul 9 18:56:50 2014 - [info] mysql-bin.000007:504
-
Wed Jul 9 18:56:50 2014 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.186.142', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000007', MASTER_LOG_POS=504, MASTER_USER='repl', MASTER_PASSWORD='xxx';
-
Wed Jul 9 18:56:50 2014 - [warning] master_ip_failover_script is not set. Skipping taking over new master ip address.
-
Wed Jul 9 18:56:50 2014 - [info] Setting read_only=0 on 192.168.186.142(192.168.186.142:3306)..
-
Wed Jul 9 18:56:50 2014 - [info] ok.
-
Wed Jul 9 18:56:50 2014 - [info] ** Finished master recovery successfully.
-
Wed Jul 9 18:56:50 2014 - [info] * Phase 3: Master Recovery Phase completed.
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] * Phase 4: Slaves Recovery Phase..
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] -- Slave diff file generation on host SLAVE2.COM(192.168.186.146:3306) started, pid: 3135. Check tmp log /masterha/app1/SLAVE2.COM_3306_20140709185647.log if it takes time..
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] Log messages from SLAVE2.COM ...
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
-
Wed Jul 9 18:56:50 2014 - [info] End of log messages from SLAVE2.COM.
-
Wed Jul 9 18:56:50 2014 - [info] -- SLAVE2.COM(192.168.186.146:3306) has the latest relay log events.
-
Wed Jul 9 18:56:50 2014 - [info] Generating relay diff files from the latest slave succeeded.
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
-
Wed Jul 9 18:56:50 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] -- Slave recovery on host SLAVE2.COM(192.168.186.146:3306) started, pid: 3137. Check tmp log /masterha/app1/SLAVE2.COM_3306_20140709185647.log if it takes time..
-
Wed Jul 9 18:56:51 2014 - [info]
-
Wed Jul 9 18:56:51 2014 - [info] Log messages from SLAVE2.COM ...
-
Wed Jul 9 18:56:51 2014 - [info]
-
Wed Jul 9 18:56:50 2014 - [info] Sending binlog..
-
Wed Jul 9 18:56:51 2014 - [info] scp from local:/masterha/app1/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog to root@SLAVE2.COM:/var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog succeeded.
-
Wed Jul 9 18:56:51 2014 - [info] Starting recovery on SLAVE2.COM(192.168.186.146:3306)..
-
Wed Jul 9 18:56:51 2014 - [info] Generating diffs succeeded.
-
Wed Jul 9 18:56:51 2014 - [info] Waiting until all relay logs are applied.
-
Wed Jul 9 18:56:51 2014 - [info] done.
-
Wed Jul 9 18:56:51 2014 - [info] Getting slave status..
-
Wed Jul 9 18:56:51 2014 - [info] This slave(SLAVE2.COM)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000001:214). No need to recover from Exec_Master_Log_Pos.
-
Wed Jul 9 18:56:51 2014 - [info] Connecting to the target slave host SLAVE2.COM, running recover script..
-
Wed Jul 9 18:56:51 2014 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=mha_mon --slave_host=SLAVE2.COM --slave_ip=192.168.186.146 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog --workdir=/var/tmp --target_version=5.6.10-log --timestamp=20140709185647 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53 --slave_pass=xxx
-
Wed Jul 9 18:56:51 2014 - [info]
-
MySQL client version is 5.6.10. Using --binary-mode.
-
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.186.141_3306_20140709185647.binlog on SLAVE2.COM:3306. This may take long time...
-
Applying log files succeeded.
-
Wed Jul 9 18:56:51 2014 - [info] All relay logs were successfully applied.
-
Wed Jul 9 18:56:51 2014 - [info] Resetting slave SLAVE2.COM(192.168.186.146:3306) and starting replication from the new master 192.168.186.142(192.168.186.142:3306)..
-
Wed Jul 9 18:56:51 2014 - [info] Executed CHANGE MASTER.
-
Wed Jul 9 18:56:51 2014 - [info] Slave started.
-
Wed Jul 9 18:56:51 2014 - [info] End of log messages from SLAVE2.COM.
-
Wed Jul 9 18:56:51 2014 - [info] -- Slave recovery on host SLAVE2.COM(192.168.186.146:3306) succeeded.
-
Wed Jul 9 18:56:51 2014 - [info] All new slave servers recovered successfully.
-
Wed Jul 9 18:56:51 2014 - [info]
-
Wed Jul 9 18:56:51 2014 - [info] * Phase 5: New master cleanup phease..
-
Wed Jul 9 18:56:51 2014 - [info]
-
Wed Jul 9 18:56:51 2014 - [info] Resetting slave info on the new master..
-
Wed Jul 9 18:56:51 2014 - [info] 192.168.186.142: Resetting slave info succeeded.
-
Wed Jul 9 18:56:51 2014 - [info] Master failover to 192.168.186.142(192.168.186.142:3306) completed successfully.
-
Wed Jul 9 18:56:51 2014 - [info]
-
-
----- Failover Report -----
-
-
app1: MySQL Master failover 192.168.186.141 to 192.168.186.142 succeeded
-
-
Master 192.168.186.141 is
-
-
Check MHA Manager logs at MANAGER.COM:/masterha/app1/manager.log for details.
-
-
Started automated(non-interactive) failover.
-
The latest slave 192.168.186.142(192.168.186.142:3306) has all relay logs for recovery.
-
Selected 192.168.186.142 as a new master.
-
192.168.186.142: OK: Applying all logs succeeded.
-
SLAVE2.COM: This host has the latest relay log events.
-
Generating relay diff files from the latest slave succeeded.
-
SLAVE2.COM: OK: Applying all logs succeeded. Slave started, replicating from 192.168.186.142.
-
192.168.186.142: Resetting slave info succeeded.
-
192.168.186.142: Resetting slave info succeeded.
-
看到192.168.186.142: Resetting slave info succeeded.
-
-
Master failover to 192.168.186.142(192.168.186.142:3306) completed successfully. 說明切換成功了
-
-
[root@slave2 ~]# mysql -e "show slave status\G"
-
*************************** 1. row ***************************
-
Slave_IO_State: Waiting for master to send event
-
Master_Host: 192.168.186.142
-
Master_User: repl
-
Master_Port: 3306
-
Connect_Retry: 60
-
Master_Log_File: mysql-bin.000007
-
Read_Master_Log_Pos: 504
-
Relay_Log_File: slave2-relay-bin.000002
-
Relay_Log_Pos: 283
-
Relay_Master_Log_File: mysql-bin.000007
-
Slave_IO_Running: Yes
-
Slave_SQL_Running: Yes
- 看到已經切換到和142同步了 本來是和141同步的 此時的SALVE1.COM已經變成主了 說明已然生效
切換後舊master的修復及重新上線
master已經由192.168.17.199 3306 切到了192.168.17.200 3306 實際環境中資料是在不斷的變化的,而在切換點mha沒有記錄當時新master的log-file和log-pos 所以要想直接啟動192.168.17.199 3306 然後change master to 192.168.17.200 3306的話是不行的,只能對新主或slave2做一個全備然後再恢復再change。
另外,當執行切換後管理節點上的masterha_manager程式會自動stop,所以等修復好後要再次執行啟動----不要糾結ip不對,這個地方是個思路
重構
重構我想就不要我做了吧,就是這時候等於你的主掛了 切換在 SLAVE1.COM 上變成了主 因此重構我提供一種方案(方案多種),拿出一臺新的伺服器重新加入以 142為主做成從 再 app1.conf 配置檔案中加入 新機器的資訊
[root@MANAGER app1]# rm -rf app1.failover.complete刪除該檔案後 再次啟動 manager 端即可
七.擴充套件 Keepalived
-
[root@MYSQL src]# wget http://www.keepalived.org/software/keepalived-1.2.12.tar.gz
-
[root@MYSQL src]# tar -xf keepalived-1.2.12.tar.gz
-
[root@MYSQL src]# cd keepalived-1.2.12
-
[root@MYSQL src]# yum -y install gcc gcc-c++ gcc-g77 ncurses-devel bison libaio-devel cmake libnl* libpopt* popt-static openssl-devel
-
[root@MYSQL keepalived-1.2.12]# ./configure
-
[root@MYSQL keepalived-1.2.12]# make && make install
-
[root@MYSQL src]#mkdir /etc/keepalived/
-
-
[root@MYSQL src]# cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
-
[root@MYSQL src]# cp /usr/local/etc/rc.d/init.d/keepalived /etc/init.d/
-
[root@MYSQL src]# cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
- [root@MYSQL src]#cp /usr/local/sbin/keepalived /usr/sbin/
2. 配置 keepalived
-
[root@MYSQL keepalived]# vi keepalived.conf
-
! Configuration File for keepalived
-
-
global_defs {
-
notification_email {
-
acassen@firewall.loc
-
failover@firewall.loc
-
sysadmin@firewall.loc
-
}
-
notification_email_from Alexandre.Cassen@firewall.loc
-
smtp_server 192.168.200.1
-
smtp_connect_timeout 30
-
router_id LVS_DEVEL ##配置是為了標識當前節點,兩個節點的此項設定可相同,也可不相同
-
}
-
-
vrrp_instance VI_1 {
-
state MASTER #指定A節點為主節點 備用節點上設定為BACKUP即可
-
interface eth0
-
virtual_router_id 51 #VRRP組名,兩個節點的設定必須一樣,以指明各個節點屬於同一VRRP
-
priority 100 #主節點的優先順序(1-254之間),備用節點必須比主節點優先順序低
-
advert_int 1
-
authentication { #設定驗證資訊,兩個節點必須一致
-
auth_type PASS
-
auth_pass 1111
-
}
-
virtual_ipaddress {
-
192.168.200.16 =====vip,指定一個vip
-
}
-
}
-
[root@MYSQL keepalived]# vi keepalived.conf
-
! Configuration File for keepalived
-
-
global_defs {
-
notification_email {
-
acassen@firewall.loc
-
failover@firewall.loc
-
sysadmin@firewall.loc
-
}
-
notification_email_from Alexandre.Cassen@firewall.loc
-
smtp_server 192.168.200.1
-
smtp_connect_timeout 30
-
router_id LVS_DEVEL ##配置是為了標識當前節點,兩個節點的此項設定可相同,也可不相同
-
}
-
-
vrrp_instance VI_1 {
- state MASTER #指定A節點為主節點 備用節點上設定為BACKUP即可
-
interface eth0
-
virtual_router_id 51 #VRRP組名,兩個節點的設定必須一樣,以指明各個節點屬於同一VRRP
-
priority 100 #主節點的優先順序(1-254之間),備用節點必須比主節點優先順序低
-
advert_int 1
-
authentication { #設定驗證資訊,兩個節點必須一致
-
auth_type PASS
-
auth_pass 1111
-
}
-
virtual_ipaddress {
-
192.168.200.16 =====vip,指定一個vip
-
}
- }
3. 檢測 mysql 服務指令碼
-
[root@MYSQL keepalived]# vi /root/check_mysql.sh
-
[root@node1 ~]# vi checkmysql.sh
#!/bin/bash
MYSQL=/usr/bin/mysql
MYSQL_HOST=127.0.0.1
MYSQL_USER=root
MYSQL_PASSWORD=ESBecs00
CHECK_TIME=3
MYSQL_OK=1
function check_mysql_helth (){
$MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p$MYSQL_PASSWORD -e "show status;" >/dev/null 2>&1
if [ $? = 0 ] ;then
MYSQL_OK=1
else
MYSQL_OK=0
fi
return $MYSQL_OK
}
#echo "$MYSQL_OK $CHECK_TIME"
while [ $CHECK_TIME -ne 0 ]
do
let "CHECK_TIME -= 1"
check_mysql_helth
if [ $MYSQL_OK = 1 ] ; then
echo "#### Mysql is alive ####"
CHECK_TIME=0
exit 0
fi
if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
then
#pkill keepalived
echo "#### Mysql Down ####"
echo "####now kill keepalive#####"
/etc/init.d/keepalived stop
exit 1
fi
sleep 1
done
-
-
[root@MYSQL keepalived]# vi /root/check_mysql.sh
-
[root@node1 ~]# vi checkmysql.sh
#!/bin/bash
MYSQL=/usr/bin/mysql
MYSQL_HOST=127.0.0.1
MYSQL_USER=root
MYSQL_PASSWORD=ESBecs00
CHECK_TIME=3
MYSQL_OK=1
function check_mysql_helth (){
$MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p$MYSQL_PASSWORD -e "show status;" >/dev/null 2>&1
if [ $? = 0 ] ;then
MYSQL_OK=1
else
MYSQL_OK=0
fi
return $MYSQL_OK
}
#echo "$MYSQL_OK $CHECK_TIME"
while [ $CHECK_TIME -ne 0 ]
do
let "CHECK_TIME -= 1"
check_mysql_helth
if [ $MYSQL_OK = 1 ] ; then
echo "#### Mysql is alive ####"
CHECK_TIME=0
exit 0
fi
if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
then
#pkill keepalived
echo "#### Mysql Down ####"
echo "####now kill keepalive#####"
/etc/init.d/keepalived stop
exit 1
fi
sleep 1
done
該指令碼做一個計劃任務每分鐘做一次檢查 MYSQL 服務如果掛掉了的話 VIP 就會在SALVE1.COM 上生效 這樣的話 client 端連線的 VIP 就會從 MYSQL.COM 上變到了SLAVE1.COM 上 此時 SLAVE1.COM 由於 MHA 的生效 已經變為主對外提供服務了 VIP 也在 SALVE1.COM 上 所以從頭到尾 客戶端只需要連線 VIP 就可以了
[root@MYSQL ~]# crontab -l */1 * * * * bash /root/checkmysql.sh
以上步驟再在下一次切換的 SLAVE 上做一次也就是 SLAVE1.COM 上做一次 keepalived.conf 配置檔案不同的地方已做註釋
4. 啟動測試
兩邊啟動 keepalived
[root@MYSQL keepalived]# keepalived -f /etc/keepalived/keepalived.conf
[root@MYSQL keepalived]# ps -ef |grep keep
root 3230 1 0 23:27 ? 00:00:00 keepalived -f /etc/keepalived/keepalived.conf root 3231 3230 0 23:27 ? 00:00:00 keepalived -f /etc/keepalived/keepalived.conf root 3232 3230 0 23:27 ? 00:00:00 keepalived -f /etc/keepalived/keepalived.conf root 3234 2538 0 23:27 pts/0 00:00:00 grep keep [root@MYSQL keepalived]# ip a 1: lo: mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:c9:85:ba brd ff:ff:ff:ff:ff:ff inet 192.168.186.141/24 brd 192.168.186.255 scope global eth0 inet 192.168.200.16/32 scope global eth0 inet6 fe80::20c:29ff:fec9:85ba/64 scope link valid_lft forever preferred_lft forever
測試關閉該主機的 MYSQL 服務看看 VIP 是否飄到配好 keepalive 的 SLAVE1.COM上 MHA 中的 MYSQL 主是不是也變成了 SLAVE1.COM
如是 至此全部完成。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。謝謝!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
一點注意事項:MHA時基於傳統的log + postion開發的,維護GTID肯定不靈光了,但是mha0.56已經開始支援gtid了,可以去作者的部落格看到原文地址:
參考文章:http://ylw6006.blog.51cto.com/470441/890360/
附上node與manager的下載地址
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29096438/viewspace-1818461/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- MySQL高可用之MHA切換測試(switchover & failover)MySqlAI
- 【DB寶19】MySQL高可用之MHA功能測試MySql
- 【DB寶19】在Docker中使用MySQL高可用之MHADockerMySql
- Mysql 5.7 MHA 高可用MySql
- Mysql MHA部署-04MHA配置MySql
- 基於 MHA 高可用的 MySQLMySql
- mysql高可用架構MHA搭建MySql架構
- MySQL MHA部署 Part 5 MHA部署指南MySql
- Mysql MHA部署-03MHA軟體安裝MySql
- MySQL 實現高可用架構之 MHAMySql架構
- MySQL高可用架構-MMM、MHA、MGR、PXCMySql架構
- Mysql 高可用(MHA)-讀寫分離(Atlas)MySql
- MySQL MHA部署 Part 6 MHA故障轉移測試MySql
- MHA+MySQL主從配置實現MySQL高可用MySql
- MHA高可用+VIP漂移
- MySQL MHA部署 Part 7 MHA手動切換測試MySql
- mysql5.7MHA配置MySql
- MySQL MHA部署實戰MySql
- mysql高可用架構MHA搭建(centos7+mysql5.7.28)MySql架構CentOS
- MySQL高可用架構之MHA 原理與實踐MySql架構
- MySQL——MHA高可用群集部署及故障測試MySql
- MySQL高可用群集MHA部署及故障測試分析MySql
- MySQL 高可用架構 - MHA環境部署記錄MySql架構
- 構建MHA實現MySQL高可用叢集架構MySql架構
- MySQL MHA詳細搭建過程MySql
- 【MySQL】Consul+MHA高考用方案MySql
- Mysql MHA部署-05故障轉移MySql
- MySQL MHA部署與測試-下篇MySql
- MHA高可用配置及故障切換
- 【MySQL】MHA的基本配置及註釋MySql
- Mysql MHA部署-02主從複製MySql
- Mysql MHA部署-06手動切換MySql
- Mysql MHA部署-07常見問題MySql
- MySQL MHA 執行狀態監控MySql
- MySQL MHA資訊的收集【Filebeat+logstash+MySQL】MySql
- MySQL高可用方案MHA線上切換的步驟及原理MySql
- MHA高可用架構的實現方式架構
- 部署MHA+keepalived+ProxySQL高可用架構SQL架構
- MySQL MHA工具的優缺點歸納MySql