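Architecture Overview
A few days ago a reader wrote in asking for help with the following setup: there are only two machines, if one goes down the other must take over its services, and while both are healthy, both should be put to use. I designed the architecture described below.
This architecture relies on keepalived for two-node high availability, maintaining one public VIP and one private VIP. Under normal conditions both VIPs are bound to server1. Web requests go to nginx on server1. nginx serves static resources directly from the local machine and load-balances dynamic PHP requests across server1 and server2. SQL requests are sent to Atlas, a MySQL middleware. Atlas forwards requests that involve writes to the private VIP and sends reads to the MySQL slave, which gives us read/write splitting.
When the primary server, server1, goes down, keepalived detects the failure, immediately binds the public and private VIPs to server2, and promotes MySQL on server2 to master. Because the public VIP has moved to server2, web requests now go to nginx on server2. nginx detects that server1 is down and stops forwarding requests to php-fpm on server1. SQL requests are still sent to the local Atlas, which sends writes to the private VIP; the slave backend is removed during failover, so reads go to the VIP as well. Since the private VIP is now bound to server2, MySQL on server2 handles both writes and reads.
When server1 recovers, its MySQL is automatically configured as a slave and syncs from the MySQL master on server2. keepalived does not take the VIPs back from server2, and service continues normally.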
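Architecture Requirements
Three conditions must be met to implement this architecture:
- 1. The servers can be assigned private IPs, and those private IPs can reach each other;
- 2. The servers can freely bind any public IP the IDC has allocated to us, i.e. the public IPs are not tied to a MAC address;
- 3. The MySQL servers support GTID, i.e. MySQL 5.6.5 or later.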
Environment
server1
- eth0: 10.96.153.110 (public-facing IP)
- eth1: 192.168.1.100 (private IP)
server2
- eth0: 10.96.153.114 (public-facing IP)
- eth1: 192.168.1.101 (private IP)
Both systems run CentOS 6.
Public VIP: 10.96.153.239
Private VIP: 192.168.1.150
hosts configuration
/etc/hosts:
192.168.1.100 server1
192.168.1.101 server2
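Installing Nginx, PHP, MySQL, and Memcached
EZHTTP is the recommended way to install these packages.
Solving the session sharing problem
By default PHP stores sessions in the /tmp directory. Since we now load-balance PHP requests across two servers, sessions end up scattered across the /tmp directories of both machines, which breaks any feature that depends on sessions. We can use memcached to solve this.
memcached was installed in the previous step, so all that remains is to configure php.ini to use it. Open php.ini and set the following two directives:
    session.save_handler = memcache
    session.save_path = "tcp://192.168.1.100:11211,tcp://192.168.1.101:11211"
Then restart php-fpm for the change to take effect.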
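Since both servers must end up with identical session settings, it can help to read the values back from each php.ini after editing. The following is a minimal sketch, not part of the original setup: the helper name ini_value is hypothetical, and it only handles simple "key = value" lines with ";" comments.

```shell
#!/bin/bash
# Hypothetical helper (not from the original article): print the effective
# value of a php.ini directive. The last uncommented assignment wins.
ini_value() {
    local key="$1" file="$2"
    awk -F'=' -v key="$key" '
        /^[ \t]*;/ { next }                      # skip comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)              # strip whitespace around the key
            if (name == key) {
                value = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim the value
                gsub(/^"|"$/, "", value)             # drop surrounding quotes
                found = value
            }
        }
        END { print found }
    ' "$file"
}

# Example (the php.ini path is an assumption; adjust to your install):
#   ini_value session.save_handler /usr/local/php/etc/php.ini
#   ini_value session.save_path    /usr/local/php/etc/php.ini
```

Running the two example calls on server1 and server2 should print the same handler and the same memcached server list on both machines.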
Nginx configuration
- Server1 configuration
    http {
        [...]
        upstream php-server {
            server 192.168.1.101:9000;
            server 127.0.0.1:9000;
            keepalive 100;
        }
        [...]
        server {
            [...]
            location ~ \.php$ {
                fastcgi_pass php-server;
                fastcgi_index index.php;
                fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                include fastcgi_params;
            }
            [...]
        }
        [...]
    }
- Server2 configuration
    http {
        [...]
        upstream php-server {
            server 192.168.1.100:9000;
            server 127.0.0.1:9000;
            keepalive 100;
        }
        [...]
        server {
            [...]
            location ~ \.php$ {
                fastcgi_pass php-server;
                fastcgi_index index.php;
                fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                include fastcgi_params;
            }
            [...]
        }
        [...]
    }
The main purpose of these two configurations is to load-balance PHP requests: each server's upstream lists its own php-fpm and the php-fpm on the other server.
MySQL configuration
Installing MySQL Utilities
We need the replication tools in MySQL Utilities to perform master/slave switchover.
    cd /tmp
    wget http://dev.mysql.com/get/Downloads/MySQLGUITools/mysql-utilities-1.5.3.tar.gz
    tar xzf mysql-utilities-1.5.3.tar.gz
    cd mysql-utilities-1.5.3
    python setup.py build
    python setup.py install
MySQL my.cnf configuration
server1:
    [mysql]
    [...]
    protocol=tcp
    [...]
    [...]
    [mysqld]
    [...]
    # BINARY LOGGING #
    log-bin = /usr/local/mysql/data/mysql-bin
    expire-logs-days = 14
    binlog-format= row
    log-slave-updates=true
    gtid-mode=on
    enforce-gtid-consistency =true
    master-info-repository=TABLE
    relay-log-info-repository=TABLE
    server-id=1
    report-host=server1
    report-port=3306
    [...]
server2:
    [mysql]
    [...]
    protocol=tcp
    [...]
    [mysqld]
    [...]
    # BINARY LOGGING #
    log-bin = /usr/local/mysql/data/mysql-bin
    expire-logs-days = 14
    binlog-format= row
    log-slave-updates=true
    gtid-mode=on
    enforce-gtid-consistency =true
    master-info-repository=TABLE
    relay-log-info-repository=TABLE
    server-id=2
    report-host=server2
    report-port=3306
    [...]
These two configurations mainly set up the binlog and enable gtid-mode. Each server needs a different server-id and report-host.
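A duplicated server-id is an easy mistake when one my.cnf is copied to the other machine. The sketch below compares the two files before replication is set up. The function names are hypothetical, and it assumes plain "server-id=N" (or "server_id=N") lines like the ones shown above.

```shell
#!/bin/bash
# Hypothetical helpers (not from the original article).

# Print the last server-id (or server_id) value set in a my.cnf-style file.
server_id_of() {
    sed -n 's/^[[:space:]]*server[-_]id[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Succeed only if both files define a server-id and the two values differ.
server_ids_differ() {
    local a b
    a=$(server_id_of "$1")
    b=$(server_id_of "$2")
    [[ -n "$a" && -n "$b" && "$a" != "$b" ]]
}

# Example, with copies of both files on one machine (paths are assumptions):
#   server_ids_differ /tmp/server1-my.cnf /tmp/server2-my.cnf || echo "fix server-id first"
```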
Granting remote access to the root account
The root account needs remote access privileges on both MySQL servers.
    mysql> grant all on *.* to 'root'@'192.168.1.%' identified by 'Xp29at5F37' with grant option;
    mysql> grant all on *.* to 'root'@'server1' identified by 'Xp29at5F37' with grant option;
    mysql> grant all on *.* to 'root'@'server2' identified by 'Xp29at5F37' with grant option;
    mysql> flush privileges;
Setting up MySQL replication
Run the following command on either server:
    mysqlreplicate --master=root:Xp29at5F37@server1:3306 --slave=root:Xp29at5F37@server2:3306 --rpl-user=rpl:o67DhtaW
# master on server1: … connected.
# slave on server2: … connected.
# Checking for binary logging on master…
# Setting up replication…
# …done.
Showing the replication topology
    mysqlrplshow --master=root:Xp29at5F37@server1 --discover-slaves-login=root:Xp29at5F37
# master on server1: … connected.
# Finding slaves for master: server1:3306
# Replication Topology Graph
server1:3306 (MASTER)
   |
   +--- server2:3306 - (SLAVE)
Checking replication status
    mysqlrplcheck --master=root:Xp29at5F37@server1 --slave=root:Xp29at5F37@server2
# master on server1: … connected.
# slave on server2: … connected.
Test Description                                Status
------------------------------------------------------
Checking for binary logging on master           [pass]
Are there binlog exceptions?                    [pass]
Replication user exists?                        [pass]
Checking server_id values                       [pass]
Checking server_uuid values                     [pass]
Is slave connected to master?                   [pass]
Check master information file                   [pass]
Checking InnoDB compatibility                   [pass]
Checking storage engines compatibility          [pass]
Checking lower_case_table_names settings        [pass]
Checking slave delay (seconds behind master)    [pass]
# …done.
Keepalived configuration
Installing keepalived (on both servers)
- yum -y install keepalived
- chkconfig keepalived on
keepalived configuration (server1)
- vi /etc/keepalived/keepalived.conf
    vrrp_sync_group VG_1 {
        group {
            inside_network
            outside_network
        }
    }

    vrrp_instance inside_network {
        state BACKUP
        interface eth1
        virtual_router_id 51
        priority 101
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 3489
        }
        virtual_ipaddress {
            192.168.1.150/24
        }
        nopreempt
        notify /data/sh/mysqlfailover-server1.sh
    }

    vrrp_instance outside_network {
        state BACKUP
        interface eth0
        virtual_router_id 50
        priority 101
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 3489
        }
        virtual_ipaddress {
            10.96.153.239/24
        }
        nopreempt
    }
keepalived configuration (server2)
    vrrp_sync_group VG_1 {
        group {
            inside_network
            outside_network
        }
    }

    vrrp_instance inside_network {
        state BACKUP
        interface eth1
        virtual_router_id 51
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 3489
        }
        virtual_ipaddress {
            192.168.1.150
        }
        notify /data/sh/mysqlfailover-server2.sh
    }

    vrrp_instance outside_network {
        state BACKUP
        interface eth0
        virtual_router_id 50
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 3489
        }
        virtual_ipaddress {
            10.96.153.239/24
        }
    }
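Points to note about this keepalived configuration:
- 1. state is set to BACKUP on both servers, server1 adds nopreempt, and server1 has a higher priority than server2. This ensures that server1 does not take the VIPs back when it recovers from an outage;
- 2. server1 sets notify /data/sh/mysqlfailover-server1.sh and server2 sets notify /data/sh/mysqlfailover-server2.sh. These scripts switch the MySQL master/slave roles automatically.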
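keepalived calls a notify script with three positional arguments: the object type ("GROUP" or "INSTANCE"), the name of the group or instance, and the state being entered ("MASTER", "BACKUP" or "FAULT"). This is why the failover scripts read the state from $3. The following sketch only illustrates that calling convention; the function name and the message wording are made up for this example.

```shell
#!/bin/bash
# Illustrative sketch of a keepalived notify handler (hypothetical helper).
# keepalived passes: $1 = GROUP|INSTANCE, $2 = name, $3 = new state.
describe_transition() {
    local type="$1" name="$2" state="$3"
    case "$state" in
        MASTER) echo "$type $name: entered MASTER, this node now holds the VIP" ;;
        BACKUP) echo "$type $name: entered BACKUP, this node does not hold the VIP" ;;
        FAULT)  echo "$type $name: entered FAULT" ;;
        *)      echo "$type $name: unknown state '$state'" ;;
    esac
}

# In a real notify script the arguments would simply be forwarded, e.g.:
#   describe_transition "$@" | logger -t keepalived-notify
```

Logging each transition this way makes it easier to see, after the fact, when and why a failover script was triggered.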
Contents of /data/sh/mysqlfailover-server1.sh:
    #!/bin/bash

    # Give keepalived and MySQL a moment to settle before acting.
    sleep 10
    # keepalived passes the new VRRP state as the third argument.
    state=$3
    # An empty "show slave status" means this MySQL is currently a master.
    result=`mysql -h127.0.0.1 -P3306 -uroot -pXp29at5F37 -e 'show slave status;'`
    [[ "$result" == "" ]] && mysqlState="master" || mysqlState="slave"

    if [[ "$state" == "MASTER" ]];then
        # This node took the VIP but is still a slave: promote it.
        if [[ "$mysqlState" == "slave" ]];then
            mysqlrpladmin --slave=root:Xp29at5F37@server1:3306 failover
        fi

    elif [[ "$state" == "BACKUP" ]];then
        # This node lost the VIP but is still a master: make it a slave of server2.
        if [[ "$mysqlState" == "master" ]];then
            mysqlreplicate --master=root:Xp29at5F37@server2:3306 --slave=root:Xp29at5F37@server1:3306 --rpl-user=rpl:o67DhtaW
        fi
    fi

    # Point Atlas reads at the VIP and drop the slave backend until it has caught up.
    sed -i 's/proxy-read-only-backend-addresses.*/proxy-read-only-backend-addresses = 192.168.1.150:3306/' /usr/local/mysql-proxy/conf/my.cnf
    mysql -h127.0.0.1 -P2345 -uuser -ppwd -e "REMOVE BACKEND 2;"
Contents of /data/sh/mysqlfailover-server2.sh:
    #!/bin/bash

    # Give keepalived and MySQL a moment to settle before acting.
    sleep 10
    # keepalived passes the new VRRP state as the third argument.
    state=$3
    # An empty "show slave status" means this MySQL is currently a master.
    result=`mysql -h127.0.0.1 -P3306 -uroot -pXp29at5F37 -e 'show slave status;'`
    [[ "$result" == "" ]] && mysqlState="master" || mysqlState="slave"

    if [[ "$state" == "MASTER" ]];then
        # This node took the VIP but is still a slave: promote it.
        if [[ "$mysqlState" == "slave" ]];then
            mysqlrpladmin --slave=root:Xp29at5F37@server2:3306 failover
        fi

    elif [[ "$state" == "BACKUP" ]];then
        # This node lost the VIP but is still a master: make it a slave of server1.
        if [[ "$mysqlState" == "master" ]];then
            mysqlreplicate --master=root:Xp29at5F37@server1:3306 --slave=root:Xp29at5F37@server2:3306 --rpl-user=rpl:o67DhtaW
        fi
    fi

    # Point Atlas reads at the VIP and drop the slave backend until it has caught up.
    sed -i 's/proxy-read-only-backend-addresses.*/proxy-read-only-backend-addresses = 192.168.1.150:3306/' /usr/local/mysql-proxy/conf/my.cnf
    mysql -h127.0.0.1 -P2345 -uuser -ppwd -e "REMOVE BACKEND 2;"
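The core of both failover scripts is a small decision table: the new VRRP state combined with the current MySQL role determines whether anything has to change. The sketch below isolates that decision as a pure function so the four combinations can be checked without touching MySQL. The function name and the "promote" / "demote" / "none" labels are made up for illustration; in the real scripts these map to the mysqlrpladmin failover call, the mysqlreplicate call, and doing nothing.

```shell
#!/bin/bash
# Illustrative decision table for the failover scripts (hypothetical helper).
#   $1 = VRRP state reported by keepalived (MASTER or BACKUP)
#   $2 = current MySQL role (master or slave)
decide_action() {
    local vrrp_state="$1" mysql_role="$2"
    if [[ "$vrrp_state" == "MASTER" && "$mysql_role" == "slave" ]]; then
        echo "promote"   # holds the VIP but is still a slave
    elif [[ "$vrrp_state" == "BACKUP" && "$mysql_role" == "master" ]]; then
        echo "demote"    # lost the VIP but is still a master
    else
        echo "none"      # role already matches the VRRP state
    fi
}

for s in MASTER BACKUP; do
    for r in master slave; do
        echo "$s + $r -> $(decide_action "$s" "$r")"
    done
done
```

Note that the Atlas cleanup at the end of the real scripts (the sed and REMOVE BACKEND lines) runs on every notification, regardless of which branch was taken.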
Atlas setup
Installing Atlas
Download the latest version from https://github.com/Qihoo360/Atlas/releases
    cd /tmp
    wget https://github.com/Qihoo360/Atlas/releases/download/2.2.1/Atlas-2.2.1.el6.x86_64.rpm
    rpm -i Atlas-2.2.1.el6.x86_64.rpm
Configuring Atlas
    cd /usr/local/mysql-proxy/conf
    cp test.cnf my.cnf
    vi my.cnf
Adjust the following parameters:
    proxy-backend-addresses = 192.168.1.150:3306
    proxy-read-only-backend-addresses = 192.168.1.101:3306
    pwds = root:qtyU1btXOo074Itvx0UR9Q==
    event-threads = 8
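Notes:
proxy-backend-addresses is set to the private VIP.
proxy-read-only-backend-addresses is set to server2's IP.
root:qtyU1btXOo074Itvx0UR9Q== sets the database user and password. The encrypted password is generated with /usr/local/mysql-proxy/bin/encrypt Xp29at5F37.
For a more detailed explanation of the parameters, see the Atlas configuration guide.
Starting Atlas
- /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/my.cnf
After that, point the application's MySQL configuration at 127.0.0.1:1234.
Deploying the Atlas auto-maintenance script
Deploy this script on both machines and add a scheduled job for it (for example, every 2 minutes). We put the script at /data/sh/auto_maintain_atlas.sh, with the following contents: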
    #!/bin/bash

    # Number of backends Atlas currently knows about (queried via the admin port).
    count=`mysql -N -h127.0.0.1 -P2345 -uuser -ppwd -e "select * from backends;" | wc -l`

    # Only one backend means the slave backend was removed during a failover.
    if [[ "$count" == "1" ]];then
        # Find out which server is currently the slave.
        result=`mysql -hserver1 -P3306 -uroot -pXp29at5F37 -e 'show slave status\G'`
        if echo "$result" | grep Slave_IO_State;then
            slaveIP=192.168.1.100
        else
            result=`mysql -hserver2 -P3306 -uroot -pXp29at5F37 -e 'show slave status\G'`
            slaveIP=192.168.1.101
        fi

        slaveIORunning=`echo "$result" | awk -F':' '/Slave_IO_Running:/{print $2}'`
        slaveSQLRunning=`echo "$result" | awk -F':' '/Slave_SQL_Running:/{print $2}'`
        SlaveSQLRunning_State=`echo "$result" | awk -F':' '/Slave_SQL_Running_State:/{print $2}'`

        # Re-add the slave only once both threads run and the relay log is fully applied.
        if [[ "$slaveIORunning" =~ "Yes" && "$slaveSQLRunning" =~ "Yes" && "$SlaveSQLRunning_State" =~ "Slave has read all relay log" ]];then
            mysql -h127.0.0.1 -P2345 -uuser -ppwd -e "add slave ${slaveIP}:3306;"
        fi
    fi
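Why is this script needed? Suppose the MySQL master is currently on server1. When server1 goes down, server2 takes over the VIPs, removes the slave backend configured in Atlas, and promotes its MySQL to master. Some time later server1 recovers. Its MySQL automatically switches to slave, its notify script likewise removes the slave backend from its Atlas, and it starts syncing data from the MySQL master on server2. At this point read/write splitting no longer exists: all SQL is sent to MySQL on server2. This is where auto_maintain_atlas.sh comes in. It periodically checks whether the slave has finished catching up with the master and, if so, adds the slave backend again. Read/write splitting is restored with no manual intervention.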
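The "has the slave caught up" test is the part of the maintenance script most worth checking in isolation. The sketch below expresses the same three conditions as a function that takes the text of `show slave status\G` as an argument, so it can be tried against captured output. The function name is hypothetical, and unlike the original script it compares the two Running fields exactly rather than by substring.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given "show slave status\G" text shows
# both replication threads running and the relay log fully applied.
slave_caught_up() {
    local status="$1" io sql sql_state
    io=$(printf '%s\n' "$status" | awk -F': *' '/Slave_IO_Running:/ {print $2}')
    sql=$(printf '%s\n' "$status" | awk -F': *' '/Slave_SQL_Running:/ {print $2}')
    sql_state=$(printf '%s\n' "$status" | awk -F': *' '/Slave_SQL_Running_State:/ {print $2}')
    [[ "$io" == "Yes" && "$sql" == "Yes" && "$sql_state" == "Slave has read all relay log"* ]]
}

# Example with the credentials used in this article:
#   status=$(mysql -hserver1 -P3306 -uroot -pXp29at5F37 -e 'show slave status\G')
#   slave_caught_up "$status" && echo "safe to re-add the slave backend"
```

For the scheduled job mentioned above, a crontab entry along the lines of `*/2 * * * * /data/sh/auto_maintain_atlas.sh` would run the script every 2 minutes.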
Testing failure of the master, server1
Testing whether keepalived works
Let's simulate server1 going down.
Run the shutdown command on server1.
Then log in to server2 and run ip addr. The output is:
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:81:9d:42 brd ff:ff:ff:ff:ff:ff
inet 10.96.153.114/24 brd 10.96.153.255 scope global eth0
inet 10.96.153.239/24 scope global secondary eth0
inet6 fe80::20c:29ff:fe81:9d42/64 scope link
valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:81:9d:4c brd ff:ff:ff:ff:ff:ff
inet 192.168.1.101/24 brd 192.168.1.255 scope global eth1
inet 192.168.1.150/32 scope global eth1
inet6 fe80::20c:29ff:fe81:9d4c/64 scope link
valid_lft forever preferred_lft forever
We can see that the public VIP 10.96.153.239 and the private VIP 192.168.1.150 have moved to server2, which shows keepalived is working.
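Reading the ip addr output by eye works for a one-off test. For repeated checks, the same test can be scripted. The sketch below looks for an exact `inet <VIP>/<prefix>` entry in ip addr output passed as text; the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given "ip addr" output contains the VIP
# as an IPv4 address, with any prefix length.
holds_vip() {
    local vip="$1" addr_output="$2"
    # Escape the dots so they match literally, then require "/<prefix> " after the address.
    printf '%s\n' "$addr_output" | grep -Eq "inet ${vip//./\\.}/[0-9]+ "
}

# Example on a live machine:
#   holds_vip 10.96.153.239 "$(ip addr show dev eth0)" && echo "public VIP is here"
#   holds_vip 192.168.1.150 "$(ip addr show dev eth1)" && echo "private VIP is here"
```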
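Testing whether master and slave were switched automatically
Log in to MySQL on server2 and run show slave status:
mysql> show slave status\G
Empty set (0.00 sec)
The slave status is now empty, which shows that this server has been promoted to master.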
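Testing whether server1 preempts the VIPs
Why test this? If server1 took the VIPs back after recovering, then, because the Atlas backend is set to the VIP, SQL writes would go to MySQL on server1 as soon as it started. But server1's data is older than server2's, so this would cause data inconsistency. That makes this a very important test.
Start server1, then run ip addr. The output is: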
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:f1:4f:4e brd ff:ff:ff:ff:ff:ff
inet 10.96.153.110/24 brd 10.96.153.255 scope global eth0
inet6 fe80::20c:29ff:fef1:4f4e/64 scope link
valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:f1:4f:58 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.100/24 brd 192.168.1.255 scope global eth1
inet6 fe80::20c:29ff:fef1:4f58/64 scope link
valid_lft forever preferred_lft forever
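We can see that server1 did not take the VIPs back, so the test passes. Frustratingly, though, this test did not succeed in a virtual machine environment, and I don't know why.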
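Testing whether Atlas on server2 has removed the slave backend
We run this test to make sure Atlas no longer has a slave backend, that is, no slave is configured. Otherwise, when server1 recovers, read requests might be sent to MySQL on server1 and return stale data.
[root@server1 ~]# mysql -h127.0.0.1 -P2345 -uuser -ppwd
mysql> select * from backends;
+-------------+--------------------+-------+------+
| backend_ndx | address            | state | type |
+-------------+--------------------+-------+------+
|           1 | 192.168.1.150:3306 | up    | rw   |
+-------------+--------------------+-------+------+
1 rows in set (0.00 sec)
If only one backend is listed, everything is working.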
Testing whether MySQL on server1 is configured as a slave
After server1 recovers, log in to MySQL on server1 and run show slave status:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Opening tables
Master_Host: server1
Master_User: rpl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000015
Read_Master_Log_Pos: 48405991
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 361
Relay_Master_Log_File: mysql-bin.000015
Slave_IO_Running: Yes
Slave_SQL_Running: yes
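Testing whether read/write splitting is restored automatically
Some time after server1 recovers, we can check whether read/write splitting has been restored.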
[root@server1 ~]# mysql -h127.0.0.1 -P2345 -uuser -ppwd
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.0.99-agent-admin
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select * from backends;
+-------------+--------------------+-------+------+
| backend_ndx | address            | state | type |
+-------------+--------------------+-------+------+
|           1 | 192.168.1.150:3306 | up    | rw   |
|           2 | 192.168.1.100:3306 | up    | ro   |
+-------------+--------------------+-------+------+
2 rows in set (0.00 sec)
We can see that server1 has been added as a slave backend, which means read/write splitting has been restored.
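The same check can be done non-interactively by counting backends of a given type. The sketch below assumes the admin interface returns the four columns shown above (backend_ndx, address, state, type) and that the mysql client is run with -N and its output piped, in which case rows are tab-separated. The function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count Atlas backends of a given type ("rw" or "ro")
# from tab-separated "select * from backends" rows read on stdin.
count_backends() {
    local type="$1"
    awk -F'\t' -v t="$type" '$4 == t { n++ } END { print n + 0 }'
}

# Example against the Atlas admin port used in this article:
#   mysql -N -h127.0.0.1 -P2345 -uuser -ppwd -e "select * from backends;" | count_backends ro
```

A result of 1 for "ro" indicates the slave backend is present again. A result of 0 means all reads are still going to the master.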