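For a test environment, I needed to build a highly available LAMP stack on Azure, with the requirement that Atlas, the MySQL middleware, run in active/standby mode. In a data center this is usually done with Keepalived plus a VIP, serving clients through a floating address.
The cloud environment, however, supports neither floating addresses nor active/standby (A/S) load balancing. So I went with ILB + HAProxy: the ILB emulates the VIP, and HAProxy handles the A/S load balancing.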
1 Overall architecture
The design is fully redundant: every tier has two machines, and all hosts run CentOS 6.5.
2 Installing and configuring MySQL
MySQL is set up in master-master mode. Run the following configuration and commands on both machines:
yum install -y mysql-server
chkconfig mysqld on; service mysqld start
iptables -F
setenforce 0
service iptables save
Change the root password:
/usr/bin/mysqladmin -u root password "newpass"
Log in as root, using the password set in the previous step:
mysql -h127.0.0.1 -uroot -pnewpass
Create the database:
create database mytable;
Create the user; both machines get the same user:
GRANT ALL ON php.* to 'user'@'%' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
Try creating a table and inserting data, with different content on each server:
use mytable;
create table mytest(name varchar(20), phone char(14));
insert into mytest(name, phone) values('wang', '11111111111');
select * from mytest;
Configure master-master:
Edit /etc/my.cnf on each host:
Host 1 | Host 2
server-id = 1 | server-id = 2
log_bin=mysqlbinlog | log_bin=mysqlbinlog
log_bin_index=mysqlbinlog-index | log_bin_index=mysqlbinlog-index
log_slave_updates=1 | log_slave_updates=1
relay_log=relay-log | relay_log=relay-log
replicate_do_db=test | replicate_do_db=test
binlog-do-db = test | binlog-do-db = test
binlog-ignore-db=mysql | binlog-ignore-db=mysql
log-slave-updates | log-slave-updates
sync_binlog=1 | sync_binlog=1
auto_increment_offset=1 | auto_increment_offset=2
auto_increment_increment=2 | auto_increment_increment=2
replicate-ignore-db= mysql | replicate-ignore-db= mysql
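The only values above that differ between the two hosts are server-id and auto_increment_offset. Together with auto_increment_increment=2, the offset is what keeps the two masters from handing out the same AUTO_INCREMENT key. As a rough illustration, the sketch below only mimics the arithmetic and does not talk to MySQL; auto_ids is a made-up helper name.

```shell
# auto_ids OFFSET INCREMENT COUNT
# Print the first COUNT auto-increment values a host would generate.
# Hypothetical helper for illustration only; it does not query MySQL.
auto_ids() {
    offset=$1; increment=$2; count=$3
    i=0; out=""
    while [ "$i" -lt "$count" ]; do
        out="$out$((offset + i * increment)) "
        i=$((i + 1))
    done
    echo "${out% }"
}

auto_ids 1 2 5   # host 1 (offset 1): 1 3 5 7 9
auto_ids 2 2 5   # host 2 (offset 2): 2 4 6 8 10
```

Host 1 only ever produces odd IDs and host 2 only even ones, so inserts on both masters never collide on the key.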
After configuring, restart MySQL: service mysqld restart
Check on both hosts:
show master status;
File | Position | Binlog_Do_DB | Binlog_Ignore_DB
mysqlbinlog.000001 | 325 | test | mysql
show slave status\G
At this point Slave_IO_Running and Slave_SQL_Running are both No:
mysql> show slave status\G
***************** 1. row *****************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.4.5
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysqlbinlog.000001
Read_Master_Log_Pos: xxxx
Relay_Log_File: relay-log.0000xx
Relay_Log_Pos: 253
Relay_Master_Log_File: mysqlbinlog.000001
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB: test
Replicate_Ignore_DB: mysql
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: xxxx
Relay_Log_Space: xxx
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
Create the replication user on both machines:
GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY 'password';
Run the following commands to set up master-master.
On 172.16.4.4:
stop slave;
CHANGE MASTER TO MASTER_HOST='172.16.4.5', MASTER_USER='slave', MASTER_PASSWORD='password', MASTER_LOG_FILE='mysqlbinlog.000001', MASTER_LOG_POS=325;
start slave;
On 172.16.4.5:
stop slave;
CHANGE MASTER TO MASTER_HOST='172.16.4.4', MASTER_USER='slave', MASTER_PASSWORD='password', MASTER_LOG_FILE='mysqlbinlog.000001', MASTER_LOG_POS=325;
start slave;
Now show slave status\G reports Slave_IO_Running and Slave_SQL_Running as Yes and Yes, which means master-master replication is working.
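Checking those two fields by eye gets tedious, so here is a small sketch that reads the output of show slave status\G on stdin and succeeds only when both threads report Yes. The function name replication_ok is my own invention, and the sample input in the last line stands in for real mysql output.

```shell
# replication_ok: read "show slave status\G" output on stdin.
# Exit status 0 only if Slave_IO_Running and Slave_SQL_Running are both Yes.
# Hypothetical helper; the field names are the ones shown in the output above.
replication_ok() {
    awk -F': *' '
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Demo with canned input instead of a live server
printf 'Slave_IO_Running: Yes\nSlave_SQL_Running: Yes\n' | replication_ok && echo "replication healthy"
```

Against a live server this would be used as: mysql -uroot -p -e 'show slave status\G' | replication_ok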
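3 Installing Atlas
The installation and configuration are identical on both machines.
Download Atlas from GitHub:
https://github.com/Qihoo360/Atlas/releases
Pick the build that matches your system. My machines run CentOS 6.5, so I chose Atlas-2.2.1.el6.x86_64.rpm.
wget https://github.com/Qihoo360/Atlas/releases/download/2.2.1/Atlas-2.2.1.el6.x86_64.rpm
The file turns out to be hosted on AWS S3.
Install: rpm -ivh Atlas-2.2.1.el6.x86_64.rpm
Edit the configuration file: /usr/local/mysql-proxy/conf/test.cnf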
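#Entries prefixed with # are optional settings
#Username for the admin interface
admin-username = user
#Password for the admin interface
admin-password = pwd
#IP and port of the MySQL master that Atlas connects to; multiple entries may be given, separated by commas
proxy-backend-addresses = 172.16.4.4:3306
#IP and port of the MySQL slave that Atlas connects to; the number after @ is the weight used for load balancing and defaults to 1 if omitted; multiple entries may be given, separated by commas
proxy-read-only-backend-addresses = 172.16.4.5:3306@1
#Usernames and their encrypted MySQL passwords; encrypt passwords with the encrypt program in PREFIX/bin. The entry below is an example; replace it with your own MySQL username and encrypted password!
pwds = slave:euRQ8nFxoVUtoVZBPiOC6Q==
#How Atlas runs: true runs it as a daemon, false runs it in the foreground. Usually false for development and debugging, true in production. No trailing space is allowed after true.
daemon = true
#How Atlas runs: with true, Atlas starts two processes, a monitor and a worker, and the monitor restarts the worker if it exits unexpectedly; with false there is only the worker and no monitor. Usually false for development and debugging, true in production. No trailing space is allowed after true.
keepalive = false
#Number of worker threads; this has a large effect on Atlas performance, so set it to suit your workload
event-threads = 1
#Log level, one of: message, warning, critical, error, debug
log-level = message
#Directory where logs are stored
log-path = /usr/local/mysql-proxy/log
#SQL log switch: OFF, ON or REALTIME. OFF does not record SQL logs, ON records them, REALTIME records them and writes them to disk in real time. Default is OFF
#sql-log = OFF
#Slow log setting. When set, only statements whose execution time exceeds sql-log-slow (in ms) are logged; when unset, all statements are logged.
#sql-log-slow = 10
#Instance name, used to tell apart multiple Atlas instances on the same machine
#instance = test
#IP and port of the working interface Atlas listens on
proxy-address = 0.0.0.0:3306
#IP and port of the admin interface Atlas listens on
admin-address = 0.0.0.0:2345
#Table sharding: in this example person is the database name, mt the table name, id the sharding column and 3 the number of sub-tables; multiple entries may be given, separated by commas; omit if you do not shard tables
#tables = person.mt.id.3
#Default character set; once set, clients no longer need to run SET NAMES
#charset = utf8
#Client IPs allowed to connect to Atlas, either exact IPs or IP segments, separated by commas; if unset, all IPs may connect, otherwise only the listed ones
#client-ips = 127.0.0.1, 192.168.1
#IP of the physical NIC of the LVS in front of Atlas (note: not the virtual IP); required if LVS is used and client-ips is set, otherwise optional
#lvs-ips = 192.168.1.1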
Note that the MySQL password must first be encrypted with the /usr/local/mysql-proxy/bin/encrypt program: ./encrypt password
Create the init script:
vim /etc/init.d/atlas
#!/bin/sh
#
#atlas: Atlas Daemon
#
# chkconfig: - 90 25
# description: Atlas Daemon
#
# Source function library.
start()
{
    echo -n $"Starting atlas: "
    /usr/local/mysql-proxy/bin/mysql-proxyd test start
    echo
}
stop()
{
    echo -n $"Shutting down atlas: "
    /usr/local/mysql-proxy/bin/mysql-proxyd test stop
    echo
}
ATLAS="/usr/local/mysql-proxy/bin/mysql-proxyd"
[ -f "$ATLAS" ] || exit 1
# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        sleep 3
        start
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart}"
        exit 1
esac
exit 0
chmod a+x /etc/init.d/atlas
chkconfig atlas on; service atlas start
Check that the port is being listened on:
netstat -tunlp
Port 3306 in the LISTEN state means Atlas is up and working.
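If you would rather script this check than read the netstat output by hand, something along these lines should do. It is only a sketch: port_listening is a name I made up, and it assumes the usual netstat column layout (local address in column 4, state in column 6).

```shell
# port_listening PORT: read netstat-style output on stdin and succeed if a
# TCP socket is in LISTEN state on PORT. Hypothetical helper; assumes the
# local address is field 4 and the state is field 6.
port_listening() {
    port=$1
    awk -v p=":$port" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            n = length($4) - length(p)
            if (n >= 0 && substr($4, n + 1) == p) found = 1
        }
        END { exit !found }
    '
}

# Demo with one canned netstat line instead of a live system
printf 'tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 1234/mysql-proxy\n' | port_listening 3306 && echo "3306 is listening"
```

On the real host the call would be: netstat -tunlp | port_listening 3306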
4 Installing HAProxy
The configuration is identical on both machines:
yum install haproxy -y
chkconfig haproxy on
Edit the HAProxy configuration file: vim /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode tcp
log global
option dontlognull
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend main *:3306
mode tcp
default_backend nodes
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
#backend static
# balance roundrobin
# server static 127.0.0.1:4331 check
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend nodes
mode tcp
balance roundrobin
server app1 172.16.4.4:3306 check
server app2 172.16.4.5:3306 backup
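The trailing backup marks that server as a standby: HAProxy sends traffic to it only when app1 is unavailable.
Runtime status monitoring (the stats page) can also be configured. I did not set it up here; anyone interested can add it.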
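For anyone who does want the status page, here is a minimal sketch of what adding it could look like. The helper name add_stats_section, the port 8080 and the URI are my assumptions, not part of the setup above. The section sets mode http explicitly because the defaults block uses mode tcp.

```shell
# add_stats_section CFG: append a minimal stats section to an HAProxy
# config file, unless one is already there. Hypothetical helper; the port
# (8080) and URI are assumptions chosen for illustration.
add_stats_section() {
    cfg=$1
    grep -q '^listen stats' "$cfg" 2>/dev/null && return 0
    cat >> "$cfg" <<'EOF'

listen stats
    bind *:8080
    mode http
    stats enable
    stats uri /haproxy?stats
    stats refresh 10s
EOF
}

# Example: add_stats_section /etc/haproxy/haproxy.cfg && service haproxy reload
```

Running it twice leaves a single stats section, so it is safe to put in a provisioning script.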
5 Azure ILB
The Azure ILB can only be configured through PowerShell. The commands are:
Add-AzureInternalLoadBalancer -InternalLoadBalancerName MyHAILB -SubnetName Subnet-2 -ServiceName atlasha01
get-AzureVM -ServiceName atlasha01 -Name atlasha01 | Add-AzureEndpoint -Name mysql -LBSetName mysqlha -Protocol tcp -LocalPort 3306 -PublicPort 3306 -ProbePort 3306 -ProbeProtocol tcp -ProbeIntervalInSeconds 10 -InternalLoadBalancerName MyHAILB | Update-AzureVM
get-AzureVM -ServiceName atlasha01 -Name atlasha02 | Add-AzureEndpoint -Name mysql -LBSetName mysqlha -Protocol tcp -LocalPort 3306 -PublicPort 3306 -ProbePort 3306 -ProbeProtocol tcp -ProbeIntervalInSeconds 10 -InternalLoadBalancerName MyHAILB | Update-AzureVM
PS C:> Get-AzureInternalLoadBalancer -servicename atlasha01
InternalLoadBalancerName : MyHAILB
ServiceName : atlasha01
DeploymentName : atlasha01
SubnetName : Subnet-2
IPAddress : 172.16.2.6
OperationDescription : Get-AzureInternalLoadBalancer
OperationId : cd86e37a-776c-4fc8-8e31-7bad6ebc88b7
OperationStatus : Succeeded
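Both HAProxy hosts have now been added to the ILB's load-balanced set.
The ILB's virtual floating address is now 172.16.2.6
6 Installing the front-end web servers
I installed phpBB3. For the installation steps, see my other blog post: http://www.cnblogs.com/hengwei/p/4754408.html
Note that the phpBB3 installer asks for the IP address of the MySQL service; enter the ILB address here: 172.16.2.6
After installing on one host, copy the site content to the other host:
rsync -a /var/www/html/phpBB3/ 172.16.1.5:/var/www/html/phpBB3/
The target directory must of course be created beforehand, and you need root privileges and the root password.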
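7 Azure SLB
When configuring the Azure hosts, the two web servers must be placed in one SLB. This can be done in the graphical portal, so I won't describe it. Once the SLB is set up, switch its distribution mode to source-IP hash, so that requests from the same client always reach the same web server.
This has to be done through PowerShell:
Set-AzureLoadBalancedEndpoint -LBSetName httpset -LoadBalancerDistribution sourceIP -ServiceName atlasweb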
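To make the idea of source-IP distribution concrete, here is a toy sketch: hash the client address and take it modulo the number of backends. It uses cksum as a stand-in hash, so it is not Azure's actual algorithm, and pick_backend is a made-up name. It only shows why the same client keeps landing on the same web server.

```shell
# pick_backend CLIENT_IP N: map a client IP to a backend index in 0..N-1.
# Toy illustration only: cksum is a stand-in hash, not what Azure uses.
pick_backend() {
    ip=$1; n=$2
    sum=$(printf '%s' "$ip" | cksum | awk '{print $1}')
    echo $((sum % n))
}

# The same client address always yields the same index
pick_backend 203.0.113.7 2
pick_backend 203.0.113.7 2
```

Because the result depends only on the client address and the number of backends, a phpBB3 session stays on one web server as long as the backend set does not change.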
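8 Adding iptables rules and probe scripts for automatic failover
When the primary MySQL server fails, the MySQL service moves to the standby server. But once the primary recovers, HAProxy sends SQL requests back to it. Because the standby's database has been updated while the primary was offline, the primary must first sync from the standby, and may serve clients only after the sync completes.
So add iptables rules to the primary's startup script, rc.local, that allow only the standby to reach the primary MySQL server, and start the mysql service only after the rules are loaded:
iptables -A INPUT -j ACCEPT -s 10.1.0.9/32
iptables -A INPUT -j DROP
sleep 20
service mysqld start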
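The same accept-then-drop pattern shows up again later with several allowed addresses, so it can help to generate the rules from a list. The sketch below only prints the commands rather than running them, which makes it easy to review before pasting into rc.local; print_lockdown_rules is a hypothetical name.

```shell
# print_lockdown_rules IP...: print the iptables commands that accept the
# listed source addresses and drop everything else. Nothing is executed.
# Hypothetical helper for illustration.
print_lockdown_rules() {
    for ip in "$@"; do
        echo "iptables -A INPUT -j ACCEPT -s $ip/32"
    done
    echo "iptables -A INPUT -j DROP"
}

print_lockdown_rules 10.1.0.9
```

The order matters: the DROP rule has to come last, because iptables evaluates the chain top to bottom and stops at the first match.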
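Once the primary is up, it syncs its MySQL content from the standby. During this time the primary must monitor the standby and take over the MySQL service if the standby goes down. The script is as follows: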
#!/bin/sh
# list iptables (numeric, so addresses are not turned into hostnames), result to file fw
/sbin/iptables -L -n > fw
# from file fw, grep key word "10.1.0"
grep 10.1.0 fw > fwRus
# if fw contains 10.1.0, the firewall is loaded and my2 is working; otherwise my1 is working
if [ `wc -l fwRus | awk '{print $1}'` = 0 ]
then
    echo $(TZ=Asia/Shanghai date "+%Y-%m-%d-%H-%M-%S") >> a
    echo "do nothing" >> a
else
    # fw has content, so detect my2's status, up or down
    ping -c 1 my2 > res
    grep ttl res > pingRus
    # if my2 is up, do nothing; if it is down, remove fw and let my1 become active
    if [ `wc -l pingRus | awk '{print $1}'` = 0 ]
    then
        echo "del firewall" >> b
        echo $(TZ=Asia/Shanghai date "+%H-%M-%S-%Y-%m-%d") >> b
        /sbin/iptables -F
    else
        echo "host my2 is online" >> c
        echo $(TZ=Asia/Shanghai date "+%Y-%m-%d-%H-%M-%S") >> c
    fi
fi
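Once the primary and standby MySQL are in sync, an operator must manually remove these iptables rules so that front-end services can reach the primary MySQL through ILB + HAProxy.
After discussing this with the customer, checking the second MySQL server with PING turned out not to be particularly sound. So the check was changed to use the MySQL replication status instead. The script is as follows: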
#!/bin/bash
# Check local replication status; sets Flag=1 when this server is in sync
function mysqlCheck()
{
    time1=`date +"%Y%m%d%H%M%S"`
    time2=`date +"%Y-%m-%d %H:%M:%S"`
    CheckFile="/tmp/MySQL.${time1}"
    Flag=0
    echo "----------Begin at: " $time2 "------------" > $CheckFile 2>&1
    mysql -uroot -pcisco -e "show slave status\G" >> $CheckFile 2>&1
    echo "" >> $CheckFile
    BM=`grep Seconds_Behind_Master $CheckFile | awk '{print $2}'`
    SIOR=`grep Slave_IO_Running $CheckFile | awk '{print $2}'`
    SSQLR=`grep Slave_SQL_Running $CheckFile | awk '{print $2}'`
    [ "$BM" == '0' -a "$SIOR" == 'Yes' -a "$SSQLR" == "Yes" ] && Flag=1 || Flag=0
    return $Flag
}
# list iptables, result to file fw
/sbin/iptables -L > fw
# from file fw, grep key word "DROP"
grep DROP fw > fwRus
# if fw contains a DROP rule, the firewall is loaded and my2 is working; otherwise my1 is working
if [ `wc -l fwRus | awk '{print $1}'` = 0 ]
then
    echo $(TZ=Asia/Shanghai date "+%Y-%m-%d-%H-%M-%S") >> a
    echo "do nothing" >> a
else
    mysqlCheck
    if [ "$Flag" = 1 ]
    then
        # my1 has caught up: fence off my2 and open my1
        echo "add my2 firewall, del my1 firewall" >> $CheckFile 2>&1
        echo $(TZ=Asia/Shanghai date "+%H-%M-%S-%Y-%m-%d") >> $CheckFile 2>&1
        ssh my2 "iptables -A INPUT -j ACCEPT -s 10.1.0.8/32"
        ssh my2 "iptables -A INPUT -j ACCEPT -s 10.1.0.4/32"
        ssh my2 "iptables -A INPUT -j ACCEPT -s 10.1.0.5/32"
        ssh my2 "iptables -A INPUT -j DROP"
        /sbin/iptables -F
        exit
    else
        echo "MySQL is not sync" >> $CheckFile 2>&1
        echo $(TZ=Asia/Shanghai date "+%Y-%m-%d-%H-%M-%S") >> $CheckFile 2>&1
    fi
fi
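Likewise, this script needs to be added to crontab.
The second MySQL server also has to decide, in each situation, whether it needs to take over the MySQL service. Its script is: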
#!/bin/bash
# Check replication status on my1 (queried remotely); sets Flag=1 when my1 is in sync
function mysqlCheck()
{
    time2=`date +"%Y-%m-%d %H:%M:%S"`
    CheckFile="/tmp/MySQL.CheckFile"
    Flag=0
    echo "----------Begin at: " $time2 "------------" >> $CheckFile 2>&1
    echo "" >> $CheckFile
    BM=`mysql -uroot -pcisco -hmy1 -e "show slave status\G" | grep Seconds_Behind_Master | awk '{print $2}'`
    SIOR=`mysql -uroot -pcisco -hmy1 -e "show slave status\G" | grep Slave_IO_Running | awk '{print $2}'`
    SSQLR=`mysql -uroot -pcisco -hmy1 -e "show slave status\G" | grep Slave_SQL_Running | awk '{print $2}'`
    [ "$BM" == '0' -a "$SIOR" == 'Yes' -a "$SSQLR" == "Yes" ] && Flag=1 || Flag=0
    return $Flag
}
# Check replication status on my2 (local); sets my2Flag=1 when my2 is in sync
function my2sqlCheck()
{
    time2=`date +"%Y-%m-%d %H:%M:%S"`
    CheckFile="/tmp/MySQL.CheckFile"
    my2Flag=0
    echo "----------Begin at: " $time2 "------------" >> $CheckFile 2>&1
    echo "" >> $CheckFile
    BM=`mysql -uroot -pcisco -e "show slave status\G" | grep Seconds_Behind_Master | awk '{print $2}'`
    SIOR=`mysql -uroot -pcisco -e "show slave status\G" | grep Slave_IO_Running | awk '{print $2}'`
    SSQLR=`mysql -uroot -pcisco -e "show slave status\G" | grep Slave_SQL_Running | awk '{print $2}'`
    [ "$BM" == '0' -a "$SIOR" == 'Yes' -a "$SSQLR" == "Yes" ] && my2Flag=1 || my2Flag=0
    return $my2Flag
}
while true
do
    my1fw=`ssh my1 "iptables -L" | grep DROP | wc -l`
    my2fw=`iptables -L | grep DROP | wc -l`
    mysqlCheck
    if [ "$my1fw" = '0' -a "$Flag" = '1' ]; then
        if [ "$my2fw" -ge '1' ]; then
            echo "my1 mysql service is ok, my2 fw is on, do nothing" >> $CheckFile 2>&1
        else
            iptables -A INPUT -j ACCEPT -s 10.1.0.8/32
            iptables -A INPUT -j ACCEPT -s 10.1.0.4/32
            iptables -A INPUT -j ACCEPT -s 10.1.0.5/32
            iptables -A INPUT -j DROP
            echo "my1 mysql service is ok, my2 fw is off, add firewall" >> $CheckFile 2>&1
        fi
    else
        if [ "$my2fw" = '0' ]; then
            echo "my1 mysql service is not ok, my2 fw is off, my2 provide mysql service, do nothing" >> $CheckFile 2>&1
        else
            # only take over if my2 was in sync at the previous check
            LastFlag=`cat /tmp/LastStatus`
            if [ "$LastFlag" = '1' ]; then
                iptables -F
                echo "my1 mysql service is not ok, my2 fw is on, delete firewall" >> $CheckFile 2>&1
            else
                echo "my1 mysql service is off, but my2 database not sync, my2 can not provide mysql service now, do nothing" >> $CheckFile 2>&1
            fi
        fi
    fi
    my2sqlCheck
    echo $my2Flag > /tmp/LastStatus
    sleep 60
done
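This script runs as a daemon, looping once a minute, and is started from rc.local.
The scripts above are for reference only. In a real production environment, the switchover should be performed with a maintenance page up and database operations suspended.
Summary:
With that, all of the configuration work is complete.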