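I. Starting Heartbeat
1. Starting Heartbeat on the master node
After Heartbeat is installed, a startup script named heartbeat is created automatically under /etc/init.d. Running /etc/init.d/heartbeat with no arguments prints the script's usage: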
- [root@node1 ~]# /etc/init.d/heartbeat
- Usage: /etc/init.d/heartbeat {start|stop|status|restart|reload|force-reload}
Heartbeat can therefore be started with:
- [root@node1 ~]# service heartbeat start
- or, equivalently:
- [root@node1 ~]# /etc/init.d/heartbeat start
This starts the heartbeat service on the master node.
The log output is as follows:
- Feb 5 19:09:48 node1 heartbeat: [22768]: info: glib: ucast: bound send socket to device: eth0
- Feb 5 19:09:48 node1 heartbeat: [22768]: info: glib: ucast: bound receive socket to device: eth0
- Feb 5 19:09:48 node1 heartbeat: [22768]: info: glib: ucast: started on port 694 interface eth0 to 192.168.12.1
- Feb 5 19:09:48 node1 heartbeat: [22768]: info: glib: ping heartbeat started.
- Feb 5 19:09:48 node1 heartbeat: [22768]: info: glib: ping group heartbeat started.
- Feb 5 19:09:48 node1 heartbeat: [22768]: info: Local status now set to: `up`
- Feb 5 19:09:49 node1 heartbeat: [22768]: info: Link 192.168.12.1:192.168.12.1 up.
- Feb 5 19:09:49 node1 heartbeat: [22768]: info: Status update for node 192.168.12.1: status ping
- Feb 5 19:09:49 node1 heartbeat: [22768]: info: Link group1:group1 up.
- Feb 5 19:09:49 node1 heartbeat: [22768]: info: Status update for node group1: status ping
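This part of the log shows Heartbeat initializing its configuration, for example the heartbeat interval, the UDP port (694), and the state of the ping nodes. The log then pauses for 120 seconds (19:09:48 to 19:11:48) before heartbeat writes anything further; those 120 seconds are exactly the value of the "initdead" option in ha.cf. Heartbeat then logs the following: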
- Feb 5 19:11:48 node1 heartbeat: [22768]: WARN: node node2: is dead
- Feb 5 19:11:48 node1 heartbeat: [22768]: info: Comm_now_up(): updating status to active
- Feb 5 19:11:48 node1 heartbeat: [22768]: info: Local status now set to: `active`
- Feb 5 19:11:48 node1 heartbeat: [22768]: info: Starting child client "/usr/local/ha/lib/heartbeat/pingd -m 100 -d 5s" (102,105)
- Feb 5 19:11:49 node1 heartbeat: [22768]: WARN: No STONITH device configured.
- Feb 5 19:11:49 node1 heartbeat: [22768]: WARN: Shared disks are not protected.
- Feb 5 19:11:49 node1 heartbeat: [22768]: info: Resources being acquired from node2.
- Feb 5 19:11:49 node1 heartbeat: [22794]: info: Starting "/usr/local/ha/lib/heartbeat/pingd -m 100 -d 5s" as uid 102 gid 105 (pid 22794)
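In this part of the log, node2 has not been started yet, so heartbeat issues the warning "node node2: is dead". It then starts the heartbeat plug-in pingd. Because no STONITH device is configured in ha.cf, the log also contains the warning "No STONITH device configured".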
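The 120-second pause before these messages comes from initdead. As a minimal sketch, the shell function below reads a directive's value out of an ha.cf-style file, so the pause can be checked against the configuration. The function name ha_option is hypothetical, and the default path /usr/local/ha/etc/ha.d/ha.cf is an assumption inferred from the /usr/local/ha prefix visible in the log; adjust it to your installation.

```shell
# Print the first argument of a single-valued ha.cf directive.
# Usage: ha_option NAME [FILE]
# ASSUMPTION: the default FILE is guessed from the /usr/local/ha prefix in the logs.
ha_option() {
    name=$1
    file=${2:-/usr/local/ha/etc/ha.d/ha.cf}
    # Commented-out lines start with "#", so their first field never
    # equals the bare directive name and they are skipped automatically.
    # Only the first matching line is printed.
    awk -v n="$name" '$1 == n { print $2; exit }' "$file"
}
```

With the configuration described here, running `ha_option initdead` would be expected to print 120, matching the gap between the two groups of log lines.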
The log continues:
- Feb 5 19:11:50 node1 IPaddr[22966]: INFO: Resource is stopped
- Feb 5 19:11:50 node1 ResourceManager[22938]: info: Running /usr/local/ha/etc/ha.d/resource.d/IPaddr 192.168.12.135 start
- Feb 5 19:11:50 node1 IPaddr[23029]: INFO: Using calculated nic for 192.168.12.135: eth0
- Feb 5 19:11:50 node1 IPaddr[23029]: INFO: Using calculated netmask for 192.168.12.135: 255.255.255.0
- Feb 5 19:11:51 node1 pingd: [22794]: info: attrd_lazy_update: Connecting to cluster... 5 retries remaining
- Feb 5 19:11:51 node1 IPaddr[23029]: INFO: eval ifconfig eth0:0 192.168.12.135 netmask 255.255.255.0 broadcast 192.168.12.255
- Feb 5 19:11:51 node1 avahi-daemon[2455]: Registering new address record for 192.168.12.135 on eth0.
- Feb 5 19:11:51 node1 IPaddr[23015]: INFO: Success
- Feb 5 19:11:51 node1 Filesystem[23134]: INFO: Resource is stopped
- Feb 5 19:11:51 node1 ResourceManager[22938]: info: Running /usr/local/ha/etc/ha.d/resource.d/Filesystem /dev/sdf1 /data1 ext3 start
- Feb 5 19:11:52 node1 Filesystem[23213]: INFO: Running start for /dev/sdf1 on /data1
- Feb 5 19:11:52 node1 kernel: kjournald starting. Commit interval 5 seconds
- Feb 5 19:11:52 node1 kernel: EXT3 FS on sdf1, internal journal
- Feb 5 19:11:52 node1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
- Feb 5 19:11:52 node1 Filesystem[23205]: INFO: Success
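This part of the log shows resources being checked and taken over. It carries out what is configured in the haresources file, which here means bringing up the cluster virtual IP and mounting the disk partition.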
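To make the link between haresources and the "Running ... start" log lines concrete, here is a rough sketch that expands a haresources-style line into the resource script calls it implies. The function name expand_haresources and the RESOURCE_DIR variable are hypothetical, and the sample line in the comment is a guess reconstructed from the log, not the author's actual file. The sketch handles only the "name::arg::arg" form; it ignores shorthand entries such as a bare IP address and backslash line continuations.

```shell
# Expand haresources-style lines on stdin into the "start" commands they imply.
# ASSUMED line format: "<preferred-node> <resource>[::arg]... <resource>[::arg]..."
# Example input (reconstructed from the log, NOT taken from the real file):
#   node1 IPaddr::192.168.12.135 Filesystem::/dev/sdf1::/data1::ext3
expand_haresources() {
    awk -v dir="${RESOURCE_DIR:-/usr/local/ha/etc/ha.d/resource.d}" '
        NF == 0 || $1 ~ /^#/ { next }       # skip blank lines and comments
        {
            for (i = 2; i <= NF; i++) {     # field 1 is the preferred node
                n = split($i, part, "::")   # resource name, then its arguments
                cmd = dir "/" part[1]
                for (j = 2; j <= n; j++)
                    cmd = cmd " " part[j]
                print cmd " start"
            }
        }'
}
```

Fed the example line, the function prints one command per resource in left-to-right order, which is the order in which the IPaddr and Filesystem entries appear in the log above.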
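At this point, ifconfig on the master node shows that the cluster IP address has been bound automatically (the log above shows it on the alias eth0:0). Pinging the cluster IP address 192.168.12.135 from a host outside the HA cluster now succeeds, which means the address is available.
Checking the mounted file systems likewise shows that the shared disk partition /dev/sdf1 has been mounted automatically (on /data1, according to the log).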
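These two checks can be scripted. The following is a minimal sketch with two hypothetical helper functions, has_addr and is_mounted, that read command output from standard input, so they work with ifconfig output and with /proc/mounts alike. The address, device, and mount point in the commented usage lines are the ones from the log above.

```shell
# Succeed if the given IPv4 address appears in the text on stdin.
# -F treats the address as a fixed string; -w requires a whole-word match,
# so 192.168.12.13 does not match inside 192.168.12.135.
has_addr() {
    grep -qwF -- "$1"
}

# Succeed if DEVICE is mounted on MOUNTPOINT, given /proc/mounts-format
# text on stdin (device in field 1, mount point in field 2).
is_mounted() {
    awk -v d="$1" -v m="$2" '$1 == d && $2 == m { found = 1 } END { exit !found }'
}

# Example usage on the node that should currently hold the resources:
#   ifconfig | has_addr 192.168.12.135 && echo "cluster IP is bound"
#   is_mounted /dev/sdf1 /data1 < /proc/mounts && echo "shared partition is mounted"
```

Running the same two checks on both nodes after each failover test gives a quick view of which node holds the resources.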
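2. Starting Heartbeat on the standby node
Heartbeat is started on the standby node the same way as on the master node:
- [root@node2 ~]# /etc/init.d/heartbeat start
- or, equivalently:
- [root@node2 ~]# service heartbeat start
This starts the heartbeat service on the standby node. Its log output mirrors that of the master node; "tail -f /var/log/messages" shows the following: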
- Feb 19 02:52:15 node2 heartbeat: [26880]: info: Pacemaker support: false
- Feb 19 02:52:15 node2 heartbeat: [26880]: info: **************************
- Feb 19 02:52:15 node2 heartbeat: [26880]: info: Configuration validated. Starting heartbeat 3.0.4
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: heartbeat: version 3.0.4
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: Heartbeat generation: 1297766398
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: glib: UDP multicast heartbeat started for group 225.0.0.1 port 694 interface eth0 (ttl=1 loop=0)
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: glib: ucast: bound send socket to device: eth0
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: glib: ping heartbeat started.
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: glib: ping group heartbeat started.
- Feb 19 02:52:15 node2 heartbeat: [26881]: info: Local status now set to: `up`
- Feb 19 02:52:16 node2 heartbeat: [26881]: info: Link node1:eth0 up.
- Feb 19 02:52:16 node2 heartbeat: [26881]: info: Status update for node node1: status active
- Feb 19 02:52:16 node2 heartbeat: [26881]: info: Link 192.168.12.1:192.168.12.1 up.
- Feb 19 02:52:16 node2 heartbeat: [26881]: info: Status update for node 192.168.12.1: status ping
- Feb 19 02:52:16 node2 heartbeat: [26881]: info: Link group1:group1 up.
- Feb 19 02:52:16 node2 harc[26894]: info: Running /usr/local/ha/etc/ha.d//rc.d/status status
- Feb 19 02:52:17 node2 heartbeat: [26881]: info: Comm_now_up(): updating status to active
- Feb 19 02:52:17 node2 heartbeat: [26881]: info: Local status now set to: `active`
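II. Testing Heartbeat's high-availability function
How can you tell whether the HA cluster is working properly? Testing in a simulated environment is a good approach. Before a Heartbeat high-availability cluster goes into production, run the following tests to confirm that HA works as expected: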
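(1) Stop and restart heartbeat on the master node normally
First run "service heartbeat stop" on the master node node1 to shut down Heartbeat cleanly. Checking the master's network interfaces with ifconfig should show that it has released the cluster service IP address; it should also have unmounted the shared disk partition. On the standby node, the cluster service IP has been taken over and the shared disk partition has been mounted automatically.
During this process, pinging the cluster service IP shows that it stays reachable throughout, with no noticeable delay or blocking. In other words, when the master node is shut down cleanly, the switchover is seamless and the service provided by the HA cluster continues uninterrupted.
Next, start heartbeat on the master node again. Once it is running, the standby node automatically releases the cluster service IP and unmounts the shared disk partition, and the master node takes over the IP and mounts the partition again. The standby's release and the master's acquisition are coordinated with each other, so this switch back is also seamless.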
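To put a number on "stays reachable", the ping check can be run as a small loop from a host outside the cluster while heartbeat is stopped and restarted. This is only a sketch: the function name probe_vip and the PROBE override variable are hypothetical, and the default "ping -c 1 -W 1" assumes the Linux iputils ping, where -W is a timeout in seconds.

```shell
# Probe an address COUNT times, INTERVAL seconds apart, and count the failures.
# Usage: probe_vip ADDRESS COUNT [INTERVAL]
# PROBE (hypothetical override) replaces the default single-packet ping.
probe_vip() {
    vip=$1
    count=$2
    interval=${3:-1}
    lost=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # Intentionally unquoted so the default expands to "ping -c 1 -W 1".
        if ! ${PROBE:-ping -c 1 -W 1} "$vip" >/dev/null 2>&1; then
            lost=$((lost + 1))
            echo "$(date '+%H:%M:%S') no reply from $vip"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "lost $lost of $count probes"
}

# Example: start this on an outside host, then stop heartbeat on node1.
#   probe_vip 192.168.12.135 60
```

A probe interval of one second can only reveal outages of roughly a second or longer, so a clean run supports the claim of a seamless switch at that granularity.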
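(2) Unplug the network cable on the master node
After the cable connecting the master node to the public network is unplugged, the heartbeat plug-in ipfail quickly detects the connection failure through its ping test, and the master releases its resources. Meanwhile, ipfail on the standby node also detects the master's network failure; once the master has finished releasing its resources, the standby takes over the cluster resources, so the network service keeps running.
Likewise, when the master node's network recovers, the cluster resources automatically switch back from the standby node to the master node, because the "auto_failback on" option is set.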
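(3) Shut down the master node's system
When power is pulled from the master node, it has no chance to announce a shutdown. The standby node's heartbeat process stops receiving heartbeats from it, declares it dead once the dead time (the "deadtime" option in ha.cf) expires, and then takes over the resources. The behavior is much the same as with a network failure on the master node.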
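(4) Crash the kernel on the master node
When the master node's kernel crashes, its network stops responding, so the standby node's heartbeat process detects the failure and starts switching resources. Because the kernel has crashed, however, the master cannot release the resources it holds, such as the shared disk partition and the cluster service IP. Without something like a STONITH device, the two nodes can end up contending for those resources. With a STONITH device, the device first powers off or reboots the failed master, which forces it to release the cluster resources; only after the STONITH operation completes does the standby node gain the right to take over the master's resources, and it then does so.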
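(End)
Related posts in this series:
Heartbeat 3.x Complete Guide: Concepts, Components, and How It Works http://ixdba.blog.51cto.com/2895551/745228
Heartbeat 3.x Complete Guide: Installation, Configuration, and Maintenance http://ixdba.blog.51cto.com/2895551/746271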