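System: CentOS 6.3 x64 (kernel 2.6.32)

DRBD: DRBD-8.4.3

HeartBeat: installed from the EPEL repository (a real pitfall)

NFS: the version shipped with the system

HeartBeat VIP: 192.168.7.90

Primary DRBD+HeartBeat: 192.168.7.88 (drbd1.example.com)

Secondary DRBD+HeartBeat: 192.168.7.89 (drbd2.example.com)

(Primary) marks configuration done only on the primary server

(Secondary) marks configuration done only on the secondary server

(Primary,Secondary) marks configuration done on both the primary and the secondary server

I. DRBD configuration, see: http://showerlee.blog.51cto.com/2047005/1211963

II. Heartbeat configuration

This continues from the system environment and installation set up in the DRBD article:

1. Install heartbeat (CentOS 6.3 does not ship a Heartbeat package by default, so it has to come from a third-party repository) (Primary,Secondary)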

# wget ftp://mirror.switch.ch/pool/1/mirror/scientificlinux/6rolling/i386/os/Packages/epel-release-6-5.noarch.rpm

# rpm -ivUh epel-release-6-5.noarch.rpm

# yum --enablerepo=epel install heartbeat -y

2. Configure heartbeat

(Primary)

# vim /etc/ha.d/ha.cf

---------------

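# Logging

logfile /var/log/ha-log

logfacility local0

# Heartbeat interval (seconds)

keepalive 2

# Dead time: seconds without a heartbeat before the peer is declared dead

deadtime 5

# IP address of the peer node:

ucast eth0 192.168.7.89

# off: when a failed node comes back, the resources stay on the node that currently holds them (no automatic failback)

auto_failback off

# Define the nodes (names must match the output of uname -n on each host)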

node drbd1.example.com drbd2.example.com

---------------

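(Secondary)

# vi /etc/ha.d/ha.cf

---------------

# Logging

logfile /var/log/ha-log

logfacility local0

# Heartbeat interval (seconds)

keepalive 2

# Dead time: seconds without a heartbeat before the peer is declared dead

deadtime 5

# IP address of the peer node:

ucast eth0 192.168.7.88

# off: when a failed node comes back, the resources stay on the node that currently holds them (no automatic failback)

auto_failback off

# Define the nodes (names must match the output of uname -n on each host)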

node drbd1.example.com drbd2.example.com

---------------
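The two ha.cf files above differ only in the ucast peer address. As a minimal sketch, the file can be generated from that one value so the two nodes cannot drift apart. The function name render_ha_cf is hypothetical and not part of Heartbeat; it only prints to standard output, and redirecting the result to /etc/ha.d/ha.cf is left to the reader.

```shell
#!/bin/bash
# Hypothetical helper (not part of Heartbeat): print the ha.cf used in this
# article, with only the ucast peer address varying between the two nodes.
render_ha_cf() {
    local peer_ip="$1"
    cat <<EOF
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast eth0 ${peer_ip}
auto_failback off
node drbd1.example.com drbd2.example.com
EOF
}

# On the primary (192.168.7.88) the peer is the secondary, and vice versa.
render_ha_cf 192.168.7.89
```

On the secondary the same call would be made with 192.168.7.88.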

Edit the authentication file shared by the two nodes: (Primary,Secondary)

# vim /etc/ha.d/authkeys

--------------

auth 1

1 crc

--------------

# chmod 600 /etc/ha.d/authkeys
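The crc method used above only detects corrupted packets and does not authenticate the peer, so it is really only suitable on a trusted link. Heartbeat's authkeys file also accepts md5 and sha1 with a shared secret. The following is a rough sketch of generating such a file; the helper name make_authkeys and the temporary output path are assumptions for illustration, and the resulting file would have to be copied unchanged to both nodes.

```shell
#!/bin/bash
# Hypothetical helper: write an authkeys file that uses sha1 with a random
# shared secret. The target path is a parameter so the result can be
# inspected before it is copied to /etc/ha.d/authkeys on BOTH nodes.
make_authkeys() {
    local target="$1" key
    # 16 random bytes rendered as 32 hexadecimal characters
    key=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
    printf 'auth 1\n1 sha1 %s\n' "$key" > "$target"
    # Heartbeat refuses to start if authkeys is readable by others
    chmod 600 "$target"
}

out=$(mktemp)
make_authkeys "$out"
cat "$out"
rm -f "$out"
```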

Edit the cluster resource file: (Primary,Secondary)

# vim /etc/ha.d/haresources

--------------

drbd1.example.com IPaddr::192.168.7.90/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext4 killnfsd

--------------

The scripts named in this file, such as IPaddr and Filesystem, live under /etc/ha.d/resource.d/.
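A haresources line reads as: preferred node first, then the resources, each written as a script name with "::"-separated arguments. Heartbeat starts them from left to right and stops them in the reverse order. The sketch below only illustrates that structure; explain_haresources is a hypothetical name, and the function assumes the line contains no shell glob characters.

```shell
#!/bin/bash
# Hypothetical helper: break one haresources line into its resource scripts
# and arguments, in start order (Heartbeat stops them in the reverse order).
explain_haresources() {
    local line="$1" node res script args
    set -- $line                      # split the line on whitespace
    node="$1"; shift
    echo "preferred node: $node"
    for res in "$@"; do
        script="${res%%::*}"          # text before the first "::"
        if [ "$script" = "$res" ]; then
            echo "start: $script"     # resource without arguments
        else
            args="${res#*::}"         # text after the first "::"
            echo "start: $script ${args//::/ }"
        fi
    done
}

explain_haresources 'drbd1.example.com IPaddr::192.168.7.90/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext4 killnfsd'
```

For the line used in this article the order is therefore: bring up the virtual IP, make DRBD resource r0 primary, mount /dev/drbd0 on /data, then run killnfsd.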

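Create the script killnfsd, which restarts the NFS service:

Note: after a failover the NFS-exported directory would otherwise have to be mounted again to avoid errors, so NFS is restarted on the node that takes over.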

# vim /etc/ha.d/resource.d/killnfsd

-----------------

killall -9 nfsd; /etc/init.d/nfs restart;exit 0

-----------------

Make it executable:

# chmod 755 /etc/ha.d/resource.d/killnfsd

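Create the DRBD resource script drbddisk: (Primary,Secondary)

Note:

This is another big pitfall. Anyone unfamiliar with Heartbeat's directory layout is likely to get stuck here, because a default yum installation of Heartbeat does not create the drbddisk script under /etc/ha.d/resource.d/, and the file cannot be found anywhere else on the system after installation either.

I ran into this myself: after starting Heartbeat the virtual IP could not be pinged, and /var/log/ha-log turned out to contain the line

ERROR: Cannot locate resource script drbddisk

Looking in /etc/ha.d/resource.d/ then showed that there really was no drbddisk script. I finally found the code through Google and created the script, after which the test passed: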

# vim /etc/ha.d/resource.d/drbddisk

-----------------------

#!/bin/bash

#

# This script is intended to be used as resource script by heartbeat

#

# Copyright 2003-2008 LINBIT Information Technologies

# Philipp Reisner, Lars Ellenberg

#

###

DEFAULTFILE="/etc/default/drbd"

DRBDADM="/sbin/drbdadm"

if [ -f $DEFAULTFILE ]; then

. $DEFAULTFILE

fi

if [ "$#" -eq 2 ]; then

RES="$1"

CMD="$2"

else

RES="all"

CMD="$1"

fi

## EXIT CODES

# since this is a "legacy heartbeat R1 resource agent" script,

# exit codes actually do not matter that much as long as we conform to

#

# but it does not hurt to conform to lsb init-script exit codes,

# where we can.

#

#LSB-Core-generic/LSB-Core-generic/iniscrptact.html

####

drbd_set_role_from_proc_drbd()

{

local out

if ! test -e /proc/drbd; then

ROLE="Unconfigured"

return

fi

dev=$( $DRBDADM sh-dev $RES )

minor=${dev#/dev/drbd}

if [[ $minor = *[!0-9]* ]] ; then

# sh-minor is only supported since drbd 8.3.1

minor=$( $DRBDADM sh-minor $RES )

fi

if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then

ROLE=Unknown

return

fi

if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then

set -- $out

ROLE=${5%/**}

: ${ROLE:=Unconfigured} # if it does not show up

else

ROLE=Unknown

fi

}

case "$CMD" in

start)

# try several times, in case heartbeat deadtime

# was smaller than drbd ping time

try=6

while true; do

$DRBDADM primary $RES && break

let "--try" || exit 1 # LSB generic error

sleep 1

done

;;

stop)

# heartbeat (haresources mode) will retry failed stop

# for a number of times in addition to this internal retry.

try=3

while true; do

$DRBDADM secondary $RES && break

# We used to lie here, and pretend success for anything != 11,

# to avoid the reboot on failed stop recovery for "simple

# config errors" and such. But that is incorrect.

# Don't lie to your cluster manager.

# And don't do config errors...

let --try || exit 1 # LSB generic error

sleep 1

done

;;

status)

if [ "$RES" = "all" ]; then

echo "A resource name is required for status inquiries."

exit 10

fi

ST=$( $DRBDADM role $RES )

ROLE=${ST%/**}

case $ROLE in

Primary|Secondary|Unconfigured)

# expected

;;

*)

# unexpected. whatever...

# If we are unsure about the state of a resource, we need to

# report it as possibly running, so heartbeat can, after failed

# stop, do a recovery by reboot.

# drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is

# suddenly readonly. So we retry by parsing /proc/drbd.

drbd_set_role_from_proc_drbd

esac

case $ROLE in

Primary)

echo "running (Primary)"

exit 0 # LSB status "service is OK"

;;

Secondary|Unconfigured)

echo "stopped ($ROLE)"

exit 3 # LSB status "service is not running"

;;

*)

# NOTE the "running" in below message.

# this is a "heartbeat" resource script,

# the exit code is _ignored_.

echo "cannot determine status, may be running ($ROLE)"

exit 4 # LSB status "service status is unknown"

;;

esac

;;

*)

echo "Usage: drbddisk [resource] {start|stop|status}"

exit 1

;;

esac

exit 0

-----------------------

Make it executable:

# chmod 755 /etc/ha.d/resource.d/drbddisk
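The "Cannot locate resource script" error above can be caught before Heartbeat is started. The following is a minimal sketch of such a pre-flight check. The function name missing_resource_scripts is hypothetical, and the check is a simplification: it looks only in the directory passed to it, whereas Heartbeat itself may also search other locations.

```shell
#!/bin/bash
# Hypothetical pre-flight check: print the resource scripts named in a
# haresources file that are missing or not executable in the given directory.
missing_resource_scripts() {
    local haresources="$1" resource_dir="$2" node rest res script
    # Skip blank lines and comments; the first field is the node name.
    grep -Ev '^[[:space:]]*(#|$)' "$haresources" | while read -r node rest; do
        for res in $rest; do
            script="${res%%::*}"      # drop the "::"-separated arguments
            [ -x "$resource_dir/$script" ] || echo "$script"
        done
    done
}

# Typical use on a node (prints nothing when everything is in place):
# missing_resource_scripts /etc/ha.d/haresources /etc/ha.d/resource.d
```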

Start the HeartBeat service on both nodes, Primary first: (Primary,Secondary)

# service heartbeat start

# chkconfig heartbeat on

If the virtual IP 192.168.7.90 now answers ping, the configuration works.
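A successful ping only shows that some node holds the virtual IP. To see whether the local node is the one holding it, the address list can be inspected. The sketch below assumes the iproute2 `ip` command is available; has_address is a hypothetical helper that reads `ip -o -4 addr show` style output from standard input.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given IPv4 address appears in
# "ip -o -4 addr show" style output read from standard input.
has_address() {
    local ip="$1"
    # Match "inet <ip>/<prefix> " exactly; dots in the address are escaped.
    grep -Eq "inet ${ip//./\\.}/[0-9]+ "
}

# On the node that currently holds the resources this is expected to succeed:
# ip -o -4 addr show | has_address 192.168.7.90 && echo "VIP is on this node"
```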

III. Configure NFS: (Primary,Secondary)

# vi /etc/exports

-----------------

/data *(rw,no_root_squash)

-----------------

Restart the NFS service:

# service rpcbind restart

# service nfs restart

# chkconfig rpcbind on

# chkconfig nfs off

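NFS is set not to start at boot here, because the script /etc/ha.d/resource.d/killnfsd takes care of starting it.

IV. Final test

Mount the virtual IP 192.168.7.90 from another Linux machine acting as the client; a successful mount shows that NFS+DRBD+HeartBeat is working.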

# mount -t nfs 192.168.7.90:/data /tmp

# df -h

---------------

......

192.168.7.90:/data 1020M 34M 934M 4% /tmp

---------------

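Testing the availability of DRBD+HeartBeat+NFS:

1. Copy a file into the mounted /tmp directory, restart the DRBD service on the primary in the middle of the transfer, and watch what happens.

In my test the transfer resumed from where it was interrupted.

2. Reboot the Primary host under normal conditions, then check whether its DRBD role returns to Primary, whether the client can still use the mount, whether previously written files are still there, and whether new files can be written.

In my test everything recovered normally: the client did not need to remount the NFS share, the earlier data was intact, and files could be written straight away.

3. If the Primary host cannot be used for a while because of a hardware failure, how is the Secondary promoted to Primary?

If the machine can still boot normally, follow the steps below. If it cannot boot, forcibly promote the Secondary to Primary; once the failed machine boots again, repair any "split-brain" condition afterwards.

First unmount the NFS directory on the client: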

# umount /tmp

(Primary)

Unmount the DRBD device:

# service nfs restart

# umount /data

Demote:

# drbdadm secondary r0

Check the status; the node has been demoted:

# service drbd status

-----------------

drbd driver loaded OK; device status:

version: 8.4.3 (api:1/proto:86-101)

GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd1.example.com, 2013-05-27 20:45:19

m:res cs ro ds p mounted fstype

0:r0 Connected Secondary/Secondary UpToDate/UpToDate C

-----------------
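The local role is the part before the "/" in the ro column of the status output above. As a small sketch, it can be extracted from that output for use in scripts; drbd_local_role is a hypothetical helper that reads the status text from standard input. On a live node, `drbdadm role r0` reports the same role pair directly, which is what the drbddisk script relies on.

```shell
#!/bin/bash
# Hypothetical helper: read "service drbd status" output on standard input
# and print the local role of one resource (the part before "/" in "ro").
drbd_local_role() {
    local res="$1"
    awk -v res="$res" '
        # Resource lines look like "0:r0 Connected Secondary/Secondary ..."
        $1 ~ /^[0-9]+:/ {
            split($1, id, ":")
            if (id[2] == res) {
                split($3, ro, "/")
                print ro[1]
                exit
            }
        }'
}

# Typical use on a node:
# service drbd status | drbd_local_role r0
```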

(Secondary)

Promote:

# drbdadm primary r0

Check the status; the node has been promoted:

# service drbd status

----------------

drbd driver loaded OK; device status:

version: 8.4.3 (api:1/proto:86-101)

GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06

m:res cs ro ds p mounted fstype

0:r0 Connected Primary/Secondary UpToDate/UpToDate C

----------------

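The DRBD device is not mounted yet; let Heartbeat mount it:

Note: if the Heartbeat log shows the following error during the restart:

ERROR: glib: ucast: error binding socket. Retrying: Permission denied

check whether SELinux has been disabled.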

# service heartbeat restart

# service drbd status

-----------------------

drbd driver loaded OK; device status:

version: 8.4.3 (api:1/proto:86-101)

GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06

m:res cs ro ds p mounted fstype

0:r0 Connected Primary/Secondary UpToDate/UpToDate C /data ext4

------------------------

HeartBeat has mounted the DRBD device successfully.
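The manual switchover steps above can be collected into two small functions, one for each host. This is only a sketch: the function names and the DRY_RUN switch are assumptions for illustration, the commands are the ones used in this article, and by default nothing is executed, the commands are only printed.

```shell
#!/bin/bash
# Hypothetical wrapper around the manual switchover steps in this article.
# DRY_RUN defaults to 1, so commands are only printed unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the old primary: release /data, then demote the DRBD resource.
demote_node() {
    run service nfs restart &&
    run umount /data &&
    run drbdadm secondary r0
}

# Run on the old secondary: promote it, then let Heartbeat mount /data.
promote_node() {
    run drbdadm primary r0 &&
    run service heartbeat restart
}

# Preview the commands for each host:
# demote_node     (on the old primary)
# promote_node    (on the old secondary)
```

Chaining the steps with && stops the sequence at the first failure, so the resource is not demoted while /data is still mounted.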

Repeat the NFS mount test on the client:

# mount -t nfs 192.168.7.90:/data /tmp

# ll /tmp

------------------

1 10 2 2222 3 4 5 6 7 8 9 lost+found orbit-root

------------------

Reboot the host that was just promoted, and check the status once it is back up:

# service drbd status

------------------------

drbd driver loaded OK; device status:

version: 8.4.3 (api:1/proto:86-101)

GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06

m:res cs ro ds p mounted fstype

0:r0 WFConnection Primary/Unknown UpToDate/DUnknown C /data ext4

------------------------

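HeartBeat mounted the DRBD device successfully, and the client keeps using the NFS mount point seamlessly and transparently.

4. Once the machine that went down is back to normal, does it take the Primary resources back?

No. After the reboot it does not reclaim the resources; the primary and secondary roles have to be switched by hand.

Note: this is controlled by the following parameter in /etc/ha.d/ha.cf: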

--------------------

auto_failback off

--------------------

It means that after the failed server recovers, the new primary keeps the resources and the old primary does not take them back.
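Since this behaviour depends on both nodes carrying the same setting, it can be worth confirming the value on each of them. The following sketch reads one directive from an ha.cf file; ha_cf_value is a hypothetical helper, and printing the last occurrence when a directive is repeated is a choice of this sketch, not a statement about how Heartbeat resolves duplicates.

```shell
#!/bin/bash
# Hypothetical helper: print the value of one directive from an ha.cf file,
# ignoring comment lines; if the directive is repeated, the last one wins.
ha_cf_value() {
    local file="$1" directive="$2"
    awk -v d="$directive" '
        /^[[:space:]]*#/ { next }     # skip comment lines
        $1 == d {
            $1 = ""                   # drop the directive name
            sub(/^ +/, "")            # and the separator left behind
            value = $0
        }
        END { if (value != "") print value }
    ' "$file"
}

# Typical use on each node:
# ha_cf_value /etc/ha.d/ha.cf auto_failback
```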

-------All done----------

Notes on configuring DRBD + HeartBeat + NFS on CentOS 6.3
