Today I found a VCS cluster whose state had changed to STALE_ADMIN_WAIT. The fix was as follows:
1. First, check the current state of both machines
cp-etl01:/etc/VRTSvcs/conf/config # hastatus -sum

-- SYSTEM STATE
-- System State Frozen

A cp-etl01 STALE_ADMIN_WAIT 0
A cp-etl02 STALE_ADMIN_WAIT 0
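
The check above can also be scripted rather than read by eye. The following is a minimal sketch, assuming the system rows of `hastatus -sum` keep the layout shown above (`A <system> <state> <frozen>`); `stale_systems` is a hypothetical helper name, not part of VCS.

```shell
# Print the names of systems reported as STALE_ADMIN_WAIT.
# Reads `hastatus -sum` output on stdin.
# stale_systems is a hypothetical helper name, not a VCS command.
stale_systems() {
    # System rows look like: "A <system> <state> <frozen>"
    awk '$1 == "A" && $3 == "STALE_ADMIN_WAIT" { print $2 }'
}
```

Usage on a cluster node would be `hastatus -sum | stale_systems`; empty output means no system is in that state.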

2. Check the running HA processes on both machines
cp-etl01:/etc/VRTSvcs/conf/config # ps -ef |grep had
root 7243 1 0 2009 ? 00:00:00 /opt/VRTSvcs/bin/hashadow
root 4683 1 0 Aug24 ? 00:00:02 /opt/VRTSvcs/bin/had -restart
root 19294 17911 0 11:21 pts/7 00:00:00 grep had

cp-etl02:~ # ps -ef | grep had
root 7278 1 0 2009 ? 00:00:00 /opt/VRTSvcs/bin/hashadow
root 23411 1 0 Aug24 ? 00:00:01 /opt/VRTSvcs/bin/had -restart
root 7012 6981 0 11:22 pts/0 00:00:00 grep had

The HA processes are up on both machines, but the had process is not in a normal state and needs to be restarted.
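
Because `grep had` also matches the grep process itself, a stricter check may be easier to script. This is a minimal sketch, assuming the `ps -ef` column layout shown above (command path in the eighth field); `ha_daemons_running` is a hypothetical helper name.

```shell
# Succeed only if both had and hashadow appear in `ps -ef` output.
# Reads `ps -ef` output on stdin; assumes the command is field 8.
# ha_daemons_running is a hypothetical helper name, not a VCS command.
ha_daemons_running() {
    awk '
        $8 ~ /(^|\/)had$/      { had = 1 }      # e.g. /opt/VRTSvcs/bin/had
        $8 ~ /(^|\/)hashadow$/ { shadow = 1 }   # e.g. /opt/VRTSvcs/bin/hashadow
        END { exit !(had && shadow) }
    '
}
```

Usage would be `ps -ef | ha_daemons_running && echo "had and hashadow are running"`. Note that this only confirms the processes exist; it says nothing about the cluster state that had reports.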

3. Check whether each node can see the other; the membership should end in 01
cp-etl01:/etc/VRTSvcs/conf/config # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 1bc510 membership 01
Port h gen 1bc51b membership 01

cp-etl02:~ # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 1bc510 membership 01
Port h gen 1bc51b membership 01

Both machines can see each other.
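
The membership check can be sketched as a small filter. It assumes the `gabconfig -a` line format shown above (`Port <x> gen <id> membership <nodes>`) and looks at ports a and h, the two ports listed in this output; `gab_ports_ok` is a hypothetical helper name.

```shell
# Succeed only if GAB ports a and h both report the expected membership.
# Reads `gabconfig -a` output on stdin; $1 is the expected membership
# string, e.g. "01" for a two-node cluster with node IDs 0 and 1.
# gab_ports_ok is a hypothetical helper name, not a VCS command.
gab_ports_ok() {
    awk -v want="$1" '
        # Append "" to force a string comparison, so "01" does not equal "1".
        $1 == "Port" && $5 == "membership" && ($6 "") == (want "") { seen[$2] = 1 }
        END { exit !(seen["a"] && seen["h"]) }
    '
}
```

Usage would be `gabconfig -a | gab_ports_ok 01 && echo "both nodes are members"`, run on each node in turn.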

4. Restart the cluster. Run the following on either machine
cp-etl01:/etc/VRTSvcs/conf/config # hastop -all -force
cp-etl01:/etc/VRTSvcs/conf/config # ps -ef |grep had
root 20025 17911 0 11:25 pts/7 00:00:00 grep had

Start the cluster on both machines
cp-etl01:/etc/VRTSvcs/conf/config # hastart
cp-etl02:~ # hastart

5. Check the status
cp-etl01:/etc/VRTSvcs/conf/config # ps -ef |grep had
root 20034 1 0 11:25 ? 00:00:00 /opt/VRTSvcs/bin/had
root 20036 1 0 11:25 ? 00:00:00 /opt/VRTSvcs/bin/hashadow
root 20049 17911 0 11:26 pts/7 00:00:00 grep had
cp-etl01:/etc/VRTSvcs/conf/config # hastatus -sum

-- SYSTEM STATE
-- System State Frozen

A cp-etl01 STALE_ADMIN_WAIT 0

6. Use the first machine to force the cluster up
cp-etl01:/etc/VRTSvcs/conf/config # hostname
cp-etl01

cp-etl01:/etc/VRTSvcs/conf/config # hasys -force cp-etl01
You have new mail in /var/spool/mail/root
cp-etl01:/etc/VRTSvcs/conf/config # hastatus -sum
-- SYSTEM STATE
-- System State Frozen

A cp-etl01 RUNNING 0
A cp-etl02 RUNNING 0

-- GROUP STATE
-- Group System Probed AutoDisabled State

B ETL01 cp-etl01 Y N PARTIAL
B ETL01 cp-etl02 Y N OFFLINE
B ETL02 cp-etl01 Y N OFFLINE
B ETL02 cp-etl02 Y N ONLINE
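
In the group table above, ETL01 is only PARTIAL on cp-etl01 and OFFLINE on cp-etl02, so it is not fully online anywhere. A minimal sketch for spotting such groups, assuming the group rows keep the layout shown above (`B <group> <system> <probed> <autodisabled> <state>`); `groups_not_online` is a hypothetical helper name.

```shell
# Print service groups that are not ONLINE on any system.
# Reads `hastatus -sum` output on stdin.
# groups_not_online is a hypothetical helper name, not a VCS command.
groups_not_online() {
    awk '
        $1 == "B" {
            seen[$2] = 1                          # every group that appears
            if ($6 == "ONLINE") online[$2] = 1    # groups ONLINE somewhere
        }
        END { for (g in seen) if (!(g in online)) print g }
    ' | sort
}
```

Usage would be `hastatus -sum | groups_not_online`; for the output above it would list ETL01.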

The cluster is now back in a normal state, but it cannot protect the application yet: if the application goes down, no failover will take place.
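
For reference, steps 4 to 6 can be collected into one sketch. It only prints the commands unless `DRY_RUN=0` is set. The names `run` and `recover_stale_cluster` are hypothetical, and reaching the second node over ssh is an assumption; the original session ran `hastart` in a separate terminal on each machine.

```shell
# Sketch of steps 4-6. Prints the commands unless DRY_RUN=0 is set.
# run and recover_stale_cluster are hypothetical names, not VCS commands.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

recover_stale_cluster() {
    local_node=$1   # node used to force the cluster up, e.g. cp-etl01
    peer_node=$2    # the other cluster node, e.g. cp-etl02

    run hastop -all -force            # step 4: stop had on all nodes
    run hastart                       # start had on the local node
    run ssh "$peer_node" hastart      # start had on the peer (ssh is an assumption)
    run hasys -force "$local_node"    # step 6: force the cluster up from the local node
}
```

Calling `recover_stale_cluster cp-etl01 cp-etl02` prints the four commands in order. In a real run it would be sensible to check `hastatus -sum` between `hastart` and `hasys -force`, as the session above does, rather than forcing immediately.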
