[Shared by kingsql] Fixing widespread UNKNOWN states in RAC and CRS-0216: Could not stop resource

Posted by kingsql on 2014-10-21
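The clusterware behind a RAC environment differs from platform to platform. For example, I have heard of 9i databases running on HP two-node clusters based on Service Guard and on AIX two-node clusters based on HA. Those are older architectures,
and when a clusterware problem occurs on them you have to sort out both the volume groups and the clusterware, which is fairly troublesome.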

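On 10g under Linux, Oracle's own clusterware is quite robust: even when it fails, the problem can usually be resolved quickly. That is not because engineers have become more skilled, but because the clusterware itself has become more capable.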

1. The RAC environment ended up in the state below, with all instances shut down
[oracle@rac1 ~]$ crs_stat -t 
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    UNKNOWN   rac1        
ora....C1.lsnr application    ONLINE    UNKNOWN   rac1        
ora.rac1.gsd   application    ONLINE    UNKNOWN   rac1        
ora.rac1.ons   application    ONLINE    UNKNOWN   rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    OFFLINE   UNKNOWN   rac2        
ora....C2.lsnr application    OFFLINE   UNKNOWN   rac2        
ora.rac2.gsd   application    ONLINE    UNKNOWN   rac2        
ora.rac2.ons   application    ONLINE    UNKNOWN   rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        
ora.testora.db application    ONLINE    UNKNOWN   rac2        
ora....ataf.cs application    ONLINE    UNKNOWN   rac2        
ora....ra1.srv application    ONLINE    UNKNOWN   rac1        
ora....ra2.srv application    ONLINE    UNKNOWN   rac2        
ora....a1.inst application    ONLINE    OFFLINE               
ora....a2.inst application    ONLINE    OFFLINE     
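
The `-t` view truncates long resource names (for example `ora....SM1.asm`), while the `crs_stop` and `crs_start` commands in the later steps need the full names. The following is a minimal sketch for listing them, assuming that plain `crs_stat` (without `-t`) prints each resource as a block of `NAME=`, `TYPE=`, `TARGET=` and `STATE=` lines; verify that layout on your own system first. The function name `crs_full_table` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn the key=value blocks printed by plain `crs_stat`
# into one line per resource with the full, untruncated name.
# Assumed input layout (check it against your own crs_stat output):
#   NAME=ora.rac1.ASM1.asm
#   TYPE=application
#   TARGET=ONLINE
#   STATE=UNKNOWN on rac1
crs_full_table() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        $1 == "STATE"  {
            # STATE is either "<state> on <host>" or just "<state>"
            n = split($2, parts, " ")
            host = (n >= 3) ? parts[3] : "-"
            printf "%s %s %s %s\n", name, target, parts[1], host
        }
    '
}

# Usage on a cluster node (not run here):
#   crs_stat | crs_full_table
```

Each output line has the form `name target state host`, with `-` as the host for resources that are not placed on any node.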

2. At first I planned to restart the clusterware services on each node
[root@rac1 ~]#  /etc/init.d/init.crs stop 
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Successfully stopped CRS resources 
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
Shutdown has begun. The daemons should exit soon.
[root@rac1 ~]#  /etc/init.d/init.crs start
Startup will be queued to init within 90 seconds.

[root@rac2 ~]#  /etc/init.d/init.crs stop 
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Successfully stopped CRS resources 
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
Shutdown has begun. The daemons should exit soon.
[root@rac2 ~]#  /etc/init.d/init.crs start
Startup will be queued to init within 90 seconds.
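
Because the start is only queued to init, the stack comes up asynchronously, so it helps to wait before checking resource states again. The sketch below polls a health check until it succeeds or gives up. It assumes that `crsctl check crs` prints one "appears healthy" line each for CSS, CRS and EVM when the stack is up; the function name `wait_for_crs`, the `CRS_CHECK_CMD` variable, and the retry count and interval are all illustrative choices, not part of Oracle's tooling.

```shell
#!/bin/sh
# Hypothetical helper: poll a health-check command until it reports three
# "appears healthy" lines (assumed: one each for CSS, CRS and EVM) or the
# attempts run out. CRS_CHECK_CMD can be overridden for testing.
CRS_CHECK_CMD=${CRS_CHECK_CMD:-"crsctl check crs"}

wait_for_crs() {
    attempts=${1:-30}     # number of polls (illustrative default)
    interval=${2:-10}     # seconds between polls (illustrative default)
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # grep -c prints 0 and exits non-zero when nothing matches,
        # so "|| true" keeps the pipeline from failing under `set -e`.
        healthy=$($CRS_CHECK_CMD 2>/dev/null | grep -c 'appears healthy' || true)
        if [ "$healthy" -ge 3 ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# Usage on a cluster node (not run here):
#   wait_for_crs 30 10 && crs_stat -t
```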

3. After the restart, the status was still the same
[oracle@rac1 ~]$ crs_stat -t 
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    UNKNOWN   rac1        
ora....C1.lsnr application    ONLINE    UNKNOWN   rac1        
ora.rac1.gsd   application    ONLINE    UNKNOWN   rac1        
ora.rac1.ons   application    ONLINE    UNKNOWN   rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    OFFLINE   UNKNOWN   rac2        
ora....C2.lsnr application    OFFLINE   UNKNOWN   rac2        
ora.rac2.gsd   application    ONLINE    UNKNOWN   rac2        
ora.rac2.ons   application    ONLINE    UNKNOWN   rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        
ora.testora.db application    ONLINE    UNKNOWN   rac2        
ora....ataf.cs application    ONLINE    UNKNOWN   rac2        
ora....ra1.srv application    ONLINE    UNKNOWN   rac1        
ora....ra2.srv application    ONLINE    UNKNOWN   rac2        
ora....a1.inst application    ONLINE    OFFLINE               
ora....a2.inst application    ONLINE    OFFLINE     

4. Next I tried restarting a single UNKNOWN resource on its own, but that failed
[oracle@rac1 ~]$ crs_stop ora.rac1.ASM1.asm
Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`
`ora.rac1.ASM1.asm` on member `rac1` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
CRS-0216: Could not stop resource 'ora.rac1.ASM1.asm'.

5. The only remaining option was to force the restart with -f, and that succeeded
[oracle@rac1 ~]$ crs_stop -f ora.rac1.ASM1.asm
Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`
Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Target set to OFFLINE for `ora.testora.testora1.inst`
[oracle@rac1 ~]$ crs_start -f ora.rac1.ASM1.asm
Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.

6. Checking again, many of the resources that had been UNKNOWN were now ONLINE
[oracle@rac1 ~]$ crs_stat -t 
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    UNKNOWN   rac1        
ora.rac1.gsd   application    ONLINE    UNKNOWN   rac1        
ora.rac1.ons   application    ONLINE    UNKNOWN   rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        
ora.testora.db application    ONLINE    ONLINE    rac1        
ora....ataf.cs application    ONLINE    ONLINE    rac2        
ora....ra1.srv application    ONLINE    ONLINE    rac1        
ora....ra2.srv application    ONLINE    ONLINE    rac2        
ora....a1.inst application    ONLINE    ONLINE    rac1        
ora....a2.inst application    ONLINE    ONLINE    rac2  

7. Force-restart whichever resources are still UNKNOWN
[oracle@rac1 ~]$ crs_stop -f ora.rac1.LISTENER_RAC1.lsnr
Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
[oracle@rac1 ~]$ crs_stop -f ora.rac1.gsd
Attempting to stop `ora.rac1.gsd` on member `rac1`
Stop of `ora.rac1.gsd` on member `rac1` succeeded.
[oracle@rac1 ~]$ crs_stop -f ora.rac1.ons
Attempting to stop `ora.rac1.ons` on member `rac1`
Stop of `ora.rac1.ons` on member `rac1` succeeded.

[oracle@rac1 ~]$ crs_start -f ora.rac1.LISTENER_RAC1.lsnr
Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Start of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.            
[oracle@rac1 ~]$ crs_start -f ora.rac1.gsd
Attempting to start `ora.rac1.gsd` on member `rac1`
Start of `ora.rac1.gsd` on member `rac1` succeeded.
[oracle@rac1 ~]$ crs_start -f ora.rac1.ons
Attempting to start `ora.rac1.ons` on member `rac1`
Start of `ora.rac1.ons` on member `rac1` succeeded.
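
With many resources, typing each forced stop and start by hand gets tedious. The sketch below generates the commands instead of running them, so they can be reviewed first. It assumes plain `crs_stat` prints `NAME=`/`TYPE=`/`TARGET=`/`STATE=` blocks with the state at the start of the `STATE=` value; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of Oracle's tooling.

# Print the full name of every resource whose STATE value starts with UNKNOWN.
# Reads plain `crs_stat` output (NAME=/TYPE=/TARGET=/STATE= blocks) on stdin.
list_unknown_resources() {
    awk -F= '
        $1 == "NAME" { name = $2 }
        $1 == "STATE" && $2 ~ /^UNKNOWN/ { print name }
    '
}

# Dry run: read resource names on stdin and print the forced stop/start
# commands for each one without executing anything.
print_force_restart_commands() {
    while IFS= read -r res; do
        [ -n "$res" ] || continue
        printf 'crs_stop -f %s\n' "$res"
        printf 'crs_start -f %s\n' "$res"
    done
}

# Usage on a cluster node (not run here):
#   crs_stat | list_unknown_resources | print_force_restart_commands
```

Unlike the transcript above, this prints the stop and start for each resource back to back. Since restarting ASM alone cleared most of the other UNKNOWN states in this case, it is worth re-checking `crs_stat -t` after each restart rather than running the whole list blindly.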

8. In the end, everything was back to normal!
[oracle@rac1 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        
ora.testora.db application    ONLINE    ONLINE    rac1        
ora....ataf.cs application    ONLINE    ONLINE    rac2        
ora....ra1.srv application    ONLINE    ONLINE    rac1        
ora....ra2.srv application    ONLINE    ONLINE    rac2        
ora....a1.inst application    ONLINE    ONLINE    rac1        
ora....a2.inst application    ONLINE    ONLINE    rac2  

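Whatever the RAC architecture, once the clusterware has been repaired and the database itself has no problems, the clusterware brings the database up automatically.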


$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
My QQ: 1749160152
My email: hongzhuohui@kingsql.com
My Baike entry: 洪卓輝
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

From the ITPUB blog, link: http://blog.itpub.net/28389881/viewspace-1306227/. If you repost this article, please credit the source; otherwise legal action may be pursued.
