Oracle 11g RAC fails to recognize ASM at startup

Posted by super_sky on 2014-01-23

Environment: RHEL 5.5 + Oracle 11g RAC

A customer reported that after shutting down the cluster and starting it again, CRS would not start. The error was "Cannot communicate with Cluster Ready Services".

Log in to the host and check the CRS status:

[root@rac-2 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
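
The same check can be scripted. The following is a minimal sketch, assuming the `crsctl check crs` output format shown above; the function name `offline_components` is hypothetical, and the sample text is fed in from a variable so the snippet runs without a cluster.

```shell
# Hypothetical helper: read `crsctl check crs` output on stdin and print
# every CRS-nnnn line that does not report its component as online.
offline_components() {
    grep -E '^CRS-[0-9]+:' | grep -v 'is online'
}

# Sample input copied from the session above. On a real node one would
# run `crsctl check crs | offline_components` instead.
sample='CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online'

printf '%s\n' "$sample" | offline_components
```

For the sample above, only the CRS-4535 line is printed, which matches the symptom: ohasd, cssd and evmd are up, but crsd is not.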

Check the RAC alert log:

[grid@rac-2 rac-2]$ tail -100 alertrac-2.log | more

2014-01-23 03:16:17.396
[ohasd(3899)]CRS-2765:Resource 'ora.crsd' has failed on server 'rac-2'.
2014-01-23 03:16:18.697
[crsd(22676)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u02/11.2.0/grid/log/rac-2/crsd/crsd.log.
2014-01-23 03:16:18.704
[crsd(22676)]CRS-0804:Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup
]. Details at (:CRSD00111:) in /u02/11.2.0/grid/log/rac-2/crsd/crsd.log.
2014-01-23 03:16:19.433
[ohasd(3899)]CRS-2765:Resource 'ora.crsd' has failed on server 'rac-2'.
2014-01-23 03:16:20.737
[crsd(22685)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u02/11.2.0/grid/log/rac-2/crsd/crsd.log.
2014-01-23 03:16:20.747
[crsd(22685)]CRS-0804:Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup
]. Details at (:CRSD00111:) in /u02/11.2.0/grid/log/rac-2/crsd/crsd.log.
2014-01-23 03:16:21.473
[ohasd(3899)]CRS-2765:Resource 'ora.crsd' has failed on server 'rac-2'.
2014-01-23 03:16:21.473
[ohasd(3899)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.

Check crsd.log:

2014-01-23 03:16:20.461: [ CRSMAIN][1106286912] Policy Engine is not initialized yet!
2014-01-23 03:16:20.463: [ CRSMAIN][3556262304] Initializing OCR
[   CLWAL][3556262304]clsw_Initialize: OLR initlevel [70000]
2014-01-23 03:16:20.735: [  OCRASM][3556262304]proprasmo: Error in open/create file in dg [ORC_VOTE]
[  OCRASM][3556262304]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge

2014-01-23 03:16:20.735: [  OCRASM][3556262304]ASM Error Stack : ORA-15077: could not locate ASM instance serving a required diskgroup

2014-01-23 03:16:20.737: [  OCRASM][3556262304]proprasmo: kgfoCheckMount returned [7]
2014-01-23 03:16:20.737: [  OCRASM][3556262304]proprasmo: The ASM instance is down
2014-01-23 03:16:20.738: [  OCRRAW][3556262304]proprioo: Failed to open [+ORC_VOTE]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2014-01-23 03:16:20.738: [  OCRRAW][3556262304]proprioo: No OCR/OLR devices are usable
2014-01-23 03:16:20.738: [  OCRASM][3556262304]proprasmcl: asmhandle is NULL
2014-01-23 03:16:20.738: [    GIPC][3556262304] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 690], original from [clsss.c : 5326]
2014-01-23 03:16:20.740: [ default][3556262304]clsvactversion:4: Retrieving Active Version from local storage.
2014-01-23 03:16:20.742: [  OCRRAW][3556262304]proprrepauto: The local OCR configuration matches with the configuration published by OCR Cache Writer. No repair required.
2014-01-23 03:16:20.745: [  OCRRAW][3556262304]proprinit: Could not open raw device
2014-01-23 03:16:20.745: [  OCRASM][3556262304]proprasmcl: asmhandle is NULL
2014-01-23 03:16:20.746: [  OCRAPI][3556262304]a_init:16!: Backend init unsuccessful : [26]
2014-01-23 03:16:20.747: [  CRSOCR][3556262304] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

2014-01-23 03:16:20.748: [ CRSMAIN][3556262304] Created alert : (:CRSD00111:) :  Could not init OCR, error: PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

2014-01-23 03:16:20.748: [    CRSD][3556262304][PANIC] CRSD exiting: Could not init OCR, code: 26
2014-01-23 03:16:20.748: [    CRSD][3556262304] Done.
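
In long logs like this it can help to reduce the output to the distinct error codes first. A minimal sketch, assuming codes follow the `ORA-nnnnn`, `PROC-nn` and `CRS-nnnn` patterns seen above; the function name `error_codes` and the variable `sample_log` are hypothetical.

```shell
# Hypothetical helper: read a log on stdin and print each distinct
# ORA-/PROC-/CRS- error code once, sorted.
error_codes() {
    grep -oE '(ORA|PROC|CRS)-[0-9]+' | sort -u
}

# Two lines copied from the crsd.log excerpt above. On a real node one
# would run `error_codes < crsd.log` against the file named in the alert log.
sample_log='2014-01-23 03:16:20.747: [  CRSOCR][3556262304] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup'

printf '%s\n' "$sample_log" | error_codes
```

For this excerpt the result is ORA-15077 and PROC-26: crsd cannot open the OCR because it cannot find an ASM instance serving the disk group.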

The errors (ORA-15077 and "The ASM instance is down") point to ASM, so check the ASM disks:

[root@rac-2 ~]# /etc/init.d/oracleasm listdisks
ASMDATA01
ASMDATA02
ASMDATA03
OCR_VOTE

The disks are present.
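
The presence check can be made explicit rather than done by eye. A minimal sketch, assuming the one-label-per-line output of `oracleasm listdisks` shown above; the function name `has_asm_disk` is hypothetical. Note that ASMLib disk labels and ASM disk group names are separate, so the label `OCR_VOTE` need not match the disk group `ORC_VOTE` named in crsd.log.

```shell
# Hypothetical helper: succeed if the label given as $1 appears as a
# whole line in the `oracleasm listdisks` output read from stdin.
has_asm_disk() {
    grep -qxF -- "$1"
}

# Labels copied from the session above. On a real node one would pipe
# `/etc/init.d/oracleasm listdisks` into the function instead.
labels='ASMDATA01
ASMDATA02
ASMDATA03
OCR_VOTE'

for want in ASMDATA01 OCR_VOTE; do
    if printf '%s\n' "$labels" | has_asm_disk "$want"; then
        echo "$want: present"
    else
        echo "$want: MISSING"
    fi
done
```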

After stopping CRS, check the CRS-related processes:

[root@rac-2 ~]# ps -ef | grep d.bin
root      3899     1  0 Jan13 ?        00:18:59 /u02/11.2.0/grid/bin/ohasd.bin reboot
grid      4267     1  0 Jan13 ?        00:34:32 /u02/11.2.0/grid/bin/oraagent.bin
grid      4280     1  0 Jan13 ?        00:00:16 /u02/11.2.0/grid/bin/mdnsd.bin
grid      4293     1  0 Jan13 ?        00:06:10 /u02/11.2.0/grid/bin/gpnpd.bin
root      4304     1  0 Jan13 ?        01:31:25 /u02/11.2.0/grid/bin/orarootagent.bin
grid      4307     1  0 Jan13 ?        00:27:27 /u02/11.2.0/grid/bin/gipcd.bin
root      4322     1  0 Jan13 ?        00:45:33 /u02/11.2.0/grid/bin/osysmond.bin
root      4332     1  0 Jan13 ?        00:01:24 /u02/11.2.0/grid/bin/cssdmonitor
root      4350     1  0 Jan13 ?        00:02:39 /u02/11.2.0/grid/bin/cssdagent
grid      4362     1  0 Jan13 ?        01:45:38 /u02/11.2.0/grid/bin/ocssd.bin
root      4437     1  0 Jan13 ?        00:28:42 /u02/11.2.0/grid/bin/octssd.bin reboot
grid      4461     1  0 Jan13 ?        00:00:22 /u02/11.2.0/grid/bin/evmd.bin
grid      4843  4461  0 Jan13 ?        00:00:00 /u02/11.2.0/grid/bin/evmlogger.bin -o /u02/11.2.0/grid/evm/log/evmlogger.info -l /u02/11.2.0/grid/evm/log/evmlogger.log
root      4941     1  0 Jan13 ?        00:21:18 /u02/11.2.0/grid/bin/ologgerd -m rac-1 -r -d /u02/11.2.0/grid/crf/db/rac-2
root     23122 22979  0 03:54 pts/3    00:00:00 grep d.bin

CRS has been stopped, but many of its processes were not released. Kill them manually:

[root@rac-2 ~]# ps -ef | grep d.bin | awk '{print $2}' | xargs kill -9
kill 23131: No such process
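
The "No such process" message most likely comes from the pipeline matching its own `grep` command, whose PID no longer exists by the time `kill` runs. Also, the unescaped `.` in `d.bin` matches any character, so the pattern matches the `grid/bin` part of every path, not only the `*d.bin` daemons. A minimal sketch of a stricter filter, assuming the `ps -ef` column layout shown above; the function name `grid_pids` is hypothetical.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (column 2) of every process whose command (column 8) starts with the
# directory given as $1. The grep/awk processes of the pipeline are never
# selected, because their command does not start with that directory.
grid_pids() {
    awk -v dir="$1" 'index($8, dir) == 1 { print $2 }'
}

# Three lines copied from the listing above, including the grep line.
ps_sample='root      3899     1  0 Jan13 ?        00:18:59 /u02/11.2.0/grid/bin/ohasd.bin reboot
grid      4362     1  0 Jan13 ?        01:45:38 /u02/11.2.0/grid/bin/ocssd.bin
root     23122 22979  0 03:54 pts/3    00:00:00 grep d.bin'

printf '%s\n' "$ps_sample" | grid_pids /u02/11.2.0/grid/bin/

# On a real node (assumption: GNU xargs, as shipped with RHEL):
#   ps -ef | grid_pids /u02/11.2.0/grid/bin/ | xargs -r kill -9
```

Only the two Grid Infrastructure PIDs are printed for the sample; the `grep` line is skipped. Since `kill -9` gives the daemons no chance to clean up, a forced stop with `crsctl stop crs -f` is usually worth trying first.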

Restart CRS, and the problem is resolved.

Source: "ITPUB Blog", link: http://blog.itpub.net/11590946/viewspace-1074513/. If you repost this article, please credit the source; otherwise legal action may be taken.
