Loss of the Private Network Interface Causes an Instance Crash

Posted by yangtingkun on 2012-11-28

A customer's 10.2.0.4 RAC database hit a network fault that crashed the instance, accompanied by a large number of ORA-27300 errors.


The detailed error messages are:

Wed Nov 21 16:37:36 2012
Errors in file /u01/oracle/app/admin/orcl/udump/orcl2_ora_29173.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 10.0.1.2 not found. Check output from ifconfig command
Wed Nov 21 16:37:36 2012
Errors in file /u01/oracle/app/admin/orcl/udump/orcl2_ora_29198.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 10.0.1.2 not found. Check output from ifconfig command
Wed Nov 21 16:37:56 2012
Trace dumping is performing id=[cdmp_20121121163746]
Wed Nov 21 16:38:00 2012
ospid 28424: network interface with IP address 10.0.1.2 no longer operational
requested interface 10.0.1.2 not found. Check output from ifconfig command
Wed Nov 21 16:38:07 2012
Error: KGXGN aborts the instance (6)
Wed Nov 21 16:38:07 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_lmon_28422.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
Wed Nov 21 16:38:07 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_lms1_28430.trc:
ORA-29702: error occurred in Cluster Group Service operation
Wed Nov 21 16:38:07 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_lms3_28438.trc:
ORA-29702: error occurred in Cluster Group Service operation
.
.
.
Wed Nov 21 16:38:09 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_j000_28635.trc:
ORA-29702: error occurred in Cluster Group Service operation
ORA-29702: error occurred in Cluster Group Service operation
Wed Nov 21 16:38:09 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_mman_28450.trc:
ORA-29702: error occurred in Cluster Group Service operation
Wed Nov 21 16:38:09 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_asmb_28496.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Wed Nov 21 16:38:10 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_pmon_28416.trc:
ORA-29702: error occurred in Cluster Group Service operation
Wed Nov 21 16:38:10 2012
Errors in file /u01/oracle/app/admin/orcl/bdump/orcl2_smon_28462.trc:
ORA-29702: error occurred in Cluster Group Service operation
Wed Nov 21 17:25:50 2012
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 10.0.1.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth0 172.18.19.0 configured from OCR for use as a public interface

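Clearly, the problem that brought down the RAC node came from the operating system or hardware layer. The ORA-27504 error was caused by the OS-dependent errors ORA-27300, ORA-27301, ORA-27302 and ORA-27303. These errors state explicitly that the address of the private network interface, 10.0.1.2, could not be found; in other words, the output of the operating system's ifconfig command was no longer what Oracle expected.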
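To make that reading of the log easier to repeat, here is a minimal Python sketch that pulls the distinct ORA codes and any "requested interface ... not found" address out of an alert log excerpt. The function name `summarize_alert_log` and the regular expressions are assumptions made for illustration; they are based only on the message format shown in the log above and are not an Oracle-provided tool.

```python
import re

# Patterns inferred from the alert log excerpt above (an assumption, not an Oracle spec).
ORA_CODE = re.compile(r"^(ORA-\d{5}):")
MISSING_IF = re.compile(r"requested interface (\d{1,3}(?:\.\d{1,3}){3}) not found")


def summarize_alert_log(text):
    """Return (ORA codes in first-seen order, set of interface IPs reported missing)."""
    codes = []
    missing = set()
    for raw in text.splitlines():
        line = raw.strip()
        code = ORA_CODE.match(line)
        if code and code.group(1) not in codes:
            codes.append(code.group(1))
        iface = MISSING_IF.search(line)
        if iface:
            missing.add(iface.group(1))
    return codes, missing


# Lines copied from the alert log quoted in this post.
SAMPLE = """\
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 10.0.1.2 not found. Check output from ifconfig command
"""

codes, missing = summarize_alert_log(SAMPLE)
print("ORA codes:", codes)
print("Missing interfaces:", sorted(missing))
```

Run against the full alert log, a summary like this makes it quick to see that every failing session names the same private address.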

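Oracle's network heartbeat depends on the private network. The restart messages at the end of the log show that eth1 on 10.0.1.0 is the cluster interconnect, so 10.0.1.2 is an address on that private network. Once it disappeared, it is hardly surprising that the database node crashed.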

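Clearly this should not be counted as an Oracle bug: the error messages Oracle produced already point clearly to the cause. The key to resolving the problem is finding out why the network interface failed at the operating system level.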
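As a starting point for that OS-level investigation, the check that the ORA-27303 message asks for ("Check output from ifconfig command") can be sketched in Python. This is only an illustrative sketch: the helper names are hypothetical, the sample ifconfig text is made up for the example (it was not captured from the customer's system, and 172.18.19.10 is an invented host address), and the regular expression assumes the two common net-tools output styles, `inet addr:A.B.C.D` and `inet A.B.C.D`.

```python
import re
import subprocess

# Matches both "inet addr:10.0.1.2" (older net-tools) and "inet 10.0.1.2" (newer).
INET = re.compile(r"inet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def addresses_in_ifconfig(output):
    """Return the set of IPv4 addresses that appear in ifconfig-style output."""
    return set(INET.findall(output))


def interface_present(output, ip):
    """True if the given IPv4 address is configured according to the output."""
    return ip in addresses_in_ifconfig(output)


def live_ifconfig_output():
    """Run ifconfig on the local host (assumes the command is installed and on PATH)."""
    return subprocess.run(["ifconfig"], capture_output=True, text=True).stdout


# Hypothetical output for a node whose private interface has gone missing.
SAMPLE_IFCONFIG = """\
eth0      Link encap:Ethernet  HWaddr 00:00:00:00:00:01
          inet addr:172.18.19.10  Bcast:172.18.19.255  Mask:255.255.255.0
          inet6 addr: fe80::1/64 Scope:Link
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
"""

for ip in ("10.0.1.2", "172.18.19.10"):
    state = "present" if interface_present(SAMPLE_IFCONFIG, ip) else "NOT FOUND"
    print(ip, state)
```

On a real node, passing `live_ifconfig_output()` instead of the sample text and running the check periodically would record when the private address disappears, which can then be correlated with system logs from the OS or network hardware.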

From "ITPUB Blog", link: http://blog.itpub.net/4227/viewspace-1060815/. If you reproduce this article, please credit the source; otherwise legal liability will be pursued.
