ORA-7445 (dbgrlWriteAlertDetail_int) and ORA-4030 Cause an Instance Crash

Posted by yangtingkun on 2012-05-13

On a customer's 11.2.0.2 RAC database on Solaris 10 SPARC, one instance hit ORA-7445 and ORA-4030 errors, which caused that instance to crash.

The errors are fairly serious. The alert log shows:

2012-05-04 02:02:26.403000 +08:00
Archived Log entry 949 added for thread 1 sequence 518 ID 0x70a64e83 dest 1:
Archived Log entry 950 added for thread 1 sequence 519 ID 0x70a64e83 dest 1:
Thread 1 advanced to log sequence 521 (after internal thread enable)
Thread 2 opened at log sequence 441
Current log# 4 seq# 441 mem# 0: /orcldata1/orcl/redo04.log
Current log# 4 seq# 441 mem# 1: /orcldata2/orcl/redo04.log
Successful open of redo thread 2
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
2012-05-04 02:02:27.570000 +08:00
[374] Successfully onlined Undo Tablespace 5.
Undo initialization finished serial:0 start:903149023 end:903149480 diff:457 (4 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Redo thread 1 internally disabled at seq 521 (CKPT)
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Archived Log entry 951 added for thread 1 sequence 520 ID 0x70a64e83 dest 1:
ARC3: Archiving disabled thread 1 sequence 521
Archived Log entry 952 added for thread 1 sequence 521 ID 0x70a64e83 dest 1:
No Resource Manager plan active
minact-scn: Inst 2 is now the master inc#:2 mmon proc-id:326 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
2012-05-04 02:02:29.806000 +08:00
Starting background process GTX0
GTX0 started with pid=72, OS id=660
Starting background process RCBG
RCBG started with pid=73, OS id=662
replication_dependency_tracking turned off (no async multimaster replication found)
Thread 2 advanced to log sequence 442 (LGWR switch)
Current log# 3 seq# 442 mem# 0: /orcldata1/orcl/redo03.log
Current log# 3 seq# 442 mem# 1: /orcldata2/orcl/redo03.log
Archived Log entry 953 added for thread 2 sequence 441 ID 0x70a64e83 dest 1:
2012-05-04 02:02:31.733000 +08:00
Starting background process QMNC
2012-05-04 02:02:37.268000 +08:00
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xFFFFFFFF7FFF5EF8] [PC:0x108068724, dbgrlWriteAlertDetail_int()+132] [flags: 0x0, count: 1]
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xFFFFFFFF7FFF5FF4] [PC:0xFFFFFFFF7BD00F34, _memset()+52] [flags: 0x0, count: 1]
2012-05-04 02:02:38.594000 +08:00
Errors in file /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_m000_666.trc (incident=156594):
ORA-07445: exception encountered: core dump [dbgrlWriteAlertDetail_int()+132] [SIGSEGV] [ADDR:0xFFFFFFFF7FFF5EF8] [PC:0x108068724] [Address not mapped to object] []
ORA-04030: out of process memory when trying to allocate 67108896 bytes (qesmmCheckPgaL,qesmmCheckPgaLimit:mem)
Incident details in: /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_156594/orcl2_m000_666_i156594.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_psp0_266.trc (incident=156026):
ORA-07445: exception encountered: core dump [_memset()+52] [SIGSEGV] [ADDR:0xFFFFFFFF7FFF5FF4] [PC:0xFFFFFFFF7BD00F34] [Address not mapped to object] []
Incident details in: /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_156026/orcl2_psp0_266_i156026.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2012-05-04 02:02:49.732000 +08:00
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xFFFFFFFF7FFF2000] [PC:0x107D04724, dbgemdGetCallStackWFlag()+100] [flags: 0x0, count: 1]
Errors in file /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_dia0_282.trc:
ORA-07445: exception encountered: core dump [dbgemdGetCallStackWFlag()+100] [SIGSEGV] [ADDR:0xFFFFFFFF7FFF2000] [PC:0x107D04724] [Address not mapped to object] []
ORA-04030: out of process memory when trying to allocate 816 bytes (ksdhngmemctx_h,ksdhng:enod)
2012-05-04 02:02:55.200000 +08:00
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xFFFFFFFF7FFF2000] [PC:0x107D04724, dbgemdGetCallStackWFlag()+100] [flags: 0x0, count: 1]
Errors in file /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_274.trc:
ORA-07445: exception encountered: core dump [dbgemdGetCallStackWFlag()+100] [SIGSEGV] [ADDR:0xFFFFFFFF7FFF2000] [PC:0x107D04724] [Address not mapped to object] []
ORA-04030: out of process memory when trying to allocate 32128 bytes (pga heap,grpsvc msg)
2012-05-04 02:02:58.724000 +08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2012-05-04 02:04:01.919000 +08:00
PMON (ospid: 264): terminating the instance due to error 490
2012-05-04 02:04:05.129000 +08:00
Instance terminated by PMON, pid = 264

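Before the database startup had even completed, the instance hit ORA-7445 [dbgrlWriteAlertDetail_int], followed by ORA-04030, then ORA-07445 [_memset], and finally ORA-7445 [dbgemdGetCallStackWFlag]. This series of errors ultimately led PMON to terminate the database instance.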
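
To make the order of failures easier to see, the ORA- lines can be pulled out of the alert log in sequence. The following is a minimal Python sketch, not part of the original diagnosis: the function name `summarize_ora_errors` and the regular expression are illustrative assumptions, and the sample text is a handful of lines taken from the alert log above with the wrapped lines joined.

```python
import re

# A few lines from the alert log above (timestamps omitted, wrapped lines joined).
ALERT_LOG = """\
Errors in file /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_m000_666.trc (incident=156594):
ORA-07445: exception encountered: core dump [dbgrlWriteAlertDetail_int()+132] [SIGSEGV] [ADDR:0xFFFFFFFF7FFF5EF8] [PC:0x108068724] [Address not mapped to object] []
ORA-04030: out of process memory when trying to allocate 67108896 bytes (qesmmCheckPgaL,qesmmCheckPgaLimit:mem)
Errors in file /opt/oracle/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_psp0_266.trc (incident=156026):
ORA-07445: exception encountered: core dump [_memset()+52] [SIGSEGV] [ADDR:0xFFFFFFFF7FFF5FF4] [PC:0xFFFFFFFF7BD00F34] [Address not mapped to object] []
ORA-07445: exception encountered: core dump [dbgemdGetCallStackWFlag()+100] [SIGSEGV] [ADDR:0xFFFFFFFF7FFF2000] [PC:0x107D04724] [Address not mapped to object] []
ORA-04030: out of process memory when trying to allocate 816 bytes (ksdhngmemctx_h,ksdhng:enod)
"""

# Hypothetical pattern: captures the ORA code plus either the failing
# function (for core dumps) or the requested size (for ORA-04030).
ORA_LINE = re.compile(
    r"^(ORA-\d{5}):.*?(?:core dump \[([^\]]+)\]|allocate (\d+) bytes)"
)


def summarize_ora_errors(text):
    """Return (code, detail) tuples in the order they appear in the log."""
    events = []
    for line in text.splitlines():
        match = ORA_LINE.match(line)
        if match:
            code, func, nbytes = match.groups()
            detail = func if func is not None else int(nbytes)
            events.append((code, detail))
    return events


if __name__ == "__main__":
    for code, detail in summarize_ora_errors(ALERT_LOG):
        print(code, detail)
```

Lines that begin with ORA- but match neither form are simply skipped, so the sketch would need extending for other error types.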

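The error messages point to memory allocation. But the database had only just started, so how could it fail to allocate even 67 MB? A search on MOS shows that the cause is exhausted swap space. For details, see Instance crash ORA-7445 [_memset()+120] and ORA-4030 (QERHJ hash-joi,kllcqas:kllsltba) [ID 1071033.1].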
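
As a quick check of the "67 MB" figure, the byte count in the first ORA-04030 message can be split into whole MiB plus a remainder. This is only a small arithmetic sketch; the helper name `split_mib` is made up for illustration.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def split_mib(nbytes):
    """Split a byte count into (whole MiB, leftover bytes)."""
    return divmod(nbytes, MIB)


# Size taken from the first ORA-04030 message in the alert log.
requested = 67108896
whole_mib, leftover = split_mib(requested)
print(f"{requested} bytes = {whole_mib} MiB + {leftover} bytes")
```

So the failed request is a 64 MiB allocation plus a small 32-byte overhead, which is about 67 MB in decimal units.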

Checking the system messages log:

May 4 02:02:14 orcl2 Had[5187]: [ID 702911 daemon.notice] VCS CRITICAL V-16-1-50086 Swap usage on orcl2 is 97%
May 4 02:02:33 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 666 (oracle)
May 4 02:02:33 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 674 (oracle)
May 4 02:02:33 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 668 (oracle)
May 4 02:02:34 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 266 (oracle)
May 4 02:02:35 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 666 (oracle)
May 4 02:02:35 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 668 (oracle)
May 4 02:02:35 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 674 (oracle)
May 4 02:02:35 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 266 (oracle)
May 4 02:02:36 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 666 (oracle)
May 4 02:02:36 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 674 (oracle)
May 4 02:02:36 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 668 (oracle)
May 4 02:02:36 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 266 (oracle)
May 4 02:02:37 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 666 (oracle)
May 4 02:02:37 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 668 (oracle)

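Sure enough, the log contains a large number of warnings about insufficient swap space. On Solaris, /tmp is a tmpfs file system backed by swap, so files left in /tmp consume swap space. After cleaning up /tmp and restarting the database, the problem did not recur.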
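
The PIDs in the swap warnings line up with the processes that dumped core: pid 666 and pid 266 also appear in the trace file names orcl2_m000_666.trc and orcl2_psp0_266.trc. The sketch below shows one way to count the warnings per PID and match them to trace files. It assumes that the trailing number in a trace file name is the OS process ID; the function names are hypothetical, and the sample is only the first few lines of the messages log above.

```python
import re
from collections import Counter

# First few lines of the messages log shown above.
MESSAGES = """\
May 4 02:02:14 orcl2 Had[5187]: [ID 702911 daemon.notice] VCS CRITICAL V-16-1-50086 Swap usage on orcl2 is 97%
May 4 02:02:33 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 666 (oracle)
May 4 02:02:33 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 674 (oracle)
May 4 02:02:33 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 668 (oracle)
May 4 02:02:34 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 266 (oracle)
May 4 02:02:35 orcl2 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 666 (oracle)
"""

# Trace file names that appear in the alert log.
TRACE_FILES = [
    "orcl2_m000_666.trc",
    "orcl2_psp0_266.trc",
    "orcl2_dia0_282.trc",
    "orcl2_diag_274.trc",
]

SWAP_WARNING = re.compile(r"no swap space to grow stack for pid (\d+)")
# Assumption: names follow <sid>_<process>_<ospid>.trc
TRACE_NAME = re.compile(r"^(?P<sid>[^_]+)_(?P<proc>[^_]+)_(?P<ospid>\d+)\.trc$")


def count_swap_warnings(text):
    """Count 'no swap space' warnings per OS process ID."""
    return Counter(int(m.group(1)) for m in SWAP_WARNING.finditer(text))


def processes_by_pid(trace_files):
    """Map OS process ID to the process name embedded in the trace file name."""
    result = {}
    for name in trace_files:
        match = TRACE_NAME.match(name)
        if match:
            result[int(match.group("ospid"))] = match.group("proc")
    return result


def correlate(text, trace_files):
    """Return {pid: (warning count, process name or None)}."""
    names = processes_by_pid(trace_files)
    return {
        pid: (count, names.get(pid))
        for pid, count in count_swap_warnings(text).items()
    }


if __name__ == "__main__":
    for pid, (count, proc) in sorted(correlate(MESSAGES, TRACE_FILES).items()):
        print(pid, count, proc or "(no trace file in the alert log)")
```

PIDs with no matching trace file, such as 668 and 674 here, are reported with `None` as the process name.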

Source: ITPUB blog, link: http://blog.itpub.net/4227/viewspace-730022/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
