一次資料庫hang的處理

zhyuh發表於2010-06-21
資料庫hang住的時候,應用自然是掛了,連sysdba都無法登陸。只能啟用sqlplus的-prelim引數。[@more@]

sqlplus -prelim "/as sysdba"
oradebug setmypid
oradebug unlimit
oradebug hanganalyze 3
--90 sec
oradebug hanganalyze 3
exit

sqlplus -prelim "/as sysdba"
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 10
--wait 90 seconds
oradebug dump systemstate 10
--wait 90 seconds
oradebug dump systemstate 10
exit

找到hanganalyze的trace檔案,有如下資訊:
Found 12 objects waiting for
<0/132/62418/0x77005770/3398/No Wait>
Open chains found:
Chain 1 : :
<0/132/62418/0x77005770/3398/No Wait>
-- <0/34/35823/0x37008ec8/3587/latch: library cache>
Other chains found:
......
......

OS的3398程式是個可疑的程式,systemstate的trace檔案驗證了該程式造成了阻賽:
............
Location from where call was made: kglScanDependency:
Context saved from call: 4
waiting for 3d3f458e8 Child library cache level=5 child#=9
Location from where latch is held: kglic: child
Context saved from call: 4
state=busy, wlstate=free
waiters [orapid (seconds since: put on list, posted, alive check)]:
16 (643, 1275037904, 643)
22 (627, 1275037904, 627)
26 (613, 1275037904, 613)
56 (574, 1275037904, 574)
30 (480, 1275037904, 480)
44 (423, 1275037904, 423)
17 (420, 1275037904, 420)
37 (408, 1275037904, 408)
52 (381, 1275037904, 381)
39 (351, 1275037904, 351)
59 (345, 1275037904, 286)
40 (318, 1275037904, 318)
80 (258, 1275037904, 258)
72 (237, 1275037904, 237)
14 (123, 1275037904, 123)
62 (84, 1275037904, 84)
13 (48, 1275037904, 48)
42 (19, 1275037904, 19)
waiter count=18
gotten 3142055352 times wait, failed first 786017316 sleeps 3103952
gotten 449701 times nowait, failed: 339338
possible holder pid = 36 ospid=3398
on wait list for 3d3f458e8
............

因為資料庫掛住了,要從資料庫裡面獲取該程式的詳細資訊,只能再做dump。
sqlplus -prelim "/as sysdba"
oradebug setospid 3398
oradebug unlimit
oradebug dump events 1
從這個trace裡發現該程式是I3檢測工具的一個會話,正在訪問v$sql_bind_capture檢視。I3是個效能監控的工具,其間也需要從各個檢視收集效能資訊。
後面處理的步驟略。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/207/viewspace-1034528/,如需轉載,請註明出處,否則將追究法律責任。

相關文章