一次資料庫HANG處理
還是幫別人處理的,不知道我為什麼運氣好,資料庫就是不宕
他的資料庫HANG住了,SQLPLUS無法登陸,一直HANG在那裡
一般情況下,10g處理方法應該為sqlplus -prelim / as sysdba 連線資料庫,注意這種方式連線資料庫,是無法關閉資料庫的,只能進行SYSTEMSTATE
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 266
SQL> oradebug tracefile_name
獲得TRC檔案後,可以用ass109.awk來格式化下看看
System State 1
~~~~~~~~~~~~~~~~
1:
2: last wait for 'cursor: mutex X'
3: waiting for 'rdbms ipc message' wait
4: waiting for 'rdbms ipc message' wait
5: waiting for 'rdbms ipc message' wait
6: waiting for 'rdbms ipc message' wait
7: waiting for 'rdbms ipc message' wait
8: waiting for 'latch: library cache' [Latch 7000001302e0170] wait
9: last wait for 'rdbms ipc message'
10: last wait for 'latch: library cache'[Latch 7000001302e0030]
11: waiting for 'latch: library cache' [Latch 7000001302e0170] wait
12: waiting for 'latch: library cache' [Latch 7000001302e0170] wait
13:
14:
15: waiting for 'Streams AQ: qmn slave idle wait' wait
16: waiting for 'rdbms ipc message' wait
17: waiting for 'rdbms ipc message' wait
18: last wait for 'latch: library cache'[Latch 7000001302e0350]
19: for 'Streams AQ: waiting for time management or cleanup tasks' wait
20: waiting for 'Streams AQ: qmn coordinator idle wait' wait
21: waiting for 'SQL*Net message from client' wait
22: [Latch 7000001302e0030]
23: waiting for 'SQL*Net message from client' wait
24: [Latch 7000001302e0030]
25: waiting for 'SQL*Net message from client' wait
26: waiting for 'SQL*Net message from client' wait
27: [Latch 7000000100e5020]
28: waiting for 'latch: library cache' [Latch 7000001302e0030] wait
29: waiting for 'latch: library cache' [Latch 7000001302e0350] wait
30: waiting for 'latch: library cache' [Latch 7000001302e0170] wait
31: last wait for 'latch: library cache'[Latch 7000001302e03f0]
32: waiting for 'SQL*Net message from client' wait
33: last wait for 'latch: library cache'[Latch 7000001302e0030]
34: last wait for 'latch: library cache'[Latch 7000001302e0030]
35: waiting for 'latch: library cache' [Latch 7000001302e0170] wait
36: waiting for 'SQL*Net message from client' wait
................................
735: [Latch 70000001000c240]
736: [Latch 70000001000c240]
737: [Latch 70000001000c240]
738: [Latch 70000001000c240]
739: [Latch 70000001000c240]
740: [Latch 70000001000c240]
Blockers
~~~~~~~~
Above is a list of all the processes. If they are waiting for a resource
then it will be given in square brackets. Below is a summary of the
waited upon resources, together with the holder of that resource.
Notes:
~~~~~
o A process id of '???' implies that the holder was not found in the
systemstate.
Resource Holder State
Latch 7000001302e0170 ??? Blocker
Latch 7000001302e0030 ??? Blocker
Latch 7000001302e0350 ??? Blocker
Latch 7000000100e5020 ??? Blocker
Latch 7000001302e03f0 ??? Blocker
Latch 7000000100e50c0 302: Self-Deadlock
Latch 700000010020f20 ??? Blocker
Latch 70000001000c240 660: 660: is waiting for 660: 699:
Latch 70000001000c240 699: 699: is waiting for 660: 699:
Object Names
~~~~~~~~~~~~
Latch 7000001302e0170 Child library cache
Latch 7000001302e0030 Child library cache
Latch 7000001302e0350 Child library cache
Latch 7000000100e5020 Child shared pool
Latch 7000001302e03f0 Child library cache
Latch 7000000100e50c0 Child shared pool
Latch 700000010020f20 user lock
Latch 70000001000c240 holding (efd=3) 70000001000c240 OS proces
首先302的Self-Deadlock肯定是有問題的,其持有的LATCH無法釋放,那麼在SYSTEMSTATE中找 PROCESS 302 看看
PROCESS 302:
----------------------------------------
SO: 700000138541e28, type: 2, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=302, calls cur/top: 0/7000000746b9fd8, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=0
Location from where call was made: kghalo:
waiting for 7000000100e50c0 Child shared pool level=7 child#=2
Location from where latch is held: kgh: add extent to reserved list:
Context saved from call: 0
state=busy, wlstate=free
waiters [orapid (seconds since: put on list, posted, alive check)]:
97 (936, 1258509061, 2)
46 (936, 1258509061, 2)
82 (933, 1258509061, 2)
114 (930, 1258509061, 2)
108 (930, 1258509061, 2)
127 (930, 1258509061, 2)
....(100的個,這裡省略掉)
636 (495, 1258509061, 2)
633 (495, 1258509061, 2)
628 (495, 1258509061, 2)
634 (495, 1258509061, 2)
632 (495, 1258509061, 2)
640 (486, 1258509061, 2)
641 (486, 1258509061, 2)
642 (486, 1258509061, 2)
waiter count=262
gotten 728126853 times wait, failed first 151722 sleeps 66474
gotten 0 times nowait, failed: 0
possible holder pid = 92 spid=929882
on wait list for 7000000100e50c0
acquiring 7000000100e50c0
這裡可以看到,其在等待7000000100e50c0,但是其卻持有7000000100e50c0,自己把自己搞死掉了
而且,還有大量的SESSION在7000000100e50c0這個LATCH上等待,有100多個
找了個SQLPLUS進去的程式看,確認其也在等待該LATCH
找到了SQLPLUS HANG住的PROCESS 738,確定其在等待70000001000c240
PROCESS 738:
----------------------------------------
SO: 7000001385ad2e8, type: 2, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=738, calls cur/top: 0/0, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=0
Location from where call was made: kso_new_process:
waiting for 70000001000c240 OS process: request allocation level=6
Location from where latch is held: kso_new_process:
Context saved from call: 0
state=busy, wlstate=free
waiters [orapid (seconds since: put on list, posted, alive check)]:
643 (523, 1258509101, 3)
644 (523, 1258509101, 0)
645 (523, 1258509101, 3)
該LATCH的持有者為PROCESS 642
PROCESS 642:
----------------------------------------
SO: 7000001385958e8, type: 2, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=642, calls cur/top: 0/0, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=40
Location from where call was made: kghalo:
waiting for 7000000100e50c0 Child shared pool level=7 child#=2
Location from where latch is held: kgh: add extent to reserved list:
Context saved from call: 0
state=busy, wlstate=free
waiters [orapid (seconds since: put on list, posted, alive check)]:
97 (970, 1258509095, 3)
46 (970, 1258509095, 0)
82 (967, 1258509095, 3)
114 (964, 1258509095, 3)
108 (964, 1258509095, 3)
......
gotten 728126853 times wait, failed first 151722 sleeps 66474
gotten 0 times nowait, failed: 0
possible holder pid = 92 spid=929882
on wait list for 7000000100e50c0
holding (efd=3) 70000001000c240 OS process: request allocation level=6
Location from where latch is held: kso_new_process:
Context saved from call: 0
state=busy, wlstate=free
waiters [orapid (seconds since: put on list, posted, alive check)]:
643 (517, 1258509095, 0)
644 (517, 1258509095, 0)
645 (517, 1258509095, 0)
該程式也在等待 7000000100e50c0
所以處理方法也很明顯了,殺掉PROCESS 302,並附帶殺掉660,那個程式也死鎖了。還有那些Holder為???的程式
不過,做HANGANALYZE竟然沒有發現這個LATCH上的死鎖,真奇怪
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/8242091/viewspace-619991/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 一次資料庫hang的處理資料庫
- 資料庫hang住,分析處理資料庫
- 記一次Oracle資料庫無響應(hang住)故障的處理Oracle資料庫
- 9i資料庫hang死分析處理資料庫
- 記一次資料庫hang住了資料庫
- 記一次:歸檔檔案系統問題導致資料庫hang處理資料庫
- 一次ORACLE資料庫undo壞塊處理Oracle資料庫
- 一次OWB資料庫效能問題處理資料庫
- 資料庫HANG著資料庫
- ORACLE資料庫壞塊的處理 (一次壞快處理過程)Oracle資料庫
- 一次資料庫異常的處理過程資料庫
- 資料庫突然hang了資料庫
- 一次資料庫不能歸檔問題的處理資料庫
- 記一次 MySQL 資料庫單表恢復事故處理MySql資料庫
- 記一次資料庫高CPU佔用率處理過程資料庫
- 【轉】 一次資料庫不能歸檔問題的處理資料庫
- 一次客戶資料庫CPU 100%處理過程資料庫
- 【Oracle】資料庫hang 診斷Oracle資料庫
- APM RUEI processor處理程式hang死處理方法
- 資料庫壞塊處理資料庫
- zabbix資料庫日常處理資料庫
- java處理資料庫date型別資料Java資料庫型別
- 資料庫如何處理大資料訪問資料庫大資料
- 一次資料庫崩潰處理事件資料庫事件
- 導數時資料庫hang住分析資料庫
- 資料庫異常hang住解決資料庫
- 資料庫cpu高處理一則資料庫
- oradebug處理DB hang情況
- os thread startup等待事件HANG處理thread事件
- sqlplus 無響應 hang處理SQL
- 記一次Linux redhat 4.7下Oracle10.1.0.3資料庫故障處理LinuxRedhatOracle資料庫
- 一次資料庫無法啟動問題的處理-ORA-00845資料庫
- 資料庫Hang住怎麼辦 - HANGANALYZE資料庫
- 資料庫hang著時採取辦法資料庫
- 儲存壞道造成資料庫hang住資料庫
- 用SQL Server資料庫處理資料層錯誤SQLServer資料庫
- 誤刪資料庫資料檔案的處理方法資料庫
- Python資料處理(二):處理 Excel 資料PythonExcel