徹底搞清楚library cache lock的成因和解決方法(5)

jss001發表於2009-02-18

問題的成因已經基本上明確了,這裡推薦兩種解決問題的方法:
方法1,根據 c000000122e2a6d8 地址,我們可以得到當前在library cache中相應的鎖資訊:
SQL> l
1 select INST_ID,USER_NAME,KGLNAOBJ,KGLLKSNM,KGLLKUSE,KGLLKSES,KGLLKMOD,KGLLKREQ,KGLLKPNS,KGLLKHDL
2* from X$KGLLK where KGLLKHDL = 'C000000122E2A6D8' order by KGLLKSNM,KGLNAOBJ
SQL> /

INST_ID USER_NAME KGLNAOBJ KGLLKSNM KGLLKUSE KGLLKSES KGLLKMOD KGLLKREQ KGLLKPNS KGLLKHDL
---------- ------------- ---------------------- ---------- ---------------- ---------------- ---------- ---------- ---------------- ----------------
2 PUBUSER CSNOZ629926699966 30 C000000109F02C68 C000000109F02C68 0 2 00 C000000122E2A6D8
2 PUBUSER CSNOZ629926699966 37 C000000108C99E28 C000000108C99E28 3 0 00 C000000122E2A6D8

SQL>

按照Oracle推薦的做法,我們現在應該使用'alter system kill session'命令kill掉SID 37,結果得到了ORA-00031錯誤:
SQL> alter system kill session '37,2707';

alter system kill session '37,2707'
*
ERROR at line 1:
ORA-00031: session marked for kill

SQL>

檢查SID 37的狀態:
SQL> set linesize 150
SQL> col program for a50
SQL> select sid,serial#,status,username,program from v$session where sid=37;

SID SERIAL# STATUS USERNAME PROGRAM
---------- ---------- -------- ------------------------------ --------------------------------------------------
37 2707 KILLED PUBUSER (TNS V1-V3)

SQL>
再次證實了我們最初的想法—— 有人在執行了某個需要執行很久的DDL(多數是語句效率低,當然不排除遭遇bug的可能),
然後沒等語句結束就異常退出了會話。

這個例子中我們在上面的跟蹤檔案已經找到了該會話對應的作業系統程式(SPID),如果在其他情況下,我們如何找到這種狀態為'KILLED'
的作業系統程式號(SPID)呢?
下面給出了一個方法,可以借鑑:
SQL> l
1 SELECT s.username,s.status,
2 x.ADDR,x.KSLLAPSC,x.KSLLAPSN,x.KSLLASPO,x.KSLLID1R,x.KSLLRTYP,
3 decode(bitand (x.ksuprflg,2),0,null,1)
4 FROM x$ksupr x,v$session s
5 WHERE s.paddr(+)=x.addr
6 and bitand(ksspaflg,1)!=0
7* and s.sid=37
SQL> /

USERNAME STATUS ADDR KSLLAPSC KSLLAPSN KSLLASPO KSLLID1R KS D
------------------------------ -------- ---------------- ---------- ---------- ------------ ---------- -- -
PUBUSER KILLED C000000109C831E0 41 15 16243 17

SQL>


x$ksupr.ADDR列的值對應了V$PROCESS 中的ADDR的值,知道了這個SPID的地址,找到這個作業系統程式(SPID)就簡單了,例如:
SQL> select spid,pid from v$process where addr='C000000109C831E0';

SPID PID
------------ ----------
20552 26

SQL>

現在,我們只需要在作業系統上 kill 作業系統程式20552就可以了:
> ps -ef | grep 20552
ora9i 20552 1 0 Jan 8 ? 0:01 oraclecsmisc2 (LOCAL=NO)
ora9i 14742 14740 0 17:19:02 pts/ti 0:00 grep 20552
> kill -9 20552
> ps -ef | grep 20552
ora9i 14966 14964 0 17:40:01 pts/ti 0:00 grep 20552
>


再來檢查一下SID 37的資訊,我們看到這個會話是真的被kill掉了,
> exit

SQL> select sid,serial#,status,username,program from v$session where sid=37;

no rows selected

SQL> l
1 SELECT s.username,s.status,
2 x.ADDR,x.KSLLAPSC,x.KSLLAPSN,x.KSLLASPO,x.KSLLID1R,x.KSLLRTYP,
3 decode(bitand (x.ksuprflg,2),0,null,1)
4 FROM x$ksupr x,v$session s
5 WHERE s.paddr(+)=x.addr
6 and bitand(ksspaflg,1)!=0
7* and s.sid=37
SQL> /

no rows selected

SQL>

回到剛才hang住的會話,它已經恢復了正常操作,
並且我們已經得到了'ORA-04043: object CSNOZ629926699966 does not exist'這個正常的資訊:
SQL> desc CSNOZ629926699966


ERROR:
ORA-04043: object CSNOZ629926699966 does not exist


SQL>

在開一個會話,測試一把:
> sqlplus pubuser/pubuser

SQL*Plus: Release 9.2.0.4.0 - Production on Mon Jan 10 17:42:16 2005

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning and Real Application Clusters options
JServer Release 9.2.0.4.0 - Production

SQL> set timing on
SQL> desc CSNOZ629926699966
ERROR:
ORA-04043: object CSNOZ629926699966 does not exist


SQL>
當發出命令'desc CSNOZ629926699966'的時候,我們看到系統立刻返回了ORA-04043: object CSNOZ629926699966 does not exist'資訊,問題就此解決了。

[@more@]

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/18921899/viewspace-1017541/,如需轉載,請註明出處,否則將追究法律責任。

相關文章