【DB】徹底搞清楚library cache lock的成因和解決方法(一)
問題描述:
接到應用人員的報告,說是在任何對錶CSNOZ629926699966的操作都會hang,包括desc CSNOZ629926699966,
例如:
> sqlplus
SQL*Plus: Release 9.2.0.4.0 - Production on Mon Jan 10 10:11:06 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning and Real Application Clusters options
JServer Release 9.2.0.4.0 - Production
SQL> conn pubuser/pubuser
Connected.
SQL> desc CSNOZ629926699966
。。。
這個程式 hang 了
。。。
詢問了一下之前有無特別的操作,業務人員說很久以前執行了指令碼,但是該教本執行很久都沒有結果,然後他就退出了會話,再之後,就出現了上面的情況。指令碼內容如下:
$ cat CSNOZ629926699966.sh
#!/bin/sh
sqlplus << EOF #use your username/password
create table CSNOZ629926699966 as select * from CSNOZ62992266cs
where mid not in ( select mid from where servid='020999011964' and status in ('A','B','S'));
exit;
$
$
$
$
解決過程:
> sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Mon Jan 10 10:19:13 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning and Real Application Clusters options
JServer Release 9.2.0.4.0 - Production
SQL> select * from v$lock where block=1;
no rows selected
SQL> /
no rows selected
SQL> /
no rows selected
SQL>
我們看到目前沒有鎖的資訊
SQL> select xidusn, object_id, session_id, locked_mode from v$locked_object;
。。。
XIDUSN OBJECT_ID SESSION_ID LOCKED_MODE
---------- ---------- ---------- -----------
14 18 37 3
。。。
SQL> /
。。。
XIDUSN OBJECT_ID SESSION_ID LOCKED_MODE
---------- ---------- ---------- -----------
14 18 37 3
。。。
SQL> /
。。。
XIDUSN OBJECT_ID SESSION_ID LOCKED_MODE
---------- ---------- ---------- -----------
14 18 37 3
。。。
SQL>
查詢 v$locked_object,我們發現了一個可疑的會話,SID 37:
SQL> select object_name,owner,object_type from dba_objects where object_id=18;
。。。 。。。
OBJECT_NAME OWNER OBJECT_TYPE
------------------------------ ------------------------------ ------------------
OBJ$ SYS TABLE
。。。 。。。
SQL>
奇怪怎麼一直有這個鎖??
初步猜測是由於SID為37的會話執行了上面的DDL語句,並在語句未完成前異常退出,
造成了所有訪問那個(DDL語句中涉及到的)物件的程式都hang了。
接下來我們看看等待事件:
SQL> select event,sid,p1,p2,p3 from v$session_wait where event not like 'SQL*%' and event not like 'rdbms%';
EVENT P1 P2 SID
---------------------------------------------------------------- ---------- ---------- ----------
pmon timer 300 0 1
ges remote message 32 0 4
gcs remote message 64 0 5
gcs remote message 64 0 7
smon timer 300 0 19
library cache lock 1.3835E+19 1.3835E+19 30
wakeup time manager 0 0 22
7 rows selected.
SQL> /
EVENT P1 P2 SID
---------------------------------------------------------------- ---------- ---------- ----------
pmon timer 300 0 1
ges remote message 32 0 4
gcs remote message 64 0 5
gcs remote message 64 0 7
smon timer 300 0 19
library cache lock 1.3835E+19 1.3835E+19 30
wakeup time manager 0 0 22
7 rows selected.
SQL> /
EVENT P1 P2 SID
---------------------------------------------------------------- ---------- ---------- ----------
pmon timer 300 0 1
ges remote message 32 0 4
gcs remote message 64 0 5
gcs remote message 64 0 7
smon timer 300 0 19
library cache lock 1.3835E+19 1.3835E+19 30
wakeup time manager 0 0 22
7 rows selected.
SQL> /
EVENT P1 P2 SID
---------------------------------------------------------------- ---------- ---------- ----------
pmon timer 300 0 1
ges remote message 32 0 4
gcs remote message 64 0 5
gcs remote message 64 0 7
smon timer 300 0 19
library cache lock 1.3835E+19 1.3835E+19 30
wakeup time manager 0 0 22
7 rows selected.
SQL>
我們注意到下面的事件:
EVENT P1 P2 SID
---------------------------------------------------------------- ---------- ---------- ----------
。。。
library cache lock 1.3835E+19 1.3835E+19 30
。。。
P1 是控制程式碼地址(handle address),也就是'library cache lock'發生的地址。
P2 是一個狀態物件,在這裡,它表示在物件上載入的鎖的地址(lock address)。
P1 和 P2都是科學計數發表示的10進位制數。
這些資訊再次證實了上面的猜測,SID 37阻塞了SID 30。
找出這兩個可疑程式的sid和serial,然後對他們設定10046事件:
SQL> select sid,serial# from v$session where sid in (30,37);
SID SERIAL#
---------- ----------
30 24167
37 2707
SQL> exec dbms_system.set_ev(30,24167,10046,12,'');
PL/SQL procedure successfully completed.
SQL> exec dbms_system.set_ev(37,2707,10046,12,'');
PL/SQL procedure successfully completed.
SQL>
跟蹤期間我們們再次測試一下,看看有沒有其他線索。
新開一個程式,找出其sid, serial和spid等資訊:
> sqlplus pubuser/pubuser
SQL*Plus: Release 9.2.0.4.0 - Production on Mon Jan 10 11:36:25 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning and Real Application Clusters options
JServer Release 9.2.0.4.0 - Production
SQL> select distinct sid from v$mystat;
SID
----------
33
SQL> select sid,serial# from v$session where sid=33;
SID SERIAL#
---------- ----------
33 6639
SQL> SELECT SPID,PID FROM V$PROCESS WHERE ADDR=(SELECT PADDR FROM V$SESSION WHERE SID=37);
SPID PID
------------ ----------
20552 26
SQL> SELECT SPID,PID FROM V$PROCESS WHERE ADDR=(SELECT PADDR FROM V$SESSION WHERE SID=30);
SPID PID
------------ ----------
22580 28
SQL> show parameter dump
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
background_core_dump string partial
background_dump_dest string /ora9i/app/oracle/admin/csmisc
/bdump
core_dump_dest string /ora9i/app/oracle/admin/csmisc
/cdump
max_dump_file_size string UNLIMITED
shadow_core_dump string partial
user_dump_dest string /ora9i/app/oracle/admin/csmisc
/udump
SQL>
然後,再嘗試對 CSNOZ629926699966 表進行操作
SQL> desc CSNOZ629926699966
。。。
還是hang住了。
於是中斷這個操作(CTRL + C):
SQL> desc CSNOZ629926699966
ERROR:
ORA-01013: user requested cancel of current operation
SQL> select tname from tab where tname='CSNOZ629926699966';
no rows selected
SQL>
檢視PUBUSER使用者下的這個表,居然不存在!!
進一步證實了前面的猜測,也就是說會話37阻塞了其他所有操作表CSNOZ629926699966的會話,造成了程式的hang,當然,包括上面的SID 30和SID 33的DDL語句
現在,我們結束10046的事件跟蹤:
SQL> exec dbms_system.set_ev(30,24167,0,0,'');
PL/SQL procedure successfully completed.
SQL> exec dbms_system.set_ev(37,2707,0,0,'');
PL/SQL procedure successfully completed.
SQL>
根據上面記錄的資訊,我們知道這兩個會話產生的跟蹤資訊分別為:
SID為30的會話,產生的跟蹤檔案為:/ora9i/app/oracle/admin/csmisc/udump/csmisc2_ora_22580.trc
SID為37的會話,產生的跟蹤檔案為:/ora9i/app/oracle/admin/csmisc/udump/csmisc2_ora_20552.trc
看看trace檔案:
> cd /ora9i/app/oracle/admin/csmisc/udump
> ll -tlc
total 4432
-rw-r----- 1 ora9i dba 332995 Jan 10 12:00 csmisc2_ora_22580.trc
-rw-r----- 1 ora9i dba 3168 Jan 10 11:59 csmisc2_ora_20552.trc
-rw-r----- 1 ora9i dba 407133 Jan 7 15:10 csmisc2_ora_2708.trc
-rw-r----- 1 ora9i dba 640 Jan 7 14:48 csmisc2_ora_835.trc
-rw-r----- 1 ora9i dba 1590 Dec 30 22:50 csmisc2_ora_16244.trc
-rw-r----- 1 ora9i dba 1308403 Dec 30 22:44 csmisc2_ora_16033.trc
-rw-r----- 1 ora9i dba 616 Dec 28 14:16 csmisc2_ora_2176.trc
-rw-r----- 1 ora9i dba 644 Dec 8 18:22 csmisc2_ora_21083.trc
> mailx -s "csmisc2_ora_22580.trc" < csmisc2_ora_22580.trc
> mailx -s "csmisc2_ora_20552.trc" < csmisc2_ora_20552.trc
> exit
SQL>
我們看到SID為30的會話,產生的跟蹤檔案(csmisc2_ora_22580.trc)為的主要內容是:
/ora9i/app/oracle/admin/csmisc/udump/csmisc2_ora_22580.trc
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning and Real Application Clusters options
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /ora9i/app/oracle/product/920
System name: HP-UX
Node name: cs_dc02
Release: B.11.11
Version: U
Machine: 9000/800
Instance name: csmisc2
Redo thread mounted by this instance: 2
Oracle process number: 28
Unix process pid: 22580, image: (TNS V1-V3)
*** 2005-01-10 11:31:49.416
*** SESSION ID:(30.24167) 2005-01-10 11:31:49.354
WAIT #0: nam='library cache lock' ela= 507258 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 505686 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507678 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507595 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507880 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507317 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507703 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507683 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 508265 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507100 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507684 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 505889 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507731 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507650 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507604 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
WAIT #0: nam='library cache lock' ela= 507698 p1=-4611686013547141416 p2=-4611686013691716064 p3=1301
。。。 。。。
我們看到SID 30的跟蹤檔案中的等待事件就是在V$SESSION_WAIT中看到的'library cache lock' .
再看看SID為37的會話,產生的跟蹤檔案(csmisc2_ora_20552.trc)為的主要內容是:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning and Real Application Clusters options
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /ora9i/app/oracle/product/920
System name: HP-UX
Node name: cs_dc02
Release: B.11.11
Version: U
Machine: 9000/800
Instance name: csmisc2
Redo thread mounted by this instance: 2
Oracle process number: 26
Unix process pid: 20552, image: (TNS V1-V3)
*** 2005-01-10 11:33:22.702
*** SESSION ID:(37.2707) 2005-01-10 11:33:22.690
WAIT #1: nam='SQL*Net message to dblink' ela= 4 p1=675562835 p2=1 p3=0
*** 2005-01-10 11:35:07.452
WAIT #1: nam='SQL*Net message from dblink' ela= 102293555 p1=675562835 p2=1 p3=0
WAIT #1: nam='SQL*Net message to dblink' ela= 3 p1=675562835 p2=1 p3=0
*** 2005-01-10 11:36:55.980
WAIT #1: nam='SQL*Net message from dblink' ela= 105969709 p1=675562835 p2=1 p3=0
WAIT #1: nam='SQL*Net message to dblink' ela= 4 p1=675562835 p2=1 p3=0
*** 2005-01-10 11:39:05.416
WAIT #1: nam='SQL*Net message from dblink' ela= 126390826 p1=675562835 p2=1 p3=0
WAIT #1: nam='SQL*Net message to dblink' ela= 4 p1=675562835 p2=1 p3=0
*** 2005-01-10 11:41:12.878
WAIT #1: nam='SQL*Net message from dblink' ela= 124461520 p1=675562835 p2=1 p3=0
WAIT #1: nam='SQL*Net message to dblink' ela= 4 p1=675562835 p2=1 p3=0
*** 2005-01-10 11:43:01.285
WAIT #1: nam='SQL*Net message from dblink' ela= 105859385 p1=675562835 p2=1 p3=0
WAIT #1: nam='SQL*Net message to dblink' ela= 4 p1=675562835 p2=1 p3=0
*** 2005-01-10 11:44:48.200
WAIT #1: nam='SQL*Net message from dblink' ela= 104397696 p1=675562835 p2=1 p3=0
WAIT #1: nam='SQL*Net message to dblink' ela= 4 p1=675562835 p2=1 p3=0
。。。 。。。
現在我們來dump 系統狀態(systemstate),看看更詳細的資訊。
首先簡單的介紹一下 event systemstate。
很多人把 systemstate 事件理解為dump發生的那一刻的系統內所有程式的資訊,這是個錯誤的概念,事實上,
轉儲 system state 產生的跟蹤檔案是從dump那一刻開始到dump任務完成之間一段事件內的系統內所有程式的資訊。
dump systemstate產生的跟蹤檔案包含了系統中所有程式的程式狀態等資訊。每個程式對應跟蹤檔案中的一段內容,反映該程式的狀態資訊,包括程式資訊,會話資訊,enqueues資訊(主要是lock的資訊),緩衝區的資訊和該程式在SGA區中持有的(held)物件的狀態等資訊。
那麼通常在什麼情況下使用systemstate比較合適呢?
Oracle推薦的使用systemstate事件的幾種情況是:
- 資料庫 hang 住了
- 資料庫很慢
- 程式正在hang
- 資料庫出現某些錯誤
- 資源爭用
dump systemstate的語法為:
ALTER SESSION SET EVENTS 'immediate trace name systemstate level 10';
也可以使用ORADEBUG實現這個功能
ORADEBUG DUMP SYSTEMSTATE level 10
如果希望在發生某種錯誤時除非systemstate事件,可以在引數檔案(spfile或者pfile)中設定event引數,
例如,當系統發生死鎖(出現ORA-00060錯誤)時dump systemstate:
event = "60 trace name systemstate level 10"
言歸正傳,我們dump系統狀態:
SQL> ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL 8';
Session altered.
SQL> host
>cd /ora9i/app/oracle/admin/csmisc/udump
> ll -ctl
-rw-r----- 1 ora9i dba 1070863 Jan 10 13:02 csmisc2_ora_22580.trc
-rw-r----- 1 ora9i dba 1345368 Jan 10 13:01 csmisc2_ora_22568.trc
-rwxrwxrwx 1 ora9i dba 44114 Jan 10 12:52 ass1015.awk
-rw-r----- 1 ora9i dba 407133 Jan 7 15:10 csmisc2_ora_2708.trc
-rw-r----- 1 ora9i dba 640 Jan 7 14:48 csmisc2_ora_835.trc
-rw-r----- 1 ora9i dba 1590 Dec 30 22:50 csmisc2_ora_16244.trc
-rw-r----- 1 ora9i dba 1308403 Dec 30 22:44 csmisc2_ora_16033.trc
-rw-r----- 1 ora9i dba 616 Dec 28 14:16 csmisc2_ora_2176.trc
-rw-r----- 1 ora9i dba 644 Dec 8 18:22 csmisc2_ora_21083.trc
>
> mailx -s "22568" < csmisc2_ora_22568.trc
這個跟蹤檔案很大(因為它包含了所有程式的資訊),那麼我們從哪裡開始看起呢?
首先,透過在跟蹤檔案中查詢字串"waiting for 'library cache lock'",我們找到了被阻塞程式的資訊:
PROCESS 28: ----------------被阻塞的Oracle程式,這裡PROCESS 28對應了V$PROCESS中的PID的值,
也就是說我們可以根據這一資訊在V$PROCESS和V$SESSION找到被阻塞的會話的資訊
----------------------------------------
SO: c000000109c83bf0, type: 2, owner: 0000000000000000, flag: INIT/-/-/0x00
(process) Oracle pid=28, calls cur/top: c00000010b277890/c00000010b277890, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 17 24 6
last post received-location: ksusig
last process to post me: c000000109c840f8 25 0
last post sent: 0 0 15
last post sent-location: ksasnd
last process posted by me: c000000109c7ff90 1 6
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: c000000109eefda0
O/S info: user: ora9i, term: pts/th, ospid: 22580 ----------------該程式的作業系統程式號,對應於V$PROCESS中的SPID
OSD pid info: Unix process pid: 22580, image: (TNS V1-V3)
----------------------------------------
SO: c000000109f02c68, type: 4, owner: c000000109c83bf0, flag: INIT/-/-/0x00
(session) trans: 0000000000000000, creator: c000000109c83bf0, flag: (100041) USR/- BSY/-/-/-/-/-
DID: 0002-001C-00000192, short-term DID: 0000-0000-00000000
txn branch: 0000000000000000
oct: 0, prv: 0, sql: c00000011f8ea068, psql: c00000011f8ea068, user: 50/PUBUSER
O/S info: user: ora9i, term: , ospid: 22536, machine: cs_dc02
program: (TNS V1-V3)
application name: SQL*Plus, hash value=3669949024
waiting for 'library cache lock' blocking sess=0x0 seq=18589 wait_time=0
handle address=c000000122e2a6d8, lock address=c00000011a449e20, 100*mode+namespace=515
。。。 。。。
SO: c00000010b277890, type: 3, owner: c000000109c83bf0, flag: INIT/-/-/0x00
(call) sess: cur c000000109f02c68, rec 0, usr c000000109f02c68; depth: 0
----------------------------------------
SO: c00000011a449e20, type: 51, owner: c00000010b277890, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=c00000011a449e20 handle=c000000122e2a6d8 request=S
call pin=0000000000000000 session pin=0000000000000000
htl=c00000011a449e90[c00000011a4bc350,c00000011a4bc350] htb=c00000011a4bc350
user=c000000109f02c68 session=c000000109f02c68 count=0 flags=[00] savepoint=463
the rest of the object was already dumped
。。。 。。。
請注意下面的資訊:
waiting for 'library cache lock' blocking sess=0x0 seq=18589 wait_time=0
handle address=c000000122e2a6d8, lock address=c00000011a449e20, 100*mode+namespace=515
這段資訊告訴我們ORACLE PID為 28的程式(PROCESS 28),正在等待'library cache lock' ,透過‘handle address=c000000122e2a6d8’我們可以找到阻塞它的會話的ORACLE PID資訊。
還要注意這段資訊:
LIBRARY OBJECT LOCK: lock=c00000011a449e20 handle=c000000122e2a6d8 request=S
call pin=0000000000000000 session pin=0000000000000000
htl=c00000011a449e90[c00000011a4bc350,c00000011a4bc350] htb=c00000011a4bc350
user=c000000109f02c68 session=c000000109f02c68 count=0 flags=[00] savepoint=463
這裡就是阻塞PROCESS 28程式的會話的資訊。
簡單的記住這個依據的要點是:
waiting session的'handle address'的值對應於blocking session的'handle'的值。
回過頭來,看看這個值,它應於上面我們在V$SESSION_WAIT中看到的P1和P2的值:
SQL> select to_number('C000000122E2A6D8','XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') from dual;
TO_NUMBER('C000000122E2A6D8','XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')
----------------------------------------------------------------
1.3835E+19
SQL>
問題的成因已經基本上明確了,這裡推薦兩種解決問題的方法:
方法1,根據 c000000122e2a6d8 地址,我們可以得到當前在library cache中相應的鎖資訊:
SQL> l
1 select INST_ID,USER_NAME,KGLNAOBJ,KGLLKSNM,KGLLKUSE,KGLLKSES,KGLLKMOD,KGLLKREQ,KGLLKPNS,KGLLKHDL
2* from X$KGLLK where KGLLKHDL = 'C000000122E2A6D8' order by KGLLKSNM,KGLNAOBJ
SQL> /
INST_ID USER_NAME KGLNAOBJ KGLLKSNM KGLLKUSE KGLLKSES KGLLKMOD KGLLKREQ KGLLKPNS KGLLKHDL
---------- ------------- ---------------------- ---------- ---------------- ---------------- ---------- ---------- ---------------- ----------------
2 PUBUSER CSNOZ629926699966 30 C000000109F02C68 C000000109F02C68 0 200 C000000122E2A6D8
2 PUBUSER CSNOZ629926699966 37 C000000108C99E28 C000000108C99E28 3 000 C000000122E2A6D8
SQL>
按照Oracle推薦的做法,我們現在應該使用'alter system kill session'命令kill掉SID 37,
結果得到了ORA-00031錯誤:
SQL> alter system kill session '37,2707';
alter system kill session '37,2707'
*
ERROR at line 1:
ORA-00031: session marked for kill
SQL>
檢查SID 37的狀態:
SQL> set linesize 150
SQL> col program for a50
SQL> select sid,serial#,status,username,program from v$session where sid=37;
SID SERIAL# STATUS USERNAME PROGRAM
---------- ---------- -------- ------------------------------ --------------------------------------------------
37 2707 KILLED PUBUSER (TNS V1-V3)
SQL>
再次證實了我們最初的想法—— 有人在執行了某個需要執行很久的DDL(多數是語句效率低,當然不排除遭遇bug的可能),
然後沒等語句結束就異常退出了會話。
這個例子中我們在上面的跟蹤檔案已經找到了該會話對應的作業系統程式(SPID),如果在其他情況下,我們如何找到這種狀態為'KILLED'
的作業系統程式號(SPID)呢?
下面給出了一個方法,可以借鑑:
SQL> l
1 SELECT s.username,s.status,
2 x.ADDR,x.KSLLAPSC,x.KSLLAPSN,x.KSLLASPO,x.KSLLID1R,x.KSLLRTYP,
3 decode(bitand (x.ksuprflg,2),0,null,1)
4 FROM x$ksupr x,v$session s
5 WHERE s.paddr(+)=x.addr
6 and bitand(ksspaflg,1)!=0
7* and s.sid=37
SQL> /
USERNAME STATUS ADDR KSLLAPSC KSLLAPSN KSLLASPO KSLLID1R KS D
------------------------------ -------- ---------------- ---------- ---------- ------------ ---------- -- -
PUBUSER KILLED C000000109C831E0 41 15 16243 17
SQL>
x$ksupr.ADDR列的值對應了V$PROCESS 中的ADDR的值,知道了這個SPID的地址,找到這個作業系統程式(SPID)就簡單了,例如:
SQL> select spid,pid from v$process where addr='C000000109C831E0';
SPID PID
------------ ----------
20552 26
SQL>
現在,我們只需要在作業系統上 kill 作業系統程式20552就可以了:
> ps -ef | grep 20552
ora9i 20552 1 0 Jan 8 ? 0:01 oraclecsmisc2 (LOCAL=NO)
ora9i 14742 14740 0 17:19:02 pts/ti 0:00 grep 20552
> kill -9 20552
> ps -ef | grep 20552
ora9i 14966 14964 0 17:40:01 pts/ti 0:00 grep 20552
>
再來檢查一下SID 37的資訊,我們看到這個會話是真的被kill掉了,
> exit
SQL> select sid,serial#,status,username,program from v$session where sid=37;
no rows selected
SQL> l
1 SELECT s.username,s.status,
2 x.ADDR,x.KSLLAPSC,x.KSLLAPSN,x.KSLLASPO,x.KSLLID1R,x.KSLLRTYP,
3 decode(bitand (x.ksuprflg,2),0,null,1)
4 FROM x$ksupr x,v$session s
5 WHERE s.paddr(+)=x.addr
6 and bitand(ksspaflg,1)!=0
7* and s.sid=37
SQL> /
no rows selected
SQL>
回到剛才hang住的會話,它已經恢復了正常操作,
並且我們已經得到了'ORA-04043: object CSNOZ629926699966 does not exist'這個正常的資訊:
SQL> desc CSNOZ629926699966
ERROR:
ORA-04043: object CSNOZ629926699966 does not exist
SQL>
在開一個會話,測試一把:
> sqlplus pubuser/pubuser
SQL*Plus: Release 9.2.0.4.0 - Production on Mon Jan 10 17:42:16 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning and Real Application Clusters options
JServer Release 9.2.0.4.0 - Production
SQL> set timing on
SQL> desc CSNOZ629926699966
ERROR:
ORA-04043: object CSNOZ629926699966 does not exist
SQL>
當發出命令'desc CSNOZ629926699966'的時候,我們看到系統立刻返回了ORA-04043: object CSNOZ629926699966 does not exist'資訊,問題就此解決了。
這裡,簡單的介紹一下X$KGLLK,這個基表儲存了庫快取中物件的鎖的資訊,它對於解決這類問題特別有用,其名稱的含義如下:
[K]ernel Layer
[G]eneric Layer
[L]ibrary Cache Manager ( defined and mapped from kqlf )
Object Locks
X$KGLLK - Object [L]oc[K]s
KGLNAOBJ 列包含了在librarky cache中的物件上執行命令的語句的前80個字元(其實從這裡我們也可以大大縮小範圍了)
X$KGLLK.KGLLKUSE 和 x$kgllk.KGLLKSES 對應於跟蹤檔案中的owner的值
X$KGLLK.KGLLKADR
X$KGLLK.KGLLKHDL 對應於跟蹤檔案中的handle的值(handle=C000000122E2A6D8),也就是'library cache lock'的地址
X$KGLLK.KGLLKPNS 對應於跟蹤檔案中的session pin的值
X$KGLLK.KGLLKSPN對應於跟蹤檔案中的savepoint的值
我們再來看一下更全面的資訊:
SQL> set linesize 2000
SQL> select * from X$KGLLK where KGLLKHDL = 'C000000122E2A6D8' order by KGLLKSNM,KGLNAOBJ
2 /
ADDR INDX INST_ID KGLLKADR KGLLKUSE KGLLKSES KGLLKSNM KGLLKHDL KGLLKPNC KGLLKPNS KGLLKCNT KGLLKMOD KGLLKREQ KGLLKFLG KGLLKSPN KGLLKHTB KGLNAHSH KGLHDPAR KGLHDNSP USER_NAME KGLNAOBJ
---------------- ---------- ---------- ---------------- ---------------- ---------------- ---------- ---------------- ---------------- ---------------- ---------- ---------- ---------- ---------- ---------- ---------------- ---------- ---------------- ---------- ------------------------------ ------------------------------------------------------------
800003FB0007E4D0 33 2 C00000011A449E20 C000000109F02C68 C000000109F02C68 30 C000000122E2A6D8 00 00 0 0 2 0 463 C00000011A4BC350 3990848181 C000000122E2A6D8 1 PUBUSER CSNOZ629926699966
800003FB0007E5B0 34 2 C00000011A44A150 C000000108C99E28 C000000108C99E28 37 C000000122E2A6D8 00 00 1 3 0 0 179 C00000011A4BB328 3990848181 C000000122E2A6D8 1 PUBUSER CSNOZ629926699966
SQL> set linesize 100
SQL> l
1* select * from X$KGLLK where KGLLKHDL = 'C000000122E2A6D8' order by KGLLKSNM,KGLNAOBJ
SQL> /
ADDR INDX INST_ID KGLLKADR KGLLKUSE KGLLKSES KGLLKSNM
---------------- ---------- ---------- ---------------- ---------------- ---------------- ----------
KGLLKHDL KGLLKPNC KGLLKPNS KGLLKCNT KGLLKMOD KGLLKREQ KGLLKFLG
---------------- ---------------- ---------------- ---------- ---------- ---------- ----------
KGLLKSPN KGLLKHTB KGLNAHSH KGLHDPAR KGLHDNSP USER_NAME
---------- ---------------- ---------- ---------------- ---------- ------------------------------
KGLNAOBJ
------------------------------------------------------------
800003FB0007E4D0 33 2 C00000011A449E20 C000000109F02C68 C000000109F02C68 30
C000000122E2A6D8 00 00 0 0 2 0
463 C00000011A4BC350 3990848181 C000000122E2A6D8 1 PUBUSER
CSNOZ629926699966
800003FB0007E5B0 34 2 C00000011A44A150 C000000108C99E28 C000000108C99E28 37
C000000122E2A6D8 00 00 1 3 0 0
179 C00000011A4BB328 3990848181 C000000122E2A6D8 1 PUBUSER
CSNOZ629926699966
SQL>
原文地址:http://blog.csdn.net/lunar2000/article/details/247578
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29487349/viewspace-2128013/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 【DB】徹底搞清楚library cache lock的成因和解決方法(三)
- 【DB】徹底搞清楚library cache lock的成因和解決方法(二)
- 徹底搞清楚library cache lock的成因和解決方法
- 徹底搞清楚library cache lock的成因和解決方法(轉)
- 徹底搞清楚library cache lock的成因和解決方法(1)
- 徹底搞清楚library cache lock的成因和解決方法(2)
- 徹底搞清楚library cache lock的成因和解決方法(3)
- 徹底搞清楚library cache lock的成因和解決方法(5)
- 徹底搞清楚library cache lock的成因和解決方法(4)
- 徹底搞清楚library cache lock的成因和解決方法(6)
- 徹底搞清楚library cache lock的成因和解決方法(8)
- 徹底搞清楚library cache lock的成因和解決方法(7)
- 徹底搞清楚library cache lock的成因和解決方法(三)
- 定位Library Cache pin,Library Cache lock等待的解決方法
- 常用定位library cache lock的方法
- 【ASM_ORACLE】Library Cache最佳化篇(二)Library cache load lock的概念和解決辦法ASMOracle
- [Oracle]--Library cache lock 故障解決一例Oracle
- 'library cache lock'等待事件的處理方法事件
- library cache pin/lock的解決辦法
- zt_如何平面解決library cache lock和library cache pin
- Library Cache Pin 及 Library Cache Lock分析
- library cache lock和library cache pin理解
- RAC 環境Library Cache Lock的處理方法
- library cache pin和library cache lock的診斷分析
- LIBRARY CACHE LOCK 等待事件事件
- RAC環境Library Cache Lock的處理方法(zt)
- library cache lock和library cache pin區別總結
- 一次library cache lock 問題分析
- oracle library cache之library cache lock_library cache pin wait event釋義OracleAI
- latch:library cache lock等待事件事件
- oracle異常:library cache lockOracle
- 一文徹底搞清楚 Material DesignMaterial Design
- 短連線 引起的 library cache lock
- 查詢library cache lock的源頭
- enq:Library cache lock/pin等待事件ENQ事件
- library cache lock 阻塞程式查詢
- Library cache lock/pin詳解(轉)
- LIBRARY CACHE LOCK WAITS AND NO BLOCKER FOUNDAIBloC