一個關於latch: library cache事件的處理

昨天發現有個JOB等待'latch free'事件無法正常完成，跟蹤後沒有發現問題的本質，只好先kill了，等明天再看。今天該JOB又無法正常完成，等待事件變為了‘latch: library cache’。既然是latch的問題，那就從latch開始查了。

SQL> SELECT SID, EVENT, P1TEXT, P1RAW, P2TEXT, P2 FROM gV$SESSION_WAIT
2    where sid = 3493
3 ;

SQL> SELECT SID, EVENT, P1TEXT, P1RAW, P2TEXT, P2 FROM gV$SESSION_WAIT
2    where sid = 3493
3 ;

       SID EVENT                       P1TEXT     P1RAW            P2TEXT          P2
---------- ------------------------------------------------------- ------------------
      3493 latch: shared pool          address    0700000010104710 number         214
      3493 SQL*Net message from client driver id 0000000054435000 #bytes           1

SQL> SELECT addr,latch#,level#,name,gets,misses,sleeps,spin_gets,wait_time FROM V$LATCH WHERE LATCH# = 214;

ADDR LATCH# LEVEL# NAME GETS MISSES SLEEPS SPIN_GETS WAIT_TIME
---------------- ---------- ---------- ------------ ---------- ---------- ---------- ---------- ----------
07000000100225C0 214 7 shared pool 2977196635 10539431 366741 10274915 2780343584

SQL> select addr,latch#,level#,name,gets,misses,sleeps,spin_gets,wait_time from V$LATCH_CHILDREN where latch#=214;

ADDR                 LATCH#     LEVEL# NAME                GETS     MISSES     SLEEPS SPIN_GETS WAIT_TIME
---------------- ---------- ---------- ------------- ---------- ---------- ---------- ---------- ----------
07000000101043F0        214          7 shared pool    421219359    2340922      68591    2281497 1512179273
0700000010104490        214          7 shared pool    439710623    1279212      32188    1256923 2833823726
0700000010104530        214          7 shared pool    465101568    1449774      34428    1426738 3533376605
07000000101045D0        214          7 shared pool    420849465    2161907      98483    2096577 9048424929
0700000010104670        214          7 shared pool    402154849    1404051      48925    1369036 4024667480
0700000010104710        214          7 shared pool    419586010     603551      41558     573945 3253092995
07000000101047B0        214          7 shared pool    409803261    1301894      42599    1272048 3598305121

SQL> SELECT *
2     FROM (SELECT PARENT_NAME,
3                  "WHERE",
4                  SLEEP_COUNT,
5                  WTR_SLP_COUNT,
6                  LONGHOLD_COUNT
7             FROM V$LATCH_MISSES
8            WHERE PARENT_NAME = 'shared pool'
9            ORDER BY SLEEP_COUNT + WTR_SLP_COUNT + LONGHOLD_COUNT DESC)
10    WHERE ROWNUM < 20;

PARENT_NAME    WHERE                               SLEEP_COUNT WTR_SLP_COUNT LONGHOLD_COUNT
-------------- ----------------------------------- ----------- ------------- --------------
shared pool    kghupr1                                   88053        144031              0
shared pool    kghalo                                    50842        121930              0
shared pool    kghfre                                    48683         71260              0
shared pool    kghfrunp: alloc: wait                     75222           162              0
shared pool    kghdmp                                    74347             0              0
shared pool    kghfrunp: clatch: wait                    42911          8705              0
shared pool    kghalp                                    16459         24480              0
shared pool    kghfrunp: clatch: nowait                  29795             0              0
shared pool    kghfrunp: alloc: session dur               5959          1422              0
shared pool    kghasp                                     1188          2081              0
shared pool    kgh: quiesce extents                       1854             3              0
shared pool    kghfrh                                     1520           246              0
shared pool    kgh: add extent to quiesced list           1281            10              0
shared pool    kghfree_extents: scan                        38           649              0
shared pool    kghquiesce: clatch: nowait                  557             0              0
shared pool    kgh: sim resz update                        411             4              0
shared pool    kgh: add extent to reserved list            208           189              0
shared pool    kghfrunp: alloc: cursor dur                 362             1              0
shared pool    kghfen: not perm alloc class                 70           138              0

到這兒我都有點暈乎了。

再進行session trace,看看JOB到底在幹什麼

SQL> begin
2    sys.dbms_system.set_sql_trace_in_session(3493,58726,true);
3 end;
4 /

PL/SQL procedure successfully completed


SQL> select paddr from v$session where sid=3493;

PADDR
----------------
0700000EC684A3D0

SQL> select spid from v$process where addr ='0700000EC684A3D0';

SPID
------------
4993518

由於這是JOB，在bdump資料夾下找到如下檔案（一般的session跟蹤檔案在udump下）
gxdb1_j001_4993518.trc

使用tkprof進行分析得到：

The following statement encountered a error during parse:

select count(distinct(m.portid)) from cab_mdfport m, ibss_service_port s where m.cableid = 1360000000838 and m.portid = s.portid and (m.lineorder between and 0)

Error encountered: ORA-00936
********************************************************************************
Trace file: D:\當前處理\Oracle\gxdb1_j001_4993518.trc
Trace file compatibility: 10.01.00
Sort options: default

       1 session in tracefile.
       0 user SQL statements in trace file.
       0 internal SQL statements in trace file.
       0 SQL statements in trace file.
       0 unique SQL statements in trace file.
   12088 lines in trace file.
       0 elapsed seconds in trace file.

這時很明顯了資料庫一直在parse'select count(distinct(m.portid)) from cab_mdfport m, ibss_service_port s where m.cableid = 1360000000838 and m.portid = s.portid and (m.lineorder between and 0)
'不成功。
開啟JOB的過程看，發現裡面有很多的動態拼裝的SQL。
現在很清楚了，是動態拼裝的SQL不對。
讓開發修改後，JOB成功執行通過。

來自 “ ITPUB部落格 ” ，連結：http://blog.itpub.net/195110/viewspace-678502/，如需轉載，請註明出處，否則將追究法律責任。

一個關於latch: library cache事件的處理

相關文章