一個關於latch: library cache事件的處理

denglt發表於2010-11-19

    昨天發現有個JOB等待'latch free'事件無法正常完成,跟蹤後沒有發現問題的本質,只好先kill了,等明天再看。今天該JOB又無法正常完成,等待事件變為了‘latch: library cache’。既然是latch的問題,那就從latch開始查了。

SQL> SELECT SID, EVENT, P1TEXT, P1RAW, P2TEXT, P2 FROM gV$SESSION_WAIT
  2    where sid = 3493
  3  ;
 
SQL> SELECT SID, EVENT, P1TEXT, P1RAW, P2TEXT, P2 FROM gV$SESSION_WAIT
  2    where sid = 3493
  3  ;
 
       SID EVENT                       P1TEXT     P1RAW            P2TEXT          P2
---------- ------------------------------------------------------- ------------------
      3493 latch: shared pool          address    0700000010104710 number         214
      3493 SQL*Net message from client driver id  0000000054435000 #bytes           1

SQL> SELECT addr,latch#,level#,name,gets,misses,sleeps,spin_gets,wait_time FROM V$LATCH WHERE LATCH# = 214;
 
ADDR                 LATCH#     LEVEL# NAME               GETS     MISSES     SLEEPS  SPIN_GETS  WAIT_TIME
---------------- ---------- ---------- ------------ ---------- ---------- ---------- ---------- ----------
07000000100225C0        214          7 shared pool  2977196635   10539431     366741   10274915 2780343584

SQL> select addr,latch#,level#,name,gets,misses,sleeps,spin_gets,wait_time from  V$LATCH_CHILDREN where latch#=214;                            
                                                                                                              
ADDR                 LATCH#     LEVEL# NAME                GETS     MISSES     SLEEPS  SPIN_GETS  WAIT_TIME
---------------- ---------- ---------- ------------- ---------- ---------- ---------- ---------- ----------
07000000101043F0        214          7 shared pool    421219359    2340922      68591    2281497 1512179273
0700000010104490        214          7 shared pool    439710623    1279212      32188    1256923 2833823726
0700000010104530        214          7 shared pool    465101568    1449774      34428    1426738 3533376605
07000000101045D0        214          7 shared pool    420849465    2161907      98483    2096577 9048424929
0700000010104670        214          7 shared pool    402154849    1404051      48925    1369036 4024667480
0700000010104710        214          7 shared pool    419586010     603551      41558     573945 3253092995
07000000101047B0        214          7 shared pool    409803261    1301894      42599    1272048 3598305121

SQL>  SELECT *                                                                                                                                             
  2     FROM (SELECT PARENT_NAME,                                                                                                                          
  3                  "WHERE",                                                                                                                              
  4                  SLEEP_COUNT,                                                                                                                          
  5                  WTR_SLP_COUNT,                                                                                                                        
  6                  LONGHOLD_COUNT                                                                                                                        
  7             FROM V$LATCH_MISSES                                                                                                                        
  8            WHERE PARENT_NAME = 'shared pool'                                                                                                           
  9            ORDER BY SLEEP_COUNT + WTR_SLP_COUNT + LONGHOLD_COUNT DESC)                                                                                 
 10    WHERE ROWNUM < 20;                                                                                                                                  
                                                                                                                                                           
PARENT_NAME    WHERE                               SLEEP_COUNT WTR_SLP_COUNT LONGHOLD_COUNT
-------------- ----------------------------------- ----------- ------------- --------------
shared pool    kghupr1                                   88053        144031              0
shared pool    kghalo                                    50842        121930              0
shared pool    kghfre                                    48683         71260              0
shared pool    kghfrunp: alloc: wait                     75222           162              0
shared pool    kghdmp                                    74347             0              0
shared pool    kghfrunp: clatch: wait                    42911          8705              0
shared pool    kghalp                                    16459         24480              0
shared pool    kghfrunp: clatch: nowait                  29795             0              0
shared pool    kghfrunp: alloc: session dur               5959          1422              0
shared pool    kghasp                                     1188          2081              0
shared pool    kgh: quiesce extents                       1854             3              0
shared pool    kghfrh                                     1520           246              0
shared pool    kgh: add extent to quiesced list           1281            10              0
shared pool    kghfree_extents: scan                        38           649              0
shared pool    kghquiesce: clatch: nowait                  557             0              0
shared pool    kgh: sim resz update                        411             4              0
shared pool    kgh: add extent to reserved list            208           189              0
shared pool    kghfrunp: alloc: cursor dur                 362             1              0
shared pool    kghfen: not perm alloc class                 70           138              0
                                                
到這兒我都有點暈乎了。

再進行session trace,看看JOB到底在幹什麼

SQL> begin                                                      
  2    sys.dbms_system.set_sql_trace_in_session(3493,58726,true);                                                                                                                             
  3  end;                                                       
  4  /                                                          
                                                                
PL/SQL procedure successfully completed      
                       
                                                                
SQL> select paddr from v$session where sid=3493;                
                                                                
PADDR                                                           
----------------                                                
0700000EC684A3D0                                                
                                                                
SQL> select spid from v$process where addr ='0700000EC684A3D0'; 
                                                                
SPID                                                            
------------                                                    
4993518                                                         
                                     
由於這是JOB,在bdump資料夾下找到如下檔案(一般的session跟蹤檔案在udump下)
gxdb1_j001_4993518.trc  

使用tkprof進行分析得到:

The following statement encountered a error during parse:

select count(distinct(m.portid)) from cab_mdfport m, ibss_service_port s  where m.cableid = 1360000000838 and m.portid = s.portid and  (m.lineorder between  and 0)

Error encountered: ORA-00936
********************************************************************************
Trace file: D:\當前處理\Oracle\gxdb1_j001_4993518.trc
Trace file compatibility: 10.01.00
Sort options: default

       1  session in tracefile.
       0  user  SQL statements in trace file.
       0  internal SQL statements in trace file.
       0  SQL statements in trace file.
       0  unique SQL statements in trace file.
   12088  lines in trace file.
       0  elapsed seconds in trace file.

這時很明顯了資料庫一直在parse'select count(distinct(m.portid)) from cab_mdfport m, ibss_service_port s  where m.cableid = 1360000000838 and m.portid = s.portid and  (m.lineorder between  and 0)
'不成功。
開啟JOB的過程看,發現裡面有很多的動態拼裝的SQL。
現在很清楚了,是動態拼裝的SQL不對。
讓開發修改後,JOB成功執行通過。                                                             

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/195110/viewspace-678502/,如需轉載,請註明出處,否則將追究法律責任。

相關文章