latch: row cache objects 和cursor: pin S wait on X共同出現

gaopengtttt發表於2013-11-13

 Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 5980 11-Nov-13 16:00:37 1403  39.6
End Snap: 5981 11-Nov-13 16:31:29 1422  42.8
Elapsed:    30.86 (mins)   
DB Time:    3,660.60 (mins)   


Per Second Per Transaction
Redo size:  1,185,514.75  6,235.28
Logical reads:  683,361.22  3,594.17
Block changes:  2,972.67  15.63
Physical reads:  6,201.95  32.62
Physical writes:  621.67  3.27
User calls:  8,220.82  43.24
Parses:  2,793.81  14.69
Hard parses:  319.52  1.68
Sorts:  1,180.36  6.21
Logons:  12.61  0.07
Executes:  31,893.98  167.75
Transactions:  190.13 

Top 5 Timed Events

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
latch: row cache objects  6,500,460  56,331  9  25.6 Concurrency
CPU time    50,195    22.9 
cursor: pin S wait on X  2,073,123  40,372  19  18.4 Concurrency
db file sequential read  5,758,293  17,702  3  8.1 User I/O
enq: TX - row lock contention  6,570  12,384  1,885  5.6 Application

可以看到DB_TIME比較高系統負載比較高,而等待集中在前3位,
我們先說
latch: row cache objects
根據v$latch_misses或者AWRRPT中可以看到其主要集中的位置為
row cache objects kqrpre: find obj 0 2,922,441 2,564,411
row cache objects kqreqd: reget 0 1,873,773 1,905,652
row cache objects kqreqd 0 1,682,121 2,008,663
同時可以看到其操作的物件是dc_object_ids
到底怎麼檢視其可以根據文件:
WAITEVENT: "latch: row cache objects" Reference Note (Doc ID 1550722.1)
我們同事糾結於DC_OBJECT_IDS到底什麼時候訪問,但是我覺得不管怎麼說一定是過多的訪問了資料字典我們可以從其位置
row cache objects kqrpre: find obj,row cache objects kqreqd: reget,row cache objects kqreqd進行入手
metalink有一個相關的說明
文件源:Contention on 'row cache objects' Latch After Upgrade from 10.2 to 11.2.0.3 (Doc ID 1485410.1)
The 'row cache objects' latch is called most frequently from 'kqrpre: find obj'. This module tries to find details of an object being parsed in the row cache. During parse, the row cache is searched. The process searches through a linked list protected by the 'row cache objects' latch to find the object in the cache that it needs. When other processes are also parsing and looking through the row cache this may cause contention; especially if the parse activity is excessive or unnecessary.

Top 5 Timed Foreground Events

Event                    Waits      Time(s) Avg wait (ms)  DB time Wait Class
latch: row cache objects 58,291,873 130,700             2    63.18 Concurrency
DB CPU                       60,563   29.27  
db file sequential read   2,994,299  14,461             5     6.99 User I/O
library cache lock            3,179   1,662           523     0.80 Concurrency
gc cr grant 2-way         1,233,503     981             1     0.47 Cluster

 Latch Name        Where	         NoWait Misses Sleeps     Waiter Sleeps
...
 row cache objects kqrpre: find obj                  0 10,332,326     8,906,124
 row cache objects kqreqd: reget                     0  7,888,165     8,503,625
 row cache objects kqreqd                            0  7,677,074     8,489,832
 row cache objects kqrso                             0      4,127         3,622
 row cache objects kqrpre: init complete             0        143            88


當然我們沒有升級但是可以借鑑,其解決方案為:
1) Rewrite the application to use bind variables to lower the hard parse rate from 350/second
2) Enable the cursor_sharing = force
3) Do not use first_rows_x optimization and use all_rows instead
4) Set optimizer_feature_enabled = 10.2.0.4
5) Set event 10089 level 1 to disable index sorting.

視乎前2項都明確的說明來自硬解析的問題,而且給出了一個定量的值350,而我們這裡是314。順便說下此文件其列子如下:
 Latch Name        Where          NoWait Misses Sleeps     Waiter Sleeps
 row cache objects kqrpre: find obj                  0 10,332,326     8,906,124
 row cache objects kqreqd: reget                     0  7,888,165     8,503,625
 row cache objects kqreqd                            0  7,677,074     8,489,832
 row cache objects kqrso                             0      4,127         3,622
 row cache objects kqrpre: init complete             0        143            88
 他這裡更加明顯。但是我們這裡畢竟沒有超過350每秒的硬解析,所以好一些。
 然後我們再說
cursor: pin S wait on X 
這裡參考的一個網上的文件,我相信這個文件也來自於METALINK如下:
cursor: pin S整體描述
cursor: pin S A session waits on this event when it wants to update a shared mutex pin and another session
is currently in the process of updating a shared mutex pin for the same cursor object. This wait event should
rarely be seen because a shared mutex pin update is very fast.(Wait Time: Microseconds)
 
--Parameter說明
P1 Hash value of cursor
 
P2 Mutex value
64 bit platforms
8 bytes are used.
Top 4 bytes hold the session id (if the mutex is held X)
Bottom 4 bytes hold the ref count (if the mutex is held S).
 
32 bit platforms
4 bytes are used.
Top 2 bytes hold the session id (if the mutex is held X)
Bottom 2 bytes hold the ref count (if the mutex is held S).
 
P3 Mutex where (an internal code locator) OR'd with Mutex Sleeps
 
--查詢sql
SELECT a.*, s.sql_text
  FROM v$sql s,
       (SELECT sid,
               event,
               wait_class,
               p1 cursor_hash_value,
               p2raw Mutex_value,
               TO_NUMBER (SUBSTR (p2raw, 1, 8), 'xxxxxxxx') hold_mutex_x_sid
          FROM v$session_wait
         WHERE event LIKE 'cursor%') a
 WHERE s.HASH_VALUE = a.cursor_hash_value
 
cursor: pin S wait on X描述
- In previous versions of Oracle, library cache pin is protected by “library cache pin latch”.
- But in recent versions of Oracle(I believe it’s 10.2.0.2),
  library cache pin for the cursor LCO is protected by mutext.
- Mutex is allocated per LCO, so it enables fine-grained access control.
 
“cursor: pin S wait on X” wait event is mostly related to mutex and hard parse.
- When a process hard parses the SQL statement, it should acquire exclusive
  library cache pin for the corresponding LCO.
- This means that the process acquires the mutex in exclusive mode.
- Another process which also executes the same query needs to acquire the mutex
  but it’s being blocked by preceding process. The wait event is “cursor: pin S wait on X”.
 
--發生cursor: pin S wait on X原因
Frequent Hard Parses
If the frequency of Hard Parsing is extremely high, then contention can occur on this pin.
 
High Version Counts
When Version counts become excessive, a long chain of versions needs to
be examined and this can lead to contention on this event

可以看出MUTEX在某些情況下會代替LIBRARY CACHE PIN來保護LCO,如果要進行硬解析和ALTER 操作必須以X模式佔用
LCO來建立或者修改,如果此時要訪問這個LCO,也就是說軟解析的時候要訪問LCO,那麼將會出現這樣的等待,LCO實際
就是儲存我們的遊標資訊,軟解析共享的就是遊標共享,而不需要為每個語句生成一個LCO。

所以這個問題我覺得解決的方案還是想辦法降低硬解析:
1) Rewrite the application to use bind variables to lower the hard parse rate from 350/second
2) Enable the cursor_sharing = force
這裡我們可以看到硬解析 350每秒 代表一個ORACLE認為不能容忍的程度,而我們這裡是319,雖然沒有達到350但是也達到了一個
較高的值,出現這樣的情況是可能的。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/7728585/viewspace-776530/,如需轉載,請註明出處,否則將追究法律責任。

相關文章