等待事件之Row Cache Lock

lhrbest發表於2017-05-23

等待事件之Row Cache Lock





定位的辦法:
--查詢row cache lock等待
select event,p1  from v$session where  event= 'row cache lock' and status='ACTIVE';
 
--查詢rowcache 名稱
select * from v$rowcache where cache# =p1;

名稱 P1 P2 P3 原因 處理
row cache lock 字典的Cache號,其值為V$ROWCACHE.CACHE# ModeMode Held requestMode Requested row cache lock等待事件是一個共享池相關的等待事件,是由於對字典緩衝的訪問造成的。每一個行緩衝佇列鎖都對應一個特定的資料字典物件,這被叫做佇列鎖型別,並可以在V$ROWCACHE檢視中找到。在AWR中需要檢視Dictionary Cache Stats部分用以確定問題。常見的原因有如下幾點:
① 序列沒有設定CACHE屬性,導致序列爭用
② 表空間不足引起 表空間的擴充套件速度跟不上表空間的使用速度會發生該等待事件
③ Shared Pool不足,需要增加共享池
④ 使用者密碼錯誤或給出了空密碼並且頻繁登入
根據P1值查詢V$ROWCACHE檢視(SELECT A.PARAMETER FROM V$ROWCACHE  A WHERE CACHE# =P1;),進而確定佇列型別。若為DC_SEQUENCES則增大序列的CACHE值;若為DC_USERS則可能是當使用者密碼錯誤或密碼為空時使用者頻繁登入導致,進而查詢審計來確定(SELECT * FROM DBA_AUDIT_TRAIL WHERE RETURNCODE IN (1017, 1005);1017代表密碼錯誤,1005代表空密碼);其它型別很可能是Shared Pool過小導致。

執行DDL必須請求一個行緩衝鎖(row cache lock)來鎖住資料字典(Data Dictionary)資訊。共享池(Shared Pool)包含來自資料字典的的行緩衝區,可以減少磁碟的I/O訪問,並允許對行進行加鎖。資料字典行鎖被叫做行緩衝佇列鎖(Row Cache Enqueue Locks)。這個佇列鎖結構從共享池中按需求分配,當這些請求在等待並超時就看到行緩衝佇列鎖。row cache lock等待事件是一個共享池相關的等待事件,是由於對字典緩衝的訪問造成的。每一個行緩衝佇列鎖都對應一個特定的資料字典物件這被叫做佇列鎖型別,並可以在V$ROWCACHE檢視中找到。


SELECT * FROM V$EVENT_NAME D WHERE D.NAME = 'row cache lock';

SELECT *

  FROM DBA_HIST_ROWCACHE_SUMMARY D

 WHERE D.SNAP_ID BETWEEN 1 AND 3014

   AND D.GETS > 0;

 

SELECT * FROM V$ROWCACHE WHERE CACHE# IN ('7', '10');

 

SELECT A.PARAMETER FROM V$ROWCACHE  A WHERE CACHE# =P1;

 

SELECT * FROM DBA_AUDIT_TRAIL d WHERE /*RETURNCODE IN (1017) AND*/ d.action_name='LOGON' AND D.returncode>0 AND D.returncode<>1017;



[root@rhel6lhr ~]# oerr ora 1005

01005, 00000, "null password given; logon denied"

// *Cause:

// *Action:

[root@rhel6lhr ~]# oerr ora 1017

01017, 00000, "invalid username/password; logon denied"

// *Cause:

// *Action:

[root@rhel6lhr ~]#

AWR中需要檢視Dictionary Cache Stats部分用以確定問題。

 




常見佇列鎖型別
行緩衝佇列鎖等待的調優基於每一個佇列鎖型別的行為,其中常見的有:
① DC_SEQUENCES:在使用序列的時候將發生該行緩衝佇列鎖。調優方式是檢查序列是否指定了緩衝選項並確定這個緩衝值可以承受預期的併發insert操作。Check for appropriate caching of sequences for the application requirements.
② DC_USED_EXTENTS和DC_FREE_EXTENTS:該行緩衝佇列鎖可能在空間管理碰到表空間分裂或者沒有足夠區大小時發生。調優方法是檢查表空間是否分裂了、區大小是否太小或者表空間是人工管理。
③ DC_TABLESPACES:該行緩衝佇列鎖會在分配新區是發生。如果區大小設定得過小,程式將經常申請新區,這將導致衝突。調優方法是快速地增加區的數量。Probably the most likely cause is the allocation of new extents. If extent sizes are set low then the application may constantly be requesting new extents and causing contention. Do you have objects with small extent sizes that are rapidly growing? (You may be able to spot these by looking for objects with large numbers of extents). Check the trace for insert/update activity, check the objects inserted into for number of extents.
④ DC_OBJECTS:該行緩衝佇列鎖會在重編譯物件的時候發生。當物件編譯時將申請一個排他鎖阻塞其他行為。通過檢查非法物件和依賴關係來調優。
⑤ DC_SEGMENTS:該行緩衝佇列鎖會在段分配的時候發生,觀察持有這個佇列鎖的會話在做什麼。This is likely to be down to segment allocation. Identify what the session holding the enqueue is doing and use errorstacks to diagnose. 
⑥ DC_USERS:Deadlock and resulting “WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!” can occur if a session issues a GRANT to a user, and that user is in the process of logging on to the database.dc_users是和使用者用錯誤密碼登陸有關,In 11g there is an intentional delay between allowing failed logon attempts to retry. For some specific application types this can cause a problem as the row cache entry is locked for the duration of the delay . This can lead to excessive row cache lock waits for DC_USERS for specific users / schemas.

After 3 successive failures a sleep delay is introduced starting at 3 seconds and extending to 10 seconds max. During each delay the user X row cache lock is held in exclusive mode preventing any concurrent logon attempt as user X (and preventing any other operation which would need the row cache lock for user X).
⑦ DB_ROLLBACK_SEGMENTS:This is due to rollback segment allocation. Just like dc_segments,identify what is holding the enqueue and also generate errorstacks. Remember that on a multi-node system (RAC) the holder may be on another node and so multiple systemstates from each node will be required.
⑧ DC_AWR_CONTROL:This enqueue is related to control of the Automatic Workload Repository. As such any operation manipulating the repository may hold this so look for processes blocking these. 
DC_SEQUENCES
For DC_SEQUENCES, consider caching sequences using the cache option.


DC_OBJECTS
Look for any object compilation activity which might require an exclusive lock, blocking other activities


DC_SEGMENTS
Contention here is most likely to be due to segment allocation.  Investigate what segments are being created at the time.


DC_USERS
This may occur if a session issues a GRANT to a user and that user is in the process of logging on to the database. Investigate why grants are being made while the users are active.


DC_TABLESPACES
The most likely cause is the allocation of new extents. If extent sizes are set low then the application may constantly be requesting new extents and causing contention. Do you have objects with small extent sizes that are rapidly growing? (You may be able to spot these by looking for objects with large numbers of extents). Check the trace for insert/update activity, check the objects inserted into for number of extents.






Resolving Issues Where 'Row Cache Lock' Waits are Occurring (文件 ID 1476670.1)

In this Document

Purpose
Troubleshooting Steps
  Problem Confirmation:
  Row Cache Lock
  Reducing Waits
  Known Issues
References


APPLIES TO:

Oracle Database - Standard Edition - Version 10.2.0.1 and later
Oracle Database - Enterprise Edition - Version 10.2.0.1 and later
Oracle Database - Personal Edition - Version 10.2.0.1 and later
Information in this document applies to any platform.

PURPOSE

Disclaimer: This note is written in the context that the Performance Guided Resolution tool was used and recommended this article. It may not make as much sense if read standalone or by other audiences.


TROUBLESHOOTING STEPS

 Brief Definition:

The shared pool contains a cache of rows from the data dictionary that helps reduce physical I/O on the data dictionary tables. The row cache lock is used primarily to serialize changes to the data dictionary and is waited on when a lock on a data dictionary cache is required. Waits on this event usually indicate some form of DDL occuring, or possibly recursive operations such as storage management and incrementing sequence numbers.

Problem Confirmation:

  • Significant wait for latch: row cache objects
  • Slow overall performance with row cache lock
  • High CPU usage

Row Cache Lock

When DDLs execute, they must acquire locks on the row cache in order to access and change the Data Dictionary information. Once the locks are taken then they can be allowed to modify individual rows in the data dictionary. 

Reducing Waits

1. The data dictionary resides in the shared pool. If the shared pool is not sized correctly then the data dictionary might not be fully cached. This should be handled automatically with the automatic shared memory tuning feature. The following documents provide more details: 

Document 257643.1 Oracle Database Automated SGA Memory Tuning
Document 270935.1 Shared pool sizing

 2. Find which cache is being waited for:

SQL> select p1text,p1,p2text,p2,p3text,p3 from v$session where event='row cache lock';

P1TEXT P1 P2TEXT P2 P3TEXT P3

cache id 8 mode 0 request 3

SQL> select parameter,count,gets,getmisses,modifications from v$rowcache where cache#=8;

PARAMETER COUNT GETS GETMISSES MODIFICATIONS

DC_SEQUENCES    869    76843 508432   4500

In this example the cache is the "DC_SEQUENCES" cache.

3. Take cache dependent actions:

DC_SEQUENCES
For DC_SEQUENCES, consider caching sequences using the cache option.

DC_OBJECTS
Look for any object compilation activity which might require an exclusive lock, blocking other activities

DC_SEGMENTS
Contention here is most likely to be due to segment allocation.  Investigate what segments are being created at the time.

DC_USERS
This may occur if a session issues a GRANT to a user and that user is in the process of logging on to the database. Investigate why grants are being made while the users are active.

DC_TABLESPACES
The most likely cause is the allocation of new extents. If extent sizes are set low then the application may constantly be requesting new extents and causing contention. Do you have objects with small extent sizes that are rapidly growing? (You may be able to spot these by looking for objects with large numbers of extents). Check the trace for insert/update activity, check the objects inserted into for number of extents.

4. For further information on row cache issues, review the following:

Document 278316.1 Troubleshooting "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!"

Measuring Success

Once you have applied the changes to resolve the issues you have found, compare the latest AWR to the AWR that led you here via Guided Resolution (that AWR becomes your baseline). Look at the percentage decrease total wait time for this event. If there are still issues, re- evaluate those and address them according to the specific symptom.


Known Issues

Document 1417373.1 Row Cache Latch Contention for DC_USERS While Using VPD
Document 2372926.8 "row cache objects" latch contention on DC_USERS row cache

Document 1466896.1 High Waits on 'Row Cache Lock' with Pipelined Functions After Upgrade to 11.2.0.3
Document 13502860.8 Bug 13502860 - "row cache lock" contention on SYS_PLSQL_xx objects using PIPELINED functions




REFERENCES


NOTE:34609.1 - WAITEVENT: "row cache lock" Reference Note



WAITEVENT: "row cache lock" Reference Note (文件 ID 34609.1)

"row cache lock" Reference Note

This is a reference note for the wait event "row cache lock" which includes the following subsections: See Note:61998.1 for an introduction to Wait Events.

Definition:

  • Versions: 7.3 - 12.1
    Documentation: 12.1 11.2 11.1 10.2 10.1 

  • This event is used to wait for a lock on a data dictionary cache specified by "cache id" (P1).

    If running Real Application Clusters (RAC) then LCK0 is signalled to get the row cache lock for the foreground waiting on this event. The LCK0 process will get the lock asynchronously. In exclusive mode the foreground process will try to get the lock.

Individual Waits:

  Parameters:

  • P1 = cache - ID of the dictionary cache
  • P2 = mode - Mode held
  • P3 = request - Mode requested
  • cache - ID of the dictionary cache
    Row cache lock we are waiting for. Note that the actual CACHE# values differ between Oracle versions. The cache can be found using this select - "PARAMETER" is the cache name:
    SELECT cache#, type, parameter 
      FROM v$rowcache     
     WHERE cache# = &P1
    ;

    In a RAC environment the row cache locks use global enqueues of type "Q[A-Z]" with the lock id being the hashed object name.


  • mode - Mode held
    The mode the lock is currently held in:
      KQRMNULL 0   null mode - not locked
      KQRMS    3   share mode
      KQRMX    5   exclusive mode KQRMFAIL 10  fail to acquire instance lock
  • request - Mode requested
    The mode the lock is requested in:
      KQRMNULL 0   null mode - not locked
      KQRMS    3   share mode
      KQRMX    5   exclusive mode KQRMFAIL 10  fail to acquire instance lock

  Wait Time:

In exclusive mode any process other than PMON will timeout after 8 hours (10000 waits of 3 seconds)

In RAC the foreground will wait 60 seconds for the LCK0 to get the lock, the foreground will wait in infinite loop until the lock has been granted (LCK0 will notify foreground).

In either case PMON will wait for only 5 seconds.

If a session times out when waiting for a row cache lock then it will report this to the alert log and tracefile with a message like:

WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK


  Finding Blockers:

Holders and requesters can be seen in view X$KQRFP for parent objects, and X$KQRFS for subordinates. 
eg: The following select will show all holders of parent row cache objects so can be used to help find the blocking session.
SELECT * FROM x$kqrfp WHERE kqrfpmod!=0;
(KQRFPSES is the address of the holding session V$SESSION.SADDR)


Systemwide Waits:

It is important to determine which cache is being waited for. The V$ROWCACHE view gives an overview of which caches are being used most, but the waits may not necessarily be on the most used cache. The V$ACTIVE_SESSION_HISTORY view can be used to get an idea of which cache ids (P1) are involved in waits.
  • If the issue is general across various caches (different cache ids) then the shared pool may need increasing in size to allow more dictionary information to be cached
  • If the issue is focused on a specific cache id then options typically depend on the cache involved - see the Troubleshootingsection below.

Reducing Waits / Wait times:

Options to reduce waits depends on the specific cache that has the contention. See the documents in the Troubleshooting section below for advice on various caches.

Troubleshooting

See the following documents for help troubleshooting issues relating to "row cache lock" waits
Document:1476670.1 Resolving Issues Where 'Row Cache Lock' Waits are Occurring
Document:278316.1 Troubleshooting "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!"

Known Issues / Bugs:

You can restrict the list below to issues likely to affect one of the following versions by clicking the relevant button: 
                


NB Prob Bug Fixed Description
- 17828499 12.1.0.1.4, 12.1.0.2, 12.2.0.1 Opening PDB hangs waiting for row cache lock on open UPGRADE
- 16921340 12.1.0.1.1, 12.1.0.2, 12.2.0.1 Non-CDB to PDB plugin hangs
II 13884774 12.1.0.2, 12.2.0.1 Deadlock from concurrent select/ALTER SUMMARY and ALTER TABLE if query_rewrite_enabled set to true/force - superseded
II 16994952 12.1.0.1 Unable to unschedule propagation due to CJQ self-deadlock / hang
II 14117976 11.2.0.3.BP14, 11.2.0.4, 12.1.0.1 Database hangs when executing multiple MV DDL's in parallel
III 13496395 11.2.0.4, 12.1.0.1 Hang / deadlock involving ACCOUNT_STATUS object for concurrent LOGON and ALTER USER
III 7715339 11.2.0.1 Logon failures causes "row cache lock" waits - Allow disable of logon delay
III 21153142 12.2.0.1 Row cache lock self deadlock accessing seed PDB
II 21091431 12.1.0.2.160419, 12.1.0.2.DBBP13, 12.2.0.1 row cache lock during trigger creation using editions
II 19907473 12.2.0.1 GEN0 process causing database hang
III 15850031 11.2.0.4, 12.1.0.2, 12.2.0.1 Rare instance hang: deadlock between 'row cache lock' and 'cursor: pin S wait for X'
II 13622515 11.2.0.4, 12.1.0.2, 12.2.0.1 library cache <-> row cache deadlock / hang when altering constraint with MV involved - superseded
II 13869467 12.1.0.2 Many waits for 'row cache lock' in RAC while creating many tables with constraints
II 13979519 11.2.0.3.BP11, 11.2.0.4, 12.1.0.1 Locks on dc_users kept too long
II 13916228 11.2.0.4, 12.1.0.1 Enabling/disabling constraints makes DML timeout in RAC - superseded
II 13502860 11.2.0.4, 12.1.0.1 "row cache lock" contention on SYS_PLSQL_xx objects using PIPELINED functions
- 13387978 11.2.0.4, 12.1.0.1 Sessions running TRUNCATE causing a deadlock, even if the constraints are disabled
- 12953743 11.2.0.3.BP13, 11.2.0.4, 12.1.0.1 Parallel CTAS of table with a Securefile is slower than parallel IAS / PIDL
I 12889054 11.2.0.2.BP16, 11.2.0.3.BP05, 11.2.0.4, 12.1.0.1 AWR snapshot hangs on dc_objects row_cache_lock held by ctas job
D - 12792862 11.2.0.4, 12.1.0.1 Performance of INSERT using binary xml is very slow due to "row cache lock" in RAC environment - superseded
II 12351027 11.2.0.4, 12.1.0.1 Redefinition causing deadlock between "row cache lock" and "library cache lock"
II 11693365 11.2.0.3, 12.1.0.1 Concurrent Drop table and Select on Reference constraint table hangs (deadlock)
I 10382754 11.2.0.2.BP17, 11.2.0.3, 12.1.0.1 Poor performance/ rowcache contention in 11g with partitioning due to invalidation of objects
IIII 10204505 11.2.0.3, 12.1.0.1 SGA autotune can cause row cache misses, library cache reloads and parsing
II 10126219 11.2.0.1.BP08, 11.2.0.2.BP02, 11.2.0.3, 12.1.0.1 Undetected deadlock 'library cache lock'/'row cache lock' with concurrent DDLs on partition tables.
II 9952554 11.2.0.2.8, 11.2.0.2.BP18, 11.2.0.3, 12.1.0.1 Undetected deadlock 'library cache lock'/'row cache lock' with a session modifying a constraint
II 9866045 11.2.0.3, 12.1.0.1 Long wait on 'wait for master scn' in LCK causing long row cache lock waits
III 9776608 11.2.0.2, 12.1.0.1 Hang from concurrent login to same account with a wrong password
II 9278979 11.2.0.2, 12.1.0.1 Instance hang / ORA-4021 with OPTIMIZER_USE_PENDING_STATISTICS = true
I 8268775 11.1.0.7.4, 11.2.0.1.2, 11.2.0.1.BP07, 11.2.0.2, 12.1.0.1 High US enqueue contention during a login storm or session failover
- 8364676 11.1.0.7.2, 11.2.0.1 row cache lock waits from background space preallocation
III 7529174 10.2.0.5, 11.1.0.7.3, 11.2.0.1 Deadlock / hang between SMON and foreground process
- 7416901 11.1.0.7.1, 11.2.0.1 Deadlock between QC and PQ slaves when CELL_PARTITION_LARGE_EXTENTS = ALWAYS
II 7313166 10.2.0.5, 11.2.0.1 Startup hang with self deadlock on dc_rollback_segments
II 6870994 10.2.0.5, 11.1.0.7.3, 11.2.0.1 High US enqueue / rowcache lock while trying to online a NEW undo segment
III 6027068 10.2.0.5, 11.2.0.1 Contention on ORA_TQ_BASE$ sequence
- 5756769 10.2.0.4.1, 10.2.0.5, 11.1.0.7, 11.2.0.1 Deadlock between Create MVIEW and DML
III 6143420 10.2.0.5, 11.1.0.6 Deadlock involving "ROW CACHE LOCK" on dc_users AND "CURSOR: PIN S WAIT ON X"
III 6004916 10.2.0.5, 11.1.0.6 Hang involving row cache enqueues in RAC (ORA-4021)
II 5883112 10.2.0.4, 11.1.0.6 False deadlock in RAC
II 5138741 10.2.0.4, 11.1.0.6 High waits on 'row cache lock' when using materialized views on RAC
II 4604972 11.1.0.6 Deadlock on dc_users by Concurrent Grant/Revoke
- 4579381 10.1.0.5, 10.2.0.2, 11.1.0.6 Deadlock on DC_USERS in RAC (ORA-4020)
- 4446011 9.2.0.8, 10.1.0.5, 10.2.0.2, 11.1.0.6 Hang with row cache lock deadlock from concurrent ALTER USER / TRUNCATE
II 4390868 10.1.0.5, 10.2.0.3, 11.1.0.6 Contention on DC_SEGMENTS due to small cache size on SYS.AUDSES$
- 4313246 9.2.0.8, 10.1.0.5, 10.2.0.2, 11.1.0.6 PLSQL execution can hold dc_users row cache lock leading to hang / deadlocks
I 4153150 9.2.0.8, 10.2.0.2, 11.1.0.6 Deadlock on dc_rollback_segments from concurrent parallel load and undo segment creation
I 6051177 10.2.0.4.1, 10.2.0.5 Hang / deadlock between coalesce and DBMS_STATS.gather_table_stats
- 5983020 10.2.0.4 MMON deadlock with user session executing ALTER USER
- 4275733 9.2.0.8, 10.1.0.5, 10.2.0.1 Deadlock between library cache lock and row cache lock from concurrent rename partition
I 5641198 10.2.0.1 Some waits may be longer than needed ("row cache lock") in RAC
- 4137000 10.1.0.5, 10.2.0.1 Concurrent SPLIT PARTITION can deadlock / hang
- 3627263 9.2.0.6, 10.1.0.4, 10.2.0.1 Deadlock / hang during RAC instance startup
II 3424721 9.2.0.6, 10.1.0.3, 10.2.0.1 Hang/deadlock from ALTER INDEX REBUILD on partition with concurrent SQL
- 2615271 9.2.0.6, 10.1.0.2 Deadlock from concurrent GRANT and logon
  • '*' indicates that an alert exists for that issue.
  • '+' indicates a particularly notable issue / bug.
  • See Note:1944526.1 for details of other symbols used

Related:

Document:1628089.1 AWR Report Interpretation Checklist for Diagnosing Database Performance Issues
Document:1359094.1 How to Use AWR Reports to Diagnose Database Performance Issues
Document:1274511.1 General SQL_TRACE / 10046 trace Gathering Examples


Troubleshooting: "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! " (文件 ID 278316.1)

故障排除:"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! " (文件 ID 2016422.1)

文件內容

用途
排錯步驟
  什麼是 row cache enqueue 鎖(Row Cache Enqueue Lock)?
  "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 警告資訊是什麼意思?
  "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 可能的原因
  SGA收縮(shrink)/調整大小的操作(resize)
  row cache enqueue 型別
  DC_TABLESPACES
  DC_SEQUENCES
  DC_USERS
  DC_OBJECT_IDS
  DC_SEGMENTS
  DC_ROLLBACK_SEGMENTS
  DC_TABLE_SCNS
  DC_AWR_CONTROL
  我可以收集哪些資訊,以確定原因?
  Systemstate dump
  AWR,ADDM 和 ASH 報告
  如何分析收集到的診斷資訊?
  Systemstate dump
  示例1:
  示例2:
  AWR 報告
  10g 以前的版本可能存在的問題
  其他問題疑難解答
參考


適用於:

Oracle Database - Personal Edition - 版本 8.0.6.0 和更高版本
Oracle Database - Enterprise Edition - 版本 8.0.6.0 和更高版本
Oracle Database - Standard Edition - 版本 8.0.6.0 和更高版本
本文件所含資訊適用於所有平臺

用途

本文件的目的是幫助排查原因"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! "

排錯步驟

什麼是 row cache enqueue 鎖(Row Cache Enqueue Lock)?

行快取(Row Cache)或資料字典快取(Data Dictionary Cache)是儲存資料字典資訊的共享池的記憶體區域。row cache 儲存資料時並不是以資料塊的形式,而是以行的形式。row cache enqueue 鎖是在資料字典行的鎖。此 enqueue 是關於特定資料字典物件的。這就是所謂的 enqueue 型別,可以在檢視 V$rowcache 中找到。

對於每個版本 row cache 型別的列表,請參閱:

Document 468334.1  How to Match a Row Cache Object Child Latch to its Row Cache

"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 警告資訊是什麼意思?

當我們試圖獲得 row cache 鎖,這種等待事件將被使用。
當 row cache 衝突發生時,如果不能在一個預定的時間週期內得到 enqueue,將在 USER_DUMP_DEST 或 background_dump_dest 目錄下生成一個跟蹤檔案,這取決於是使用者還是後臺程式建立的跟蹤檔案。alert.log 通常會相應的更新警告訊息和跟蹤檔案的位置。
資料庫檢測到核心資源被持有太久並通知管理員,從而讓這種情況可以得到解決。這也可能伴隨著資料庫掛起或變慢。
alert.log 的訊息和生成的跟蹤檔案趨向於包含訊息:

> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<<

如果不能立即獲取 rowcache entry 鎖,那麼進入一個迴圈,先釋放 row cache 物件閂鎖,等待(等待上述等待事件),重新獲得閂鎖,然後再次嘗試獲取 rowcache 鎖。在單例項模式,會重複 1000次直到程式報錯“WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK”。在 RAC 環境會一直重複,直到不能獲得例項鎖或者被中斷。

Systemstate dump 可以提供一些有用的資訊診斷爭用的原因。

請注意:The "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 當達到閾值會引發這個訊息,因此,如果未達到閾值它不會被引發。這意味著,不太嚴重的問題,即使具有相同的原因,也可以不輸出該訊息。


"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 可能的原因

SGA收縮(shrink)/調整大小的操作(resize)

如果 SGA 動態地改變大小,需要持有各種 latches 來避免其它程式同時操作,直到操作完成。如果調整大小需要一段時間,或者是經常發生,你會看到"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!"的發生。定位這種情況的方法是,有很多'SGA: allocation forcing component growth'等待事件,或 AWR 的 TOP 列表有類似等待,以及阻塞等待"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 的會話在等待'SGA: allocation forcing component growth'(或類似)。有一些可用的程式碼修復,請參閱:

Document 7189722.8 Bug 7189722 - Frequent grow/shrink SGA resize operations 
Document 9267837.8 Bug 9267837 - Auto-SGA policy may see larger resizes than needed

row cache enqueue 型別

對於每一個 enqueue 型別,都有對應的一些操作會需要獲取這類 enqueue。佇列的型別,可能給出由於操作可能導致的問題的指示。一些常見的原因如下:

DC_TABLESPACES

最可能的原因是新 extent 的分配。如果 extent 大小設定過小,那麼應用程式可能會不斷地要求新的 extent,這可能導致爭用。你有很小的 extent 尺寸,正在迅速增長的物件嗎?(通過查詢具有大量 extents 的物件可以定位它們)。檢查 insert/update 活動的 trace,查詢那些就有很多 extents 的物件。

DC_SEQUENCES

檢查應用程式用到的 sequence 的 cache 的大小:

Document 853652.1 RAC and Sequences
Document 395314.1 RAC Hangs due to small cache size on SYS.AUDSES$ - fixed in 10.2.0.3
Document 6027068.8 Bug 6027068 - Contention on ORA_TQ_BASE sequence -fixed in 10.2.0.5 and 11.2.0.1
DC_USERS

一個會話正在對一個使用者執行 GRANT,與此同時此使用者正在登入到資料庫中,此時可能會發生死鎖或導致"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 。

Document 4604972.8 Bug 4604972 - Deadlock on dc_users by Concurrent Grant/Revoke - fixed in 11.1.0.6
Document 6143420.8 Bug 6143420 - Deadlock involving "ROW CACHE LOCK" on dc_users AND "CURSOR: PIN S WAIT ON X"- fixed in 10.2.0.5 and 11.1.0.6  DC_OBJECTS
Document 12772404.8 Bug 12772404 - Significant "row cache objects" latch contention when using VPD
DC_OBJECT_IDS
Document 11693365.8 Bug 11693365 - Concurrent Drop table and Select on Reference constraint table hangs(deadlock) - fixed in 12.1
DC_SEGMENTS

這很可能是 segment 的分配導致的。確定持有鎖的使用者正在做什麼並使用 errorstacks 進行診斷。

DC_ROLLBACK_SEGMENTS

這可能是由於 rollback 段的分配導致的。正如 dc_segments,確定誰持有鎖並收集 errorstack 來進行診斷。請記住,在多節點系統(RAC)上,持有者可能在另一節點上,因此需要所有節點的 systemstate。

DC_TABLE_SCNS
Document 5756769.8 Bug 5756769 - Deadlock between Create MVIEW and DML - fixed in 10.2.0.5 ,11.1.07 and 11.2.0.1
DC_AWR_CONTROL

此 enqueue 關係到 AWR(Automatic Workload Repository)的控制權。任何操縱 AWR 資料庫的操作將持有它. 要分析這個問題,需要查詢是那些程式阻塞了它們。

  RAC 相關的 Bugs

Document 6004916.8 Bug 6004916 - Hang involving row cache enqueues in RAC (ORA-4021) - fixed in 102.0.5 and 11.1.0.6
Document 8666117.8 Bug 8666117 - High row cache latch contention in RAC - fixed in 11.2.0.2 and 12.1
Document 9866045.8 Bug 9866045 - Long wait on 'wait for master scn' in LCK causing long row cache lock waits - fixed in 12.1


我可以收集哪些資訊,以確定原因?

Systemstate dump

當問題發生時,錯誤會記入 alert.log,並自動產生一個 systemstate dump 檔案。

Wed Sep 21 13:39:19 2011
> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=37
System State dumped to trace file /oracle/diag/rdbms/..../.trc


AWR,ADDM 和 ASH 報告

收集兩份 AWR 報告,一份有問題時間段的,另一個是沒有問題時間段的,因為這些可以幫助我們理解問題發生時資料庫的狀況 AWR,ADDM,ASH 報告,可以相互取長補短,從而更完整地理解整個問題。

取決於 AWR 快照生成的時間間隔,收集最小時間間隔的報告。預設的快照是一個小時的時間間隔。

SQL>@$ORACLE_HOME/rdbms/admin/awrrpt.sql
SQL>@$ORACLE_HOME/rdbms/admin/addmrpt.sql
SQL>@$ORACLE_HOME/rdbms/admin/ashrpt.sql


鑑於分析 systemstate 是一件很複雜的事情,您可以建立一個服務請求,並上傳 alert.log,systemstate dump,以及問題發生前和問題發生時的 AWR 報告請 Oracle 技術支援來分析。

如何分析收集到的診斷資訊?

Systemstate dump

通常情況下, row cache enqueue 是一系列事件的一部分,阻塞了申請 row cache enqueue 的程式的程式很可能被另一個程式阻塞。Row cache enqueue 經常是問題的表象。

Systemstate dump 可以幫助查詢申請的是哪一個 row cache ,並可能有助於發現阻塞程式。

示例1:
Oracle process number: 77
Unix process pid: 10846, image: oracle@cpdb4532

*** 2011-05-13 08:08:58.775
*** SERVICE NAME:(ALFCMR_SERVICE) 2011-05-13 08:08:58.775
*** SESSION ID:(1076.796) 2011-05-13 08:08:58.775
> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<<
row cache enqueue: session: 0x1df57ade8, mode: N, request: S


trace 的標題顯示下列內容:

  • 等待 row cache enqueue 鎖的 Oracle 程式號(PID)(在這個案例,程式 77)。
  • 正在申請的 row cache enqueue 的模式(請求:S)。

因此,在上述例子中,程式 77 是在請求以共享模式獲得 row cache(請求:S)。

Systemstate 包含資料庫中每一個程式的狀態資訊,因此可以在 systemstate 中查詢這個程式:

PROCESS 77
----------------------------------------
.
.
----------------------------------------
SO: 0x1cdf11958, type: 50, owner: 0x17d198288, flag: INIT/-/-/0x00
row cache enqueue: count=1 session=0x1df57ade8 object=0x1dc9a5d30, request=S
savepoint=0x87b70d
row cache parent object: address=0x1dc9a5d30 cid=7(dc_users)
.
.


從上面我們看到,程式 77 請求共享模式獲得 row cache dc_users。
程式 77 處於等待狀態,意味著被其它程式阻塞,我們現在需要檢查 systemstate 判斷誰持有資源並阻塞了這個程式。

搜尋引用的物件,在這個例子中,是 object=0x1dc9a5d30。
這樣做完以後,我們發現,程式 218 正以獨佔模式持有這個物件:

PROCESS 218:
----------------------------------------
.
.
SO: 0x1cdf118f8, type: 50, owner: 0x1ceb0f178, flag: INIT/-/-/0x00
row cache enqueue: count=1 session=0x1da54cf68 object=0x1dc9a5d30, request=X
savepoint=0x11e
row cache parent object: address=0x1dc9a5d30 cid=7(dc_users)


獨佔模式的請求,將會一直阻塞共享模式的請求,直到該程式獨佔模式的請求被滿足並稍後釋放了這個資源。因此,這將阻止其他共享模式請求。請注意,這是請求獨佔而不是獨佔持有,所以這個請求也一定被阻塞了。檢視其他程式中,我們看到程式 164 在以共享模式(mode=s)持有這個物件。

PROCESS 164:
----------------------------------------
.
.
O/S info: user: u1m, term: , ospid: 1234, machine: cpc44711
program:
last wait for 'SQL*Net message from client' blocking sess=0x(nil) seq=36289 wait_time=6943 seconds since wait started=2539
driver id=54435000, #bytes=1, =0
.
.
SO: 0x1cdf11418, type: 50, owner: 0x1ccc26120, flag: INIT/-/-/0x00
row cache enqueue: count=2 session=0x1df578318 object=0x1dc9a5d30, mode=S
savepoint=0xb1bd8e
row cache parent object: address=0x1dc9a5d30 cid=7(dc_users)
hash=fc968070 typ=11 transaction=(nil) flags=00000002
own=0x1dc9a5e00[0x1cdf11448,0x1cdf11448] wat=0x1dc9a5e10[0x1cdf11928,0x17d5192e0] mode=S


因此,程式 164 以共享模式持有 row cache enqueue(mode= S),從而防止了程式 218 以獨佔模式獲得該 row cache enqueue。此外,我們看到,程式 164 在 ON CPU(systemstate 顯示最後一個等待是'SQL*Net message from client',而不是等待'SQL*Net message from client')。為了進一步診斷,技術支援需要檢查堆疊呼叫,以確定為什麼這個程式在 ON CPU 並持有該佇列這麼久(從開始已經等待了2539秒)。

示例2:


在這個例子中,程式 18(MMON)等待以共享模式獲得型別為 dc_awr_control 的 row cache。

Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /opt/oracle10/product/10.2.0
System name: SunOS
Node name: saecopt51
Release: 5.10
Version: Generic_144488-04
Machine: sun4v
Instance name: PORT_V16
Redo thread mounted by this instance: 1
Oracle process number: 18
Unix process pid: 6196, image: oracle@sae (MMON)
.
.

PROCESS 18:
----------------------------------------
.
.
last wait for 'ksdxexeotherwait' wait_time=0.000013 sec, seconds since wait started=6
.
.
SO: 39bf1f0e8, type: 50, owner: 3980783a0, flag: INIT/-/-/0x00
row cache enqueue: count=1 session=3be37ea80 object=39a79f090, request=S
savepoint=0x41f0ae
row cache parent object: address=39a79f090 cid=22(dc_awr_control)
hash=6f60197e typ=9 transaction=3bc39f560 flags=0000002a
own=39a79f160[39bf1f178,39bf1f178] wat=39a79f170[39bf1f118,39bf1f118] mode=X
.
.


物件(object=39a79f090)的 row cache lock 被程式 269 以獨佔模式(mode=x)持有。程式在等待'SGA: allocation forcing component growth'。

PROCESS 269:
----------------------------------------
.
.

waiting for 'SGA: allocation forcing component growth' wait_time=0, seconds since wait started=3

.
.
SO: 39bf1f148, type: 50, owner: 3bc39f560, flag: INIT/-/-/0x00
row cache enqueue: count=1 session=3be1b7c98 object=39a79f090, mode=X
savepoint=0x41efe8
row cache parent object: address=39a79f090 cid=22(dc_awr_control)
hash=6f60197e typ=9 transaction=3bc39f560 flags=0000002a
own=39a79f160[39bf1f178,39bf1f178] wat=39a79f170[39bf1f118,39bf1f118] mode=X
.
.


因此根本原因就是 SGA 的大小調整,等待 row cache 是次要結果。
我們使用該期間的 AWR 報告看相關資訊:

AWR 報告

Top 5 Timed Events                                         Avg %Total               
~~~~~~~~~~~~~~~~~~                                        wait   Call               
Event                                 Waits    Time (s)   (ms)   Time Wait Class    
------------------------------ ------------ ----------- ------ ------ ----------    
SGA: allocation forcing compon   42,067,317      38,469      1    7.6      Other    
CPU time                                          2,796           0.6               
db file sequential read             132,906         929      7    0.2   User I/O    
latch free                        4,282,858         704      0    0.1      Other    
log file switch (checkpoint in          904         560    620    0.1 Configurat    
          -------------------------------------------------------------       


我們可以清楚地看到,在 Top 5 等待事件中,整個系統中有針對此事件的一個顯著等待;並且'SGA: allocation forcing component growth' 是這一時間點的一個主要問題。"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" 訊息的根本原因就是記憶體調整活動,TOP 5 的等待事件甚至不顯示等待“row cache"症狀。

注意:
如果調整記憶體大小沒有那麼嚴重,有些時候沒有"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!"訊息。 - 主要是因為沒有達到在此之前提到的閾值。但是,您可能會看到其他等待事件。在以下文獻中概述了一個常見的例子:

Document 742599.1 High 'cursor: pin S wait on X' and/or 'library cache lock' Waits Generated by Frequent Shared Pool/Buffer Cache Resize Activity

對於頻繁的記憶體調整,有幾個潛在的可用修復,請參閱:

Document 7189722.8 Bug 7189722 - Frequent grow/shrink SGA resize operations
Document 9267837.8 Bug 9267837 - Auto-SGA policy may see larger resizes than needed

10g 以前的版本可能存在的問題

10g 之前的版本,檢測 row cache 級別死鎖的方法有限。為了儘量減少發生死鎖的可能性,可能的解決方法:

  • 設定 TIMED_STATISTICS=FALSE
  • 設定 _row_cache_cursors=20 以上(預設值 10)
  • 不要做任何 tracing
Document 30802.1 Init.ora Parameter "ROW_CACHE_CURSORS" Reference Note

其他問題疑難解答

對於其他效能問題的故障排除,請參閱:

Document 1377446.1 Troubleshooting Performance Issues



Bug 12772404 - Significant "row cache objects" latch contention when using VPD (Doc ID 12772404.8)

參考

BUG:11693365 - GETTING ERROR 'WAITED TOOL LONG FOR ROW CACHE ENQUEUE LOCK'


BUG:5756769 - ROW CACHE DEADLOCK "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!"
NOTE:11693365.8 - Bug 11693365 - Concurrent Drop table and Select on Reference constraint table hangs (deadlock)
NOTE:1377446.1 - * Troubleshooting Performance Issues
NOTE:30802.1 - Init.ora Parameter "ROW_CACHE_CURSORS" Reference Note
NOTE:395314.1 - RAC Hangs due to small cache size on SYS.AUDSES$
NOTE:4604972.8 - Bug 4604972 - Deadlock on dc_users by Concurrent Grant/Revoke
NOTE:468334.1 - How to Match a Row Cache Object Child Latch to its Row Cache
NOTE:5756769.8 - Bug 5756769 - Deadlock between Create MVIEW and DML
NOTE:6004916.8 - Bug 6004916 - Hang involving row cache enqueues in RAC (ORA-4021)
NOTE:6027068.8 - Bug 6027068 - Contention on ORA_TQ_BASE$ sequence
NOTE:6143420.8 - Bug 6143420 - Deadlock involving "ROW CACHE LOCK" on dc_users AND "CURSOR: PIN S WAIT ON X"
NOTE:742599.1 - High 'Cursor: Pin S Wait On X', 'Library Cache Lock' And "Latch: Shared Pool" Waits due to Shared Pool/Buffer Cache Resize Activity
NOTE:853652.1 - RAC and Sequences
NOTE:8666117.8 - Bug 8666117 - High row cache latch contention in RAC
NOTE:9866045.8 - Bug 9866045 - Long wait on 'wait for master scn' in LCK causing long row cache lock waits



BUG:8666117 - LCK0 PROCESS STUCK AT WAITING FOR "LATCH: ROW CACHE OBJECTS"




一次awr報告分析(密碼錯誤引發sql執行時間過長) 
預備知識:
1)row cache lock事件  
--記憶體share pool分為library cache,dictionary cache;而row cache lock物件分佈在dictionary cache中,是由於對於字典緩衝的訪問造成的;  
--這類屬於latch類的資源競爭,相當耗CPU,如果併發量大的話,容易down機
如果發現這個等待十分高,一般來說可能由於2種原因,一是共享池太小了,需要增加共享池,另外一種情況是SQL分析過於頻繁,對於共享池的併發訪問量過大。對於任何一種情況,絕大多數情況下加大共享池會有助於降低該等待,不過加大共享池的時候也要注意,並不一定所有的情況下增加共享池都會有明顯的效果。特別是對於第二種情況,精確的分析十分重要。另外進一步分析,弄清楚哪些ROW CACHE的等待最為嚴重,有助於解決問題。
row cache lock事件的調優基於每一個佇列鎖型別的行為,常見的佇列鎖型別有: 
-------DC_SEQUENCES:在使用序列的時候將發生該行緩衝佇列鎖。調優方式是檢查序列是否指定了緩衝選項並確定這個緩衝值可以承受預期的併發insert操作。
Check for appropriate caching of sequences for the application requirements.
-------DC_USED_EXTENTS和DC_FREE_EXTENTS:該行緩衝佇列鎖可能在空間管理碰到表空間分裂或者沒有足夠區大小時發生。調優方法是檢查表空間是否分裂了、區大小是否太小或者表空間是人工管理。 
-------DC_TABLESPACES:該行緩衝佇列鎖會在分配新區是發生。如果區大小設定得過小,程式將經常申請新區,這將導致衝突。調優方法是快速地增加區的數量。
Probably the most likely cause is the allocation of new extents. If extent sizes are set low then the application may constantly be requesting new extents and causing contention. Do you have objects with small extent sizes that are rapidly growing? (You may be able to spot these by looking for objects with large numbers of extents). Check the trace for insert/update activity, check the objects inserted into for number of extents. 
-------DC_OBJECTS:該行緩衝佇列鎖會在重編譯物件的時候發生。當物件編譯時將申請一個排他鎖阻塞其他行為。通過檢查非法物件和依賴關係來調優。 
-------DC_SEGMENTS:該行緩衝佇列鎖會在段分配的時候發生,觀察持有這個佇列鎖的會話在做什麼。
This is likely to be down to segment allocation. Identify what the session holding the enqueue is doing and use errorstacks to diagnose. 
-------DC_USERS:Deadlock and resulting “WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!” can occur if a session issues a GRANT to a user, and that user is in the process of logging on to the database.
-------DB_ROLLBACK_SEGMENTS:This is due to rollback segment allocation. Just like dc_segments,identify what is holding the enqueue and also generate errorstacks. Remember that on a multi-node system (RAC) the holder may be on another node and so multiple systemstates from each node will be required.
-------DC_AWR_CONTROL:This enqueue is related to control of the Automatic Workload Repository. As such any operation manipulating the repository may hold this so look for processes blocking these. 
2)相關檢視欄位說明:
ROW CACHE LOCK基礎說明
    P1 – Cache Id
    P2 – Mode Held
    P3 – Mode Requested
mode 和REQUEST的取值:
    KQRMNULL 0 null mode – not locked
    KQRMS 3 share mode
    KQRMX 5 exclusive mode
    KQRMFAIL 10 fail to acquire instance lock
3)SQL查詢
-----查詢row cache lock等待
    select *  from v$session_wait where wait_class = 'row cache lock'; 
    查出p1的值
-----查詢rowcache 名稱
    根據p1的值來查詢
    select * from v$rowcache where cache# = &p1;
其他:
select event,p1 from v$session a where a.username is not null and a.status='ACTIVE';
4)dba_hist_active_sess_history檢視
dba_hist_active_sess_history檢視記錄了記憶體中活動會話的歷史資訊,而動態效能檢視V$ACTIVE_SESSION_HISTORY則記錄了當前活動會話資訊。 通過dba_hist_active_sess_history檢視關聯v$sqlarea和DBA_HIST_SNAPSHOT可以跟蹤到某個時間段內的sql。當然能跟蹤到的sql多少取決於v$sqlarea,畢竟只有還保留在v$sqlarea中的sql才能跟蹤到。

AWR報告分析
資料庫出現狀況是8月30號下午3點到4點,於是拉取了兩份AWR報告,一份是8月30號下午3點到4點,另一份是31號下午3點到4點,將兩份報告比對著看,不同的地方應該就是有問題的地方。
1)發現top 5的前臺事件中,排第一的不是DB CPU。預設正常情況下,DB CPU會排第一。

2)繼續看Wait Events Statistics ,發現排第一也不是DB CPU,而是connection management call elapsed time,看名字感覺和資料庫連線有關。

3)繼續比對,發現dc_users的Pct Miss比較大,超過了20%

4)檢視可疑的sql語句:

標紅的sql執行速度很慢,有些異常。
SQL> select to_char(sample_time, 'YY-MM-DD HH24:MI:SS') sample_time,
  2         instance_number,
  3         sql_id,
  4         P1,
  5         event,
  6         wait_class
  7    from dba_hist_active_sess_history
  8   where sample_time between
  9         to_date('16-08-29 13:00:00', 'YY-MM-DD HH24:MI:SS') and
10         to_date('16-08-29 13:30:00', 'YY-MM-DD HH24:MI:SS')
11     and sql_id in ('bvtu633rnwrwv',
12                    '4a6uhr508t0p6',
13                    'fpm2zazkfqhy6',
14                    'bhrcaykh5tzsw',
15                    '9fxv1px768bd5')
16   order by 1
17  ;
SAMPLE_TIME       INSTANCE_NUMBER SQL_ID                P1 EVENT                 WAIT_CLASS           
----------------- --------------- ------------- ---------- --------------------- ---------------------
16-08-29 13:00:59               1 4a6uhr508t0p6         10 row cache lock        Concurrency
16-08-29 13:01:09               1 4a6uhr508t0p6         10 row cache lock        Concurrency
16-08-29 13:01:19               1 4a6uhr508t0p6         10 row cache lock        Concurrency
16-08-29 13:01:29               1 4a6uhr508t0p6          7 row cache lock        Concurrency
16-08-29 13:01:39               1 bvtu633rnwrwv         10 row cache lock        Concurrency
16-08-29 13:01:49               1 bvtu633rnwrwv         10 row cache lock        Concurrency
16-08-29 13:01:59               1 bvtu633rnwrwv         10 row cache lock        Concurrency
16-08-29 13:02:10               1 bvtu633rnwrwv          7 row cache lock        Concurrency
16-08-29 13:02:20               1 fpm2zazkfqhy6         10 row cache lock        Concurrency
16-08-29 13:02:30               1 fpm2zazkfqhy6         10 row cache lock        Concurrency
16-08-29 13:02:40               1 fpm2zazkfqhy6         10 row cache lock        Concurrency
16-08-29 13:02:41               2 bvtu633rnwrwv          7 row cache lock        Concurrency
16-08-29 13:02:50               1 fpm2zazkfqhy6          7 row cache lock        Concurrency
16-08-29 13:04:31               1 bhrcaykh5tzsw         10 row cache lock        Concurrency
16-08-29 13:04:41               1 bhrcaykh5tzsw         10 row cache lock        Concurrency
16-08-29 13:04:51               1 bhrcaykh5tzsw          7 row cache lock        Concurrency
16-08-29 13:06:32               1 9fxv1px768bd5         10 row cache lock        Concurrency
16-08-29 13:06:42               1 9fxv1px768bd5         10 row cache lock        Concurrency
18 rows selected
根據上面的sql,發現sql導致的等待事件是row cache lock。再根據P1=7或者10,查出等待事件是發生在具體哪個類別上
SQL> select cache#,parameter from v$rowcache where cache# in ('7','10');
    CACHE# PARAMETER
---------- --------------------------------
        10 dc_users
         7 dc_users
         7 dc_users
         7 dc_users
根據上面的sql,發現引數類別是dc_users,網上說dc_users是和使用者用錯誤密碼登陸有關:
In 11g there is an intentional delay between allowing failed logon attempts to retry. For some specific application types this can cause a problem as the row cache entry is locked for the duration of the delay . This can lead to excessive row cache lock waits for DC_USERS for specific users / schemas.

After 3 successive failures a sleep delay is introduced starting at 3 seconds and extending to 10 seconds max. During each delay the user X row cache lock is held in exclusive mode preventing any concurrent logon attempt as user X (and preventing any other operation which would need the row cache lock for user X).
驗證一下,看是否在發生row cache lock期間,有使用者登陸失敗。
SQL> select username,
  2         userhost,
  3         to_char(timestamp, 'YY-MM-DD HH24:MI:SS') timestamp,
  4         action_name
  5    from dba_audit_trail
  6   where action_name = 'LOGON'
  7     and priv_used is null
  8     and timestamp between
  9         to_date('16-08-29 13:00:00', 'YY-MM-DD HH24:MI:SS') and
10         to_date('16-08-29 13:30:00', 'YY-MM-DD HH24:MI:SS'); 
USERNAME                       USERHOST              TIMESTAMP         ACTION_NAME
------------------------------ --------------------- ----------------- ----------------------------
MAPP_PLATFORM                  IDC-APP-02            16-08-29 13:03:51 LOGON
MAPP_PLATFORM                  IDC-APP-01            16-08-29 13:03:41 LOGON
MAPP_PLATFORM                  IDC-APP-02            16-08-29 13:03:31 LOGON
MAPP_PLATFORM                  IDC-APP-01            16-08-29 13:03:21 LOGON
MAPP_PLATFORM                  IDC-APP-02            16-08-29 13:03:11 LOGON
MAPP_PLATFORM                  IDC-APP-01            16-08-29 13:03:01 LOGON
MAPP_PLATFORM                  IDC-APP-02            16-08-29 13:02:51 LOGON
MAPP_PLATFORM                  IDC-APP-01            16-08-29 13:02:41 LOGON
MAPP_PLATFORM                  IDC-APP-02            16-08-29 13:02:31 LOGON
。。。。
發現使用者MAPP_PLATFORM從8.14到8.31,一直以錯誤密碼嘗試登陸資料庫,驗證了以錯誤密碼登陸會導致row cache lock同時sql執行慢的現象發生。
 進一步查詢登入審計資訊:
select * from
(
select os_username,userhost,terminal,username,count(*) failures
    from dba_audit_trail
   where returncode = 1017 and timestamp between to_date ('2016-8-29 11:30:00','yyyy-mm-dd hh24:mi:ss') and to_date ('2016-8-29 13:30:00','yyyy-mm-dd hh24:mi:ss')
   group by os_username,userhost,username,terminal
   order by 5 desc );
returncode = 1017 這裡的1017就是oracle內部定義的錯誤返回程式碼值。

實際上不管使用者提供的密碼是否正確,Oracle都會為新的connection分配一個shadow process,該服務程式為了進一步驗證使用者資訊都不得不獲取少量資源;如果以此為契機,即便在無法正常登入的情況下依然能在短期內造成例項僵死。








About Me

...............................................................................................................................

● 本文作者:小麥苗,只專注於資料庫的技術,更注重技術的運用

● 本文在itpub(http://blog.itpub.net/26736162)、部落格園(http://www.cnblogs.com/lhrbest)和個人微信公眾號(xiaomaimiaolhr)上有同步更新

● 本文itpub地址:http://blog.itpub.net/26736162/abstract/1/

● 本文部落格園地址:http://www.cnblogs.com/lhrbest

● 本文pdf版及小麥苗雲盤地址:http://blog.itpub.net/26736162/viewspace-1624453/

● 資料庫筆試面試題庫及解答:http://blog.itpub.net/26736162/viewspace-2134706/

● QQ群:230161599     微信群:私聊

● 聯絡我請加QQ好友(646634621),註明新增緣由

● 於 2017-05-09 09:00 ~ 2017-05-30 22:00 在魔都完成

● 文章內容來源於小麥苗的學習筆記,部分整理自網路,若有侵權或不當之處還請諒解

● 版權所有,歡迎分享本文,轉載請保留出處

...............................................................................................................................

拿起手機使用微信客戶端掃描下邊的左邊圖片來關注小麥苗的微信公眾號:xiaomaimiaolhr,掃描右邊的二維碼加入小麥苗的QQ群,學習最實用的資料庫技術。

等待事件之Row Cache Lock
DBA筆試面試講解
歡迎與我聯絡

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/26736162/viewspace-2139754/,如需轉載,請註明出處,否則將追究法律責任。

相關文章