oracle dead lock與效能

fufuh2o發表於2010-08-24


ORA-00060: Deadlock detected

 


dead lock一般 無法避免 a與b 兩個事務堵塞 有dead oracle 會自動回退 但只是回退其中一個事務與dead lock相關的語句
不會回退整個transaction,但oracle 認為dead lock不常出現因此在出現dead lock時會產生一個trace檔案
通常意義上一般是因為 foreign未加index才會出現,或者使用bitmap index高併發訪問(因為bitmap lock的不是一行 而是一組資料,所以oltp中bitmap不常用)

foreign 引發dead lock

 

兩種情況會lock子表
1.更新父表primary key,子表沒foreign沒index ,lock 整個子表
2.如果刪除了父表中的一行,整個子表也會被鎖住(由於外來鍵上沒有索引)。

drop table t2(a int);
create table t2(a int);
create table t2( a  int primary key);
create table t3( b  int references t2);

insert into t2 values ( 1 );
insert into t2 values ( 2 );
commit;
 

SQL> update t2 set a=3 where a=2;

1 row updated.

SQL> update t3 set b=2;(hang  se 1)
* 會話2更新表T3 此時hang住,因為t3已經被lock了 lmode 3 - row-X (SX)
SQL> update t3 set b=2;(hang se 2)
*此時會話1 更新T3 一樣被堵塞

此時就出現了資源的相互爭用

se2的session報出dead lock

ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

BITMAP(lock粒度大了)
點陣圖索引對於相異基數(distinct cardinality)低的資料最合適(另外點陣圖索引可以儲存null條)
drop table t;
create table t (a int ,b varchar2(10));

insert into t values(1,'a');
insert into t values(1,'b');
insert into t values(1,'c');
insert into t values(2,'d');
insert into t values(2,'e');
insert into t values(2,'f');

 


SQL> select * from t;

         A B
---------- ----------
         1 a
         1 b
         1 c
         2 d
         2 e
         2 f

 


create bitmap index bi_t on t(a);

 


SQL> update t set a=3 where a=1 and b='a';(se 1)


update t set a=4 where a=2 and b='e'; se 2

SQL> update t set a=4 where a=2 and b='f';(se 1 hang)

SQL> update t set a=3 where a=1 and b='b';(se 2 hang)

SQL> update t set a=4 where a=2 and b='f';
update t set a=4 where a=2 and b='f'
       *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

 

Mon Jun 28 21:06:24 2010
ORA-00060: Deadlock detected. More info in file /home/oracle/diag/rdbms/xh/xh/trace/xh_ora_3649.trc.

 

 

 

 

 

 


檢視哪些表 foreign未加index

column columns format a30 word_wrapped
column tablename format a15 word_wrapped
column constraint_name format a15 word_wrapped
 select table_name, constraint_name,
 cname1 || nvl2(cname2,','||cname2,null) ||
 nvl2(cname3,','||cname3,null) || nvl2(cname4,','||cname4,null) ||
 nvl2(cname5,','||cname5,null) || nvl2(cname6,','||cname6,null) ||
 nvl2(cname7,','||cname7,null) || nvl2(cname8,','||cname8,null)
 columns
 from ( select b.table_name,
 b.constraint_name,
 max(decode( position, 1, column_name, null )) cname1,
 max(decode( position, 2, column_name, null )) cname2,
max(decode( position, 3, column_name, null )) cname3,
 max(decode( position, 4, column_name, null )) cname4,
 max(decode( position, 5, column_name, null )) cname5,
 max(decode( position, 6, column_name, null )) cname6,
 max(decode( position, 7, column_name, null )) cname7,
 max(decode( position, 8, column_name, null )) cname8,
 count(*) col_cnt
 from (select substr(table_name,1,30) table_name,
 substr(constraint_name,1,30) constraint_name,
 substr(column_name,1,30) column_name,
 position
 from user_cons_columns ) a,
 user_constraints b
 where a.constraint_name = b.constraint_name
 and b.constraint_type = 'R'
 group by b.table_name, b.constraint_name
 ) cons
 where col_cnt > ALL
 ( select count(*)
 from user_ind_columns i
 where i.table_name = cons.table_name
 and i.column_name in (cname1, cname2, cname3, cname4,
 cname5, cname6, cname7, cname8 )
 and i.column_position <= cons.col_cnt
 group by i.index_name
 )
 /


那麼dead會影響效能問題嗎 
metalink
The Performance Impact of Deadlock Detection [ID 285270.1]
有詳細解釋

The sessions waiting to acquire locks are put in a queue and to know in which queue to go
they arranged using a hash function that uses the type of lock and some meaningful values
depending on the operation they want to do. These queues are called "Hash chains" because
they are a chain of requests that fall within the same hash value.

An example is the TM lock; it uses the table object id as part as its hash hey.

if the lock mode that I want to acquire is "compatible" with the current lock the object
has then I'm put in a list of sessions that holds the lock.

An example is again the TM lock; any DDL will request (X) lock in exclusive mode blocking
everyone requesting any lock but if I insert a lock I only request a (S) shared lock allowing
any other session that request a S mode to continue but blocking X modes.

A more detailed explanation of how locks work can be found in the concepts manual.

The Queues are memory structures in the SGA and latches protect them.
They are called "Enqueue Hash Chain" latches. And there is parent and many child latches.

To be put in a queue the session grabs an "enqueue resource" which the structure that has
 the details of the lock and attempts to acquire the latch that protects the Hash chain
associated with the mode and the rest of the special values. After it is acquired then
places the "enqueue resource" in the chain either as holder or waiter depending on
the circumstances and releases the latch.

A session waiting for a lock will have a row in v$session_wait view with the event 'enqueue'
and the rest of the columns will have the details of the kind of lock and lock mode being
requested.

 

Initiating Deadlock Detection
-----------------------------

A session launches deadlock detection (we'll call that session the "Requesting Session")
when it has already a lock and is being forced to wait when requesting another.

The Deadlock detention starts by acquiring the Parent "Enqueue Hash Chain" latch.
By doing that it automatically requests and holds all children "Enqueue Hash Chain" latches
and only releases the latches until the deadlock detection finishes.

One important fact to remember is that the database has many more types of locks besides
table locks and the deadlock detention includes them all, that's why it need to acquire
the parent latch and not only the child latch that belongs to the specific mode.

For example, the requesting session could be holding a TM lock on a table and requesting
the ST lock because it needs to allocate another extent to the table, and could be blocked
by SMON because is doing space consolidation. At that moment the requesting session
does not know that SMON is holding the ST, he only knows someone is holding it so the
requesting session needs to verify that someone is not waiting for the requesting
session's TM lock or that someone is not waiting for another someone that is waiting for
the requesting session's TM lock.

Another important fact is that deadlock detection is only initiated when the lock being
requested is an application's type of lock like TM, TX and UL. The rest of the lock types
are usually for internal or very specific operations which is not possible to encounter
a deadlock.

Some Performance Impact
=======================
Once the Parent and child latches are acquired no one can create or verify any lock until
the latches are released, and if the deadlock detection takes too long it can effectively
feel like an instance hang.

To try to speed things up, the session first check if the immediate owner of the lock
being requested (We'll call this session the "Holding Session") is also waiting on 'enqueue'.

If its not, then it is more likely that there is no immediate deadlock so the deadlock
detection finishes and the latches are released but will try again another detection
later since it is still possible that holding session can wait on one of the detecting
session's locks later or because we are not the next in line to acquire the lock and the
next one in line can be waiting on the requesting session's lock.


Climbing up the Tree
--------------------
If the Holding session is waiting on 'enqueue' then we start with him.
    
it could be that the session was killed or is orphaned in which case we need to wait for
the cleanup, so we wait up to 15 seconds for it to happen otherwise an internal error
ORA-600 [1151] is given.

The requesting session checks on what sessions are blocking the holding session and
if it finds that it is the requesting session or the holding session itself ( because
of an autonomous or recursive transaction) then a deadlock has been found.

For each of the blocking sessions it is necessary to do deadlock detection because
they may be waiting on a lock from another session.

The more locks each session has the more complex the scenarios get and the more recursive
calls are needed.
Also, the more sessions begin to wait on 'enqueue' the more deadlock detections needs
to be done.

More Performance Impact
=======================
Complex applications can easily create multiple locks that causes the suspicious of
deadlocks forcing the sessions to consume resources and blocking other session by
aquiring the latches until their request is satisfied.

Here is an example of how many times the latches can be requested in a normal database
in a 2hr period statspack and not one single deadlock was reported.

Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
enqueue conversions                          611,346           84.9          3.8
enqueue requests                           1,894,232          263.1         11.9

Latch Name                       Requests      Misses      Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
enqueue hash chains             4,402,265       2,053          39 2014/39/0/0/

To make matters worst, when finally a deadlock is found a trace file is generated
with the deadlock graph and by default dumps the process state of the session that
found the deadlock, while still holding the latches!
So the more complex the application, the more state objects has, the bigger
a process state dump will take, and the longer it will take to release the latches.

For instances where deadlocks are common and the cause is known and unavoidable there
is little point to generate the traces since they become just extra overhead, So to
alleviate this situation patch 2235386 introduced event 10027 that when set to level 1
does not do the process state dump, making the size of the trace smaller and the
release of the latches quicker. Setting the level to 1 does NOT eliminate the overhead
of the deadlock detection algorithm. It only helps to release the latch a bit more quicker
once the algorithm has finished scanning.

On the other hand, if more information is needed by support then it can be set at
Level 2 to generate a System State dump instead but the customer must be aware that
it will take even longer time to release the latches but most probably will be a lot
more useful information than with process state alone.

event="10027 trace name context forever,level 1"~~~~~~~~~~~~~~~~~~~~~~~~~~~解決,不dump

The patch is included starting in patchset 9.2.0.3 and in 10g.


上面詳細解釋了原理,
The Deadlock detention starts by acquiring the Parent "Enqueue Hash Chain" latch.
By doing that it automatically requests and holds all children "Enqueue Hash Chain" latches
and only releases the latches until the deadlock detection finishes.

這部分 意思就是當做死鎖檢測時 持有parenet enqueue hash chains

resouce上有3個chains 分別為持有著的請求,一條轉換(轉換lock狀態,比如剛獲得S 現在想獲得X 那麼就會放子convet佇列 優先順序別 比wait佇列高),等待佇列
若未出現dead lock正常情況下 這個struct結構中佇列很快就變化

 


Another important fact is that deadlock detection is only initiated when the lock being
requested is an application's type of lock like TM, TX and UL. The rest of the lock types
are usually for internal or very specific operations which is not possible to encounter
a deadlock.
從這可以看出 造成dead lock的 通常都是請求的TM ,TX ,UL
User enqueues:

TX (transaction enqueue): This enqueue type occurs if you want to change a data record but you cannot do this because a transaction is running in parallel (for example, because the transaction changed the same data record because a unique or primary constraint cannot be guaranteed or because a free ITL slot is no longer available in the block header). The TX enqueue occurs most frequently in the SAP environment. A session only ever holds one TX enqueue - even if several data records of one or several tables are changed.
TM (DML enqueue): This enqueue type occurs if a complete object has to be protected against changes (for example, as part of an index rebuild or a consistency check using VALIDATE STRUCTURE). Whenever a TX enqueue blocks table entries, a TM enqueue is also set so that parallel activities such as index rebuilds or consistency checks are not possible. ONE TM enqueue is set for each transaction and changed object.
UL (user-defined enqueue): A transaction has set an enqueue using DBMS_LOCK.REQUEST (this is not used in the standard SAP system).

 

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/12020513/viewspace-671719/,如需轉載,請註明出處,否則將追究法律責任。

相關文章