Oracle死鎖一例(ORA-00060),鎖表導致的業務死鎖問題
1、問題發現
檢查客戶資料庫的時候發現存在大量死鎖的情況
Thread 1 advanced to log sequence 257 (LGWR switch)
Current log# 16 seq# 257 mem# 0: /oradata/oracle/online_log/redo16_01.log
Current log# 16 seq# 257 mem# 1: /oradata/oracle/online_log/redo16_02.log
Tue Jul 03 10:14:53 2018
Archived Log entry 385 added for thread 1 sequence 256 ID 0x59dc8ffa dest 1:
Tue Jul 03 10:14:53 2018
LNS: Standby redo logfile selected for thread 1 sequence 257 for destination LOG_ARCHIVE_DEST_2
Tue Jul 03 10:19:39 2018
opiodr aborting process unknown ospid (23762) as a result of ORA-609
Tue Jul 03 10:51:18 2018
ORA-00060: Deadlock detected. More info in file /u01/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_25846.trc.
Tue Jul 03 10:54:01 2018
ORA-00060: Deadlock detected. More info in file /u01/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14067.trc.
Tue Jul 03 11:02:28 2018
ORA-00060: Deadlock detected. More info in file /u01/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_20781.trc.
Tue Jul 03 11:21:13 2018
Thread 1 cannot allocate new log, sequence 258
Private strand flush not complete
檢視trace檔案orcl_ora_25846.trc結果如下
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-026e0020-000001a5 147 4468 X 385 241 S
TM-0007fd6c-00000000 385 241 X 147 4468 SX
session 4468: DID 0001-0093-000001FEsession 241: DID 0001-0181-00000014
session 241: DID 0001-0181-00000014session 4468: DID 0001-0093-000001FE
Rows waited on:
Session 4468: obj - rowid = 0007FD6C - AAAAAAAAAAAAAAAAAA
(dictionary objn - 523628, file - 0, block - 0, slot - 0)
Session 241: no row
----- Information for the OTHER waiting sessions -----
Session 241:
sid: 241 ser: 425 audsid: 24705000 user: 160/FD14
flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 385 O/S info: user: oracle, term: UNKNOWN, ospid: 20781
image: oracle@dbserver1
client details:
O/S info: user: TL3050, term: TL3050-WZ, ospid: 5300:5348
machine: WORKGROUPTL3050-WZ program: CWV4.2.8.337_20131204.exe
application name: CWV4.2.8.337_20131204.exe, hash value=580982453
current SQL:
insert into pzd2018
(UNI_NO,ORD,STYPE,STYPE2,SNO,SYEAR,SMONTH,RMONTH,SDAY,SABSTRACT,OPERATOR,J_AMOUNT,D_AMOUNT,SUBJ,SUBJNAME,
OPP_SUBJ,SRC_CODE,ECO_CODE,SRC_PAYTYPE,SRC_BUTYPE,ECO_TYPE,ECO_WARRANT,PRJ_ORDER,PRJ_NAME,OPP_PRJ,CLR_ORDER,
UNIT_CODE,SPECCODE,CONTRACT_NO,CAR_NO,OLPAY_SNO,schedule_date,WB_TYPE,WB_JNUM,WB_DNUM,WB_FACT,NUM_TYPE,
NUM_JNUM,NUM_DNUM,NUM_PRICE,CAP_NO,CAP_ORD,JSFS_CODE,ZPH,BUSS_DATE,OTHER_UNIT,ACNT,BANKNO,ADDRESS1,ADDRESS2,
TNO,ACT_NO,BU_CODE,T_CODE,RESBU_CODE,RESBU_AMT,SPECCODE1,SPECCODE2,SPECCODE3,SPECCODE4,RES_S1,
RES_S2,RES_S3,RES_S4,ASSET_SUBJ,TAX_NO,SRC_NAME,ADDITION,UNI_PRJ_ORDER,Clr_Bu_Code,
Source_Type,Source,SrKey,SMark,Uni_Prj_Name,clrsno,input_name,check_name,attach_act,
attach_act_no,pz_attr,src_type,zj_type,ref_uni_no,charge_sno,charge_name,src_lkx,order_type,c
----- End of information for the OTHER waiting sessions -----
Information for THIS session:
----- Current SQL Statement for this session (sql_id=9ktt36bsngnyx) -----
insert into pz2018
(UNI_NO,STYPE,STYPE2,SNO,SYEAR,SMONTH,RMONTH,SDAY,INPUT_NAME,CHECK_NAME,COMP_NAME,COMP_NAME2,
ADDITION,CHILDNUM,J_AMOUNT,D_AMOUNT,SSTATE,REMARK,PZ_ATTR)
values(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,:17,:18,:19)
===================================================
2、問題分析
可以看出來241號會話持有一個TM鎖,在執行insert into pzd2018語句在等待S鎖
4468號會話持有一個TX鎖,在執行insert into pz2018語句,在等待SX鎖
通過與業務溝通與資料庫查詢發現了以下的鎖表操作,並和業務確定了屬於業務SQL
lock table pz2018 in exclusive mode
到這裡問題已經清楚了,整個邏輯是這樣的
241號會話將pz2018全表排他模式進行了鎖定,導致4468會話無法對pz2018表進行insert操作,原因是無法在表上獲取共享排它鎖即SX鎖,導致4468號會話進入等待模式
而4468號會話在等待前進行了insert into pzd2018操作,而241號會話在插入時存在唯一約束,導致241會話進行TX鎖等待,等待4468號session資料提交或者回滾
這樣一個環狀等待就形成了即死鎖
等待發生時會話的等待情況
SQL> select a.sample_time,
2 a.session_id,
3 a.session_serial#,
4 a.blocking_session bsession,
5 a.blocking_session_serial# bserial#,
6 a.event,
7 a.machine,
8 a.module,
9 a.sql_opname
10 from dba_hist_active_sess_history a
11 where a.session_id in (241, 4468, 6819, 10817)
12 and a.sample_time > to_date(`2018070310`, `yyyymmddhh24`)
13 and a.sample_time < to_date(`2018070312`, `yyyymmddhh24`)
14 and a.event is not null
15 order by a.sample_time
16 ;
SAMPLE_TIME SESSION_ID SESSION_SERIAL# BSESSION BSERIAL# EVENT MACHINE MODULE SQL_OPNAME
------------------------------ ---------- --------------- ---------- ---------- ------------------------------ ------------------------------ ------------------------------ ----------------------------------------------------------------
03-7月 -18 10.00.40.857 上午 241 425 6587 7875 enq: TM - contention WORKGROUPTL3050-WZ CWV4.2.8.337_20131204.exe LOCK TABLE
03-7月 -18 10.49.45.384 上午 10817 97 12929 7665 read by other session webserver JDBC Thin Client SELECT
03-7月 -18 10.51.16.143 上午 241 425 4468 2029 enq: TX - row lock contention WORKGROUPTL3050-WZ CWV4.2.8.337_20131204.exe INSERT
03-7月 -18 10.51.16.143 上午 4468 2029 241 425 enq: TM - contention ZDCWWANGH88208561 XCV5(新5.24).exe INSERT
03-7月 -18 10.52.46.903 上午 10817 121 null event dbserver1 SELECT
03-7月 -18 10.53.57.464 上午 6819 99 10817 133 enq: TX - row lock contention ZDCWWANGWD88981612 CWV4.exe INSERT
03-7月 -18 10.53.57.464 上午 10817 133 6819 99 enq: TM - contention ZDCWWANGH88208561 XCV5(新5.24).exe INSERT
03-7月 -18 10.54.07.554 上午 10817 133 db file parallel read ZDCWWANGH88208561 XCV5(新5.24).exe INSERT
8 rows selected
3、鎖等待的模擬,問題復現
---session1
SQL> lock table pz2018 in exclusive mode;
Table(s) Locked.
----session2
SQL> insert into pzd2018(OBJECT_ID,OBJECT_NAME) values(100000000,`PZD_Yong`);
1 row created.
SQL> insert into pz2018(OBJECT_ID,OBJECT_NAME) values(100000000,`PZ_Yong`);
----session2執行直接hang住無法完成,在等待TM鎖
SQL> select s.SID,s.BLOCKING_SESSION bsid,s.EVENT,s.MACHINE,s.MODULE,s.STATUS,s.STATE from v$session s where s.EVENT is not null and s.STATUS=`ACTIVE` and s.WAIT_CLASS<>`Idle`;
SID BSID EVENT MACHINE MODULE STATUS STATE
---------- ---------- ------------------------------ ------------------------------ ---------------------------------------- -------- -------------------
17 143 enq: TM - contention 172-16-8-110 SQL*Plus ACTIVE WAITING
----session1執行
SQL> insert into pzd2018(OBJECT_ID,OBJECT_NAME) values(100000000,`PZD_Yong`);
--session1 直接hang住,session2丟擲錯誤發現死鎖
SQL> insert into pz2018(OBJECT_ID,OBJECT_NAME) values(100000000,`PZ_Yong`);
insert into pz2018(OBJECT_ID,OBJECT_NAME) values(100000000,`PZ_Yong`)
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
----檢查資料庫等待
SQL> set linesize 1000 pagesize 5000
SQL> col EVENT for a30
SQL> col MACHINE for a30
SQL> col MODULE for a40
SQL> select s.SID,s.BLOCKING_SESSION bsid,s.EVENT,s.MACHINE,s.MODULE,s.STATUS,s.STATE from v$session s where s.EVENT is not null and s.STATUS=`ACTIVE` and s.WAIT_CLASS<>`Idle`;
SID BSID EVENT MACHINE MODULE STATUS STATE
---------- ---------- ------------------------------ ------------------------------ ---------------------------------------- -------- -------------------
143 17 enq: TX - row lock contention 172-16-8-110 SQL*Plus ACTIVE WAITING
檢視資料庫alert日誌發現
Thu Jul 05 11:40:40 2018
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/dtstack1/dtstack1/trace/dtstack1_ora_29840.trc.
檢視死鎖trace
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-0001554c-00000000 28 17 X 27 143 SX
TX-00040008-000002d9 27 143 X 28 17 S
session 17: DID 0001-001C-00000024 session 143: DID 0001-001B-00000155
session 143: DID 0001-001B-00000155 session 17: DID 0001-001C-00000024
Rows waited on:
Session 17: no row
Session 143: obj - rowid = 0001554C - AAAAAAAAAAAAAAAAAA
(dictionary objn - 87372, file - 0, block - 0, slot - 0)
----- Information for the OTHER waiting sessions -----
Session 143:
sid: 143 ser: 905 audsid: 610017 user: 85/DTYONG
flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 27 O/S info: user: oracle, term: UNKNOWN, ospid: 29846
image: oracle@172-16-8-110 (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/1, ospid: 29845
machine: 172-16-8-110 program: sqlplus@172-16-8-110 (TNS V1-V3)
application name: SQL*Plus, hash value=3669949024
current SQL:
insert into pz2018(OBJECT_ID,OBJECT_NAME) values(100000000,`PZ_Yong`)
----- End of information for the OTHER waiting sessions -----
Information for THIS session:
----- Current SQL Statement for this session (sql_id=7jbgf8fc4cac0) -----
insert into pzd2018(OBJECT_ID,OBJECT_NAME) values(100000000,`PZD_Yong`)
===================================================
PROCESS STATE
-------------
Process global information:
process: 0x914c4ed0, call: 0x910d2008, xact: 0x90390710, curses: 0x916aa020, usrses: 0x916aa020
in_exception_handler: no
----------------------------------------
SO: 0x914c4ed0, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x914c4ed0, name=process, file=ksu.h LINE:12721, pg=0
(process) Oracle pid:28, ser:23, calls cur/top: 0x910d2008/0x910d2008
flags : (0x0) -
flags2: (0x0), flags3: (0x10)
intr error: 0, call error: 0, sess error: 0, txn error 0
intr queue: empty
ksudlp FALSE at location: 0
(post info) last post received: 138 0 2
last post received-location: ksl2.h LINE:2374 ID:kslpsr
last process to post me: 0x914b3298 1 6
last post sent: 0 0 26
last post sent-location: ksa2.h LINE:285 ID:ksasnd
last process posted by me: 0x914b3298 1 6
(latch info) wait_event=0 bits=0x0
Process Group: DEFAULT, pseudo proc: 0x915515e8
O/S info: user: oracle, term: UNKNOWN, ospid: 29840
OSD pid info: Unix process pid: 29840, image: oracle@172-16-8-110 (TNS V1-V3)
----------------------------------------
SO: 0x916aa020, type: 4, owner: 0x914c4ed0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x914c4ed0, name=session, file=ksu.h LINE:12729, pg=0
(session) sid: 17 ser: 683 trans: 0x90390710, creator: 0x914c4ed0
flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
相關文章
- AdornerDecorator的CacheMode繫結和windows鎖屏導致TableControl鎖死問題Windows
- MySQL:死鎖一例MySql
- MySQL死鎖系列-線上死鎖問題排查思路MySql
- MySQL死鎖問題MySql
- 檢視oracle死鎖程式並結束死鎖Oracle
- MySQL鎖等待與死鎖問題分析MySql
- 解決Oracle死鎖問題步驟Oracle
- MySQL 死鎖問題分析MySql
- 作業系統(5) 死鎖的概念 死鎖產生的必要條件 死鎖的處理策略 預防死鎖 避免死鎖 死鎖的檢測和解除 銀行家演算法作業系統演算法
- Python | 淺談併發鎖與死鎖問題Python
- SpringBoot Seata 死鎖問題排查Spring Boot
- 死鎖
- MySQL死鎖案例一(回滾導致死鎖)MySql
- MySQL死鎖案例二(自增列導致死鎖)MySql
- 線上問題排查:記一次 Redis Cluster Pipeline 導致的死鎖問題Redis
- 什麼是死鎖?如何解決死鎖?
- 死鎖和可重入鎖
- MySQL 死鎖和鎖等待MySql
- 面試:什麼是死鎖,如何避免或解決死鎖;MySQL中的死鎖現象,MySQL死鎖如何解決面試MySql
- 線上併發事務死鎖問題排查
- 面試官:什麼是死鎖?怎麼排查死鎖?怎麼避免死鎖?面試
- 死鎖概述
- 鎖的使用與死鎖的避免
- 例項詳解 Java 死鎖與破解死鎖Java
- MySQL/InnoDB中,樂觀鎖、悲觀鎖、共享鎖、排它鎖、行鎖、表鎖、死鎖概念的理解MySql
- SQLServer的死鎖分析(1):頁鎖SQLServer
- SQL SERVER死鎖查詢,死鎖分析,解鎖,查詢佔用SQLServer
- MySQL:一個死鎖分析 (未分析出來的死鎖)MySql
- 併發技術5:死鎖問題
- [作業系統]死鎖作業系統
- 死鎖是什麼?如何預防和避免死鎖?
- Java 中的死鎖Java
- SQL Server 的死鎖SQLServer
- GCD 死鎖原因GC
- 死鎖案例二
- 死鎖案例三
- 併發:死鎖
- 遭遇ITL死鎖