select for update語句造成ORA-00060 deadlock死鎖問題分析
測試環境出現問題
今天 ,收到專案組通知說發生了死鎖,讓我查一下死鎖出現的原因 。
首先
,登入資料庫,檢視trace日誌所在路徑
-
SYS @ cams > show parameter dump ;
-
NAME TYPE VALUE
-
------------------------------------ ----------- ------------------------------
-
background_core_dump string partial
-
background_dump_dest string / u01 / app / oracle / diag / rdbms / cams / cams / trace
-
core_dump_dest string / u01 / app / oracle / diag / rdbms / cams / cams / cdump
-
max_dump_file_size string unlimited
-
shadow_core_dump string PARTIAL
-
user_dump_dest string / u01 / app / oracle / diag / rdbms / cams / cams / trace
檢視alert 日誌,搜尋 00060
-
[ oracle@db trace ] $ cd / u01 / app / oracle / diag / rdbms / cams / cams / trace /
-
[ oracle@db trace ] $ vi alert_cams . log
找到 報錯trace日誌如下:
-
ORA-00060 : Deadlock detected . More info in file / u01 / app / oracle / diag / rdbms / cams / cams / trace / cams_ora_12850 . trc .
-
Tue Sep 12 13 : 28 : 31 2017
-
ORA-00060 : Deadlock detected . More info in file / u01 / app / oracle / diag / rdbms / cams / cams / trace / cams_ora_12854 . trc .
-
Tue Sep 12 13 : 28 : 36 2017
-
ORA-00060 : Deadlock detected . More info in file / u01 / app / oracle / diag / rdbms / cams / cams / trace / cams_ora_12822 . trc .
-
Tue Sep 12 13 : 31 : 12 2017
開啟 其中一個 trace檔案,部分內容如下 :
-
Trace file / u01 / app / oracle / diag / rdbms / cams / cams / trace / cams_ora_12850 . trc
-
Oracle Database 11g Enterprise Edition Release 11 . 2 . . 4 . 0 - 64bit Production
-
With the Partitioning , OLAP , Data Mining and Real Application Testing options
-
ORACLE_HOME = / u01 / app / oracle / product / 11 . 2 . / db_1
-
System name : Linux
-
Node name : db
-
Release : 2 . 6 . 32-431 . el6 . x86_64
-
Version : # 1 SMP Sun Nov 10 22 : 19 : 54 EST 2013
-
Machine : x86_64
-
VM name : VMWare Version : 6
-
Instance name : cams
-
Redo thread mounted by this instance : 1
-
Oracle process number : 293
-
Unix process pid : 12850 , image : oracle@db
-
* * * 2017-09-12 13 : 28 : 28 . 401
-
* * * SESSION ID : ( 1139 . 47765 ) 2017-09-12 13 : 28 : 28 . 401
-
* * * CLIENT ID : ( ) 2017-09-12 13 : 28 : 28 . 401
-
* * * SERVICE NAME : ( SYS$USERS ) 2017-09-12 13 : 28 : 28 . 401
-
* * * MODULE NAME : ( JDBC Thin Client ) 2017-09-12 13 : 28 : 28 . 401
-
* * * ACTION NAME : ( ) 2017-09-12 13 : 28 : 28 . 401
-
-
-
* * * 2017-09-12 13 : 28 : 28 . 401
-
DEADLOCK DETECTED ( ORA-00060 )
-
-
[ Transaction Deadlock ]
-
這裡指出死鎖不是Oracle自身的錯誤,是使用者的設計造成的問題
-
The following deadlock is not an ORACLE error . It is a
-
deadlock due to user error in the design of an application
-
or from issuing incorrect ad-hoc SQL . The following
-
information may aid in determining the deadlock :
-
-
這個死鎖圖表明兩個session各持有一份資源,等待對方持有的資源
-
Deadlock graph :
-
---------Blocker ( s ) - - - - - - - - ---------Waiter ( s ) - - - - - - - - -
-
Resource Name process session holds waits process session holds waits
-
TX-00160018-00000adc 293 1139 X 384 65 X
-
TX-0004000f-000064d6 384 65 X 293 1139 X
-
-
session 1139 : DID 0001-0125-000074A5 session 65 : DID 0001-0180-000037EC
-
session 65 : DID 0001-0180-000037EC session 1139 : DID 0001-0125-000074A5
-
-
Rows waited on :
-
Session 1139 : obj - rowid = 00021C7B - AAAhx7AAHAAARMrAAB
-
( dictionary objn - 138363 , file - 7 , block - 70443 , slot - 1 )
-
Session 65 : obj - rowid = 00021C7B - AAAhx7AAHAAARMtAAD
-
( dictionary objn - 138363 , file - 7 , block - 70445 , slot - 3 )
-
-
----- Information for the OTHER waiting sessions -----
-
Session 65 :
-
sid : 65 ser : 27869 audsid : 3026187 user : 111 / CAMS_CORE
-
flags : ( 0x45 ) USR / - flags_idl : ( 0x1 ) BSY / - / - / - / - / -
-
flags2 : ( 0x40009 ) - / - / INC
-
pid : 384 O / S info : user : oracle , term : UNKNOWN , ospid : 12854
-
image : oracle@db
-
client details :
-
O / S info : user : cams , term : unknown , ospid : 1234
-
machine : yy program : JDBC Thin Client
-
application name : JDBC Thin Client , hash value = 2546894660
-
current SQL :
-
導致死鎖的SQL語句1
-
select
-
-
SEQ_NO , ACO_AC , AC_DTE , CCY , BAL , FRZ_AMT ,
-
CRE_TLR , CRE_DTE ,
-
CRE_BR , CRE_BK , UPD_TLR , UPD_DTE , UPD_BR , UPD_BK , TSL_AMT , VPS_BAL
-
-
FROM CD_ACBALANCE
-
WHERE ACO_AC = : 1
-
FOR UPDATE
-
-
----- End of information for the OTHER waiting sessions -----
-
-
Information for THIS session :
-
-
----- Current SQL Statement for this session ( sql_id = cfy88pmyts0fn ) -----
-
導致死鎖的SQL語句2
-
select
-
-
SEQ_NO , ACO_AC , AC_DTE , CCY , BAL , FRZ_AMT ,
-
CRE_TLR , CRE_DTE ,
-
CRE_BR , CRE_BK , UPD_TLR , UPD_DTE , UPD_BR , UPD_BK , TSL_AMT , VPS_BAL
-
-
FROM CD_ACBALANCE
-
WHERE ACO_AC = : 1
-
FOR UPDATE
-
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
這裡可以明確的是 select for update語句造成了死鎖。但是考慮到環境為測試環境,沒開啟歸檔,不能透過logminer挖掘故障期間的SQL事物,而且開發人員已經透過kill session的方式處理了死鎖,所以需要在別的環境中重現該問題,並進行分析。
重現死鎖問題
首先連上資料庫,啟用HR使用者
-
[ oracle@prod ~ ] $ sqlplus / as sysdba
-
SQL * Plus : Release 11 . 2 . . 4 . 0 Production on Wed Sep 13 03 : 14 : 17 2017
-
Copyright ( c ) 1982 , 2013 , Oracle . All rights reserved .
-
Connected to an idle instance .
-
SYS@PROD > startup ;
-
ORACLE instance started .
-
Total System Global Area 1202556928 bytes
-
Fixed Size 2252704 bytes
-
Variable Size 771752032 bytes
-
Database Buffers 419430400 bytes
-
Redo Buffers 9121792 bytes
-
Database mounted .
-
Database opened .
-
SYS@PROD > conn hr / hr
-
ERROR :
-
ORA-28000 : the account is locked
-
Warning : You are no longer connected to ORACLE .
-
@ > conn / as sysdba
-
Connected .
-
SYS@PROD > alter user hr identified by hr account unlock ;
-
User altered .
-
SYS@PROD > conn hr / hr
-
Connected .
選中COUNTRIES表作為測試物件
-
HR@PROD > desc countries ;
-
Name Null ? Type
-
----------------------------------------- -------- ----------------------------
-
COUNTRY_ID NOT NULL CHAR ( 2 )
-
COUNTRY_NAME VARCHAR2 ( 40 )
-
REGION_ID NUMBER
-
HR@PROD > select count ( * ) from countries ;
-
COUNT ( * )
-
----------
-
25
-
HR@PROD > select count ( distinct ( COUNTRY_ID ) ) from countries ;
-
COUNT ( DISTINCT ( COUNTRY_ID ) )
-
---------------------------
-
25
經過驗證,COUNTRIES表中的COUNTRY_ID欄位是唯一的。
用COUNTRIES表新建一個test表,在test表中做測試
-
HR@PROD > create table test as select * from countries ;
-
Table created .
-
HR@PROD > desc test
-
Name Null ? Type
-
----------------------------------------- -------- ----------------------------
-
COUNTRY_ID NOT NULL CHAR ( 2 )
-
COUNTRY_NAME VARCHAR2 ( 40 )
-
REGION_ID NUMBER
-
HR@PROD > select country_id , country_name , region_id from countries ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
AR Argentina 2
-
AU Australia 3
-
BE Belgium 1
-
BR Brazil 2
-
CA Canada 2
-
CH Switzerland 1
-
CN China 3
-
DE Germany 1
-
DK Denmark 1
-
EG Egypt 4
-
FR France 1
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
IL Israel 4
-
IN India 3
-
IT Italy 1
-
JP Japan 3
-
KW Kuwait 4
-
ML Malaysia 3
-
MX Mexico 2
-
NG Nigeria 4
-
NL Netherlands 1
-
SG Singapore 3
-
UK United Kingdom 1
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
US United States of America 2
-
ZM Zambia 4
-
ZW Zimbabwe 4
-
25 rows selected .
情形1:
兩個select for update語句想要同時鎖定一條語句。
Session1:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
AU Australia 3
Sessions2:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
這時,session2會一直處於鎖等待狀態,而不會出現死鎖。當把第一個事物commit或者rollback之後,第二個事物會繼續執行。
Session1:
-
HR@PROD > rollback ;
-
Rollback complete .
Session2:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
AU Australia
顯然,兩個select for update語句想要同時鎖定一條語句,並不會出現死鎖,而會出現鎖等待的現象。
情形2:
第一個select for update事物鎖定A row後,想要再鎖定B row;第二個select for update事物鎖定B row後,想要鎖定A row。
Session1:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
AU Australia 3
Session 2:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'UK' for update ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
UK United Kingdom 1
如果Session1想要繼續持有Session2正在持有的row:
Session1:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'UK' for update ;
這時,會出現鎖等待的現場,和上一場景類似。如果Session2也想持有Session1正在持有的row:
Session2:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
這時,Session2還處於等待狀態,但是Session1出現死鎖:
Session1:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'UK' for update ;
-
select country_id , country_name , region_id from test where country_id = 'UK' for update
-
*
-
ERROR at line 1 :
-
ORA-00060 : deadlock detected while waiting for resource
這時,死鎖的現象已經重現,而且,場景2從頭到尾只使用了一個select for update語句,只是換了引數而已。
定位死鎖語句
透過檢查資料庫表,能夠檢查出是哪一條語句被死鎖,產生死鎖的機器是哪一臺。
1)用dba使用者執行以下語句
-
select c . owner , c . object_name , c . object_type , b . sid ,
-
b . serial# , b . lockwait , b . status , b . osuser , b . machine , b . process , b . program
-
from v$locked_object a ,
-
v$ session b ,
-
dba_objects c
-
where b . sid = a . session_id
-
and a . object_id = c . object_id ;
如果有輸出的結果,則說明有死鎖,且能看到死鎖的機器是哪一臺。欄位說明:
OWNER:死鎖語句所用的資料庫使用者,這裡是HR使用者。
OBJECT_NAME:產生死鎖的物件,這裡是TEST表。
OBJECT_TYPE:產生死鎖的物件型別,這裡是TABLE。
SID:SESSION標識,常用於連線 其它 列 。
SERIAL#: SID有可能會重複,當兩個session的SID重複時,SERIAL#用來區別session
LOCKWAIT: 可以透過這個欄位查詢出當前正在等待的鎖的相關資訊 ,如果有內容表示被死鎖或者有鎖等待事件。
STATUS: 用來判斷session狀態。 ACTIVE :正執行 。 INACTIVE :等待操作。 KILLED :被標註為殺死。
OSUSER:客戶端作業系統使用者名稱。
MACHINE:客戶端作業系統的機器名。
PROCESS:客戶端程式的ID。
PROGRAM:客戶端執行的應用程式。
2)用dba使用者執行以下語句,可以檢視到被死鎖的語句。
-
select sql_text from v$sql where ( address , hash_value ) in
-
( select sql_address , sql_hash_value from v$ session where sid in
-
( select session_id from v$locked_object ) ) ;
或者
-
select sql_text from v$sql where ( address , hash_value ) in
-
( select sql_address , sql_hash_value from v$ session where lockwait is not null ) ;
這裡查出來出現死鎖的語句和之前的測試結果一致。
3) 分析trace日誌檔案
死鎖發生時,可以再trace日誌檔案中找到如下語句:
-
ORA-00060 : Deadlock detected . More info in file / u01 / app / oracle / diag / rdbms / prod / PROD / trace / PROD_ora_4852 . trc .
開啟trace檔案,可以找到死鎖的語句:
-
----- Information for the OTHER waiting sessions -----
-
Session 31 :
-
sid : 31 ser : 25 audsid : 110280 user : 84 / HR
-
flags : ( 0x45 ) USR / - flags_idl : ( 0x1 ) BSY / - / - / - / - / -
-
flags2 : ( 0x40009 ) - / - / INC
-
pid : 23 O / S info : user : oracle , term : UNKNOWN , ospid : 4853
-
image : oracle@prod ( TNS V1-V3 )
-
client details :
-
O / S info : user : oracle , term : pts / 1 , ospid : 3149
-
machine : prod program : sqlplus@prod ( TNS V1-V3 )
-
application name : SQL * Plus , hash value = 3669949024
-
current SQL :
-
select country_id , country_name , region_id from test where country_id = 'AU' for update
-
-
----- End of information for the OTHER waiting sessions -----
-
-
Information for THIS session :
-
-
----- Current SQL Statement for this session ( sql_id = 6n5kjs2twrwwq ) -----
-
select country_id , country_name , region_id from test where country_id = 'UK' for update
-
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
解決死鎖問題
方法一(殺程式)
1)查詢死鎖的程式:
-
select sid , serial# , username , command , lockwait , osuser from v$ session where lockwait is not null ;
2)kill掉這個死鎖的程式:
-
alter system kill session 'sid, serial#' ;
這裡執行語句為:alter system kill session '29,69';
原先造成死鎖的程式被killed。
3) 如果還不能解決,使用殺系統程式的方式處理:
這裡為了測試,再次重現了死鎖,並使用殺程式方式進行處理。
-
select p . spid from v$ session s , v$process p where s . sid = XXX and s . paddr = p . addr ;
其中,XXX使用第一步查出來的SID引數替代
這裡查出來的程式號為:4257
-
[ oracle@prod ~ ] $ ps -ef | grep 4257
-
oracle 4257 3149 0 07 : 08 ? 00 : 00 : 00 oraclePROD ( DESCRIPTION = ( LOCAL = YES ) ( ADDRESS = ( PROTOCOL = beq ) ) )
-
oracle 4321 4000 0 07 : 21 pts / 4 00 : 00 : 00 grep 4257
-
[ oracle@prod ~ ] $ kill -9 4257
-
[ oracle@prod ~ ] $ ps -ef | grep 4257
-
oracle 4327 4000 0 07 : 21 pts / 4 00 : 00 : 00 grep 4257
殺完程式之後,造成死鎖的程式被殺死
查不到死鎖程式
方法二(重啟庫)
-
SYS@PROD > shutdown immediate ;
-
Database closed .
-
Database dismounted .
-
ORACLE instance shut down .
-
SYS@PROD > startup ;
-
ORACLE instance started .
-
Total System Global Area 1202556928 bytes
-
Fixed Size 2252704 bytes
-
Variable Size 771752032 bytes
-
Database Buffers 419430400 bytes
-
Redo Buffers 9121792 bytes
-
Database mounted .
-
Database opened .
檢視死鎖程式:
方法三(commit || rollback)
Session1:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
AU Australia 3
Session2:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'UK' for update ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
UK United Kingdom 1
Session1:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'UK' for update ;
-
Waiting……
Session2:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
-
Waiting……
Session1:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'UK' for update ;
-
select country_id , country_name , region_id from test where country_id = 'UK' for update
-
*
-
ERROR at line 1 :
-
ORA-00060 : deadlock detected while waiting for resource
這裡Session1出現死鎖,只要執行commit或者rollback就可以解除死鎖,只不過事務中第一個SQL執行成功,第二個SQL執行失敗。
Session1:
-
HR@PROD > commit ;
-
Commit complete .
Session2:
-
HR@PROD > select country_id , country_name , region_id from test where country_id = 'AU' for update ;
-
CO COUNTRY_NAME REGION_ID
-
-- ---------------------------------------- ----------
-
AU Australia 3
此時,死鎖狀態解除:
問題總結
最後,透過與開發人員交流,得出的結論是出現該問題的原因不是程式設計的問題,而是在開發環境中,有人透過debug模式連到伺服器上進行程式碼除錯,有人透過客戶端的形式訪問伺服器上的應用,當兩者同時除錯某一功能時(透過select for update的語句進行一張表中的資料訪問),在特殊的情況下,出現了死鎖的情況。
當然,在開發環境中比較容易出現這類情況,如果需要避免,就需要使用該開發環境的開發人員與測試人員做好溝通了。對於容易出現鎖表的功能,可以要求測試人員在測試環境中測試,除特殊情況,儘量減少在開發環境中測試的次數,以免鎖表影響開發測試進度。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/31394774/viewspace-2144941/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 記一次 MySQL select for update 死鎖問題MySql
- 【死鎖】ORA-00060: deadlock detected while waiting for resourceWhileAI
- 測試庫死鎖診斷DEADLOCK DETECTED ( ORA-00060 )
- SQL update select語句SQL
- Oracle死鎖一例(ORA-00060),鎖表導致的業務死鎖問題Oracle
- ORA-00060: Deadlock detected 模擬死鎖產生與解決方案
- 【Mysql】兩條select for update引起的死鎖MySql
- MySQL 死鎖問題分析MySql
- Sqlserver分析死鎖問題SQLServer
- 線上死鎖問題分析
- oracle deadlock死鎖trace file分析之一Oracle
- PostgreSQL模擬兩個update語句死鎖-利用掃描方法SQL
- 死鎖_DeadLock_示例
- 關於 SAP HANA 資料庫的死鎖問題(deadlock)資料庫
- MySQL鎖等待與死鎖問題分析MySql
- 如何透過sid查詢造成死鎖的sql語句,並殺死會話SQL會話
- 如何通過sid查詢造成死鎖的sql語句,並殺死會話SQL會話
- 【DEADLOCK】Oracle“死鎖”模擬Oracle
- oracle 檢視死鎖語句Oracle
- oracle deadlock死鎖trace file分析之一增補Oracle
- 由select for update鎖等待問題引發的深入思考
- SQLite語句(二):INSERT DELETE UPDATE SELECTSQLitedelete
- 死鎖(ora-00060)以及死鎖相關的知識點
- 這樣分析一個死鎖問題
- SQL 語句select top 變數問題SQL變數
- DBeaver如何生成select,update,delete,insert語句delete
- [Java]一個DeadLock(死鎖)的例子Java
- ORACLE UPDATE 語句語法與效能分析Oracle
- MySQL死鎖問題MySql
- MySQL 5.7 SELECT ... LOCK IN SHARE MODE|FOR UPDATE語句說明MySql
- sqlserver大數批次update時死鎖的問題及解決方案SQLServer
- MySQL死鎖系列-線上死鎖問題排查思路MySql
- rman會造成delete死鎖嗎delete
- 使用oracle 10704 event分析獲取鎖lock及死鎖deadlock系列九Oracle
- Java 死鎖(DeadLock)例項分析和預防[base jdk8]JavaJDK
- 基於10.2.0.1 rac deadlock死鎖 trace file分析_增補二
- RDSSQLServer死鎖(Deadlock)系列之三自動部署Profiler捕獲死鎖SQLServer
- 死鎖問題總結