問題:昨天線上出現報錯導致一個功能無法執行,查詢日誌發現是mysql的死鎖問題。
分析問題:其實解決問題最大的難點在於分析問題找到出現問題出現在哪裡,這個過程花費的時間和思考是最多的,而使用程式碼解決問題反而很快速。
(1)報錯日誌如下:
2021-10-09 18:59:04] local.INFO: RobotAuction-SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction (SQL: select id
, cash
, freeze
, area_id
from users
where users
.id
in (2634, 2662, 2672, 2673, 2675, 2685, 2808, 2811, 2818, 2834, 2869, 2886, 2926, 2961, 2962, 2981, 3066, 3080, 3124, 3131, 3135) and users
.deleted_at
is null for update)-669
從報錯日誌上去檢視msyql的死鎖日誌,再結合業務上的可能的操作進行分析
(2)死鎖日誌:
2021-10-09 18:59:04 0x150d81d79700
* (1) TRANSACTION:
TRANSACTION 385220, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 11 lock struct(s), heap size 1136, 19 row lock(s)
MySQL thread id 1376113, OS thread handle 23147762767616, query id 23440050 localhost 127.0.0.1 lpt Sending data
select `id`, `cash`, `freeze`, `area_id` from `users` where `users`.`id` in (2634, 2662, 2672, 2673, 2675, 2685, 2808, 2811, 2818, 2834, 2869, 2886, 2926, 2961, 2962, 2981, 3066, 3080, 3124, 3131, 3135) and `users`.`deleted_at` is null
for update
(1) WAITING FOR THIS LOCK TO BE GRANTED: 【1】持有id = 2675的鎖,等待id = 3124 的事務鎖
RECORD LOCKS space id 191 page no 14 n bits 120 index PRIMARY of table lpt
.users
trx id 385220 lock_mode X locks rec but not gap waiting
Record lock, heap no 49 PHYSICAL RECORD: n_fields 52; compact format; info bits 0
** 省略部分
* (2) TRANSACTION:
TRANSACTION 385219, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
7 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 6
MySQL thread id 1376112, OS thread handle 23147757147904, query id 23440106 localhost 127.0.0.1 lpt statistics
select `id`, `cash` from `users` where `users`.`id` = 2675 and `users`.`deleted_at` is null limit 1 for update
(2) HOLDS THE LOCK(S): 【2】持有id=3124的鎖
RECORD LOCKS space id 191 page no 14 n bits 120 index PRIMARY of table lpt
.users
trx id 385219 lock_mode X locks rec but not gap
Record lock, heap no 49 PHYSICAL RECORD: n_fields 52; compact format; info bits 0
** 省略部分
(2) WAITING FOR THIS LOCK TO BE GRANTED: 【3】等待 id = 2675的資料釋放鎖
RECORD LOCKS space id 191 page no 6 n bits 120 index PRIMARY of table lpt
.users
trx id 385219 lock_mode X locks rec but not gap waiting
Record lock, heap no 32 PHYSICAL RECORD: n_fields 52; compact format; info bits 0
** 省略部分
* WE ROLL BACK TRANSACTION (1) 【4】選擇開銷小的進行回滾,選擇了事務(1),對照前面的任務未執行,回滾了
問題所在:
1、使用的鎖都是排他鎖
2、這典型的加鎖順序不同造成的死鎖
3、分析過程【1】【2】【3】【4】
解析辦法
1、通過分析業務,調整業務上的順序來解決 ✔
2、通過表鎖來解決,卻會犧牲效能並且不能使用索引 ×