GreatSQL 死鎖案例分析

GreatSQL發表於2024-04-19

1.背景概述

客戶業務發生死鎖的報錯,根據業務程式日誌及業務流程,發現造成死鎖的原因是:事務1 delete + insert ,事務2 delete + insert 2個事務交替執行導致的死鎖;由於GAP鎖阻塞了插入意向鎖,並且當delete的資料存在時死鎖不會發生,當delete的資料不存在時,會發生死鎖。

2.問題復現

本次測試基於 GreatSQL-8.0.32-24,隔離級別為 RR

2.1 建立測試表

greatsql> create database test;

greatsql> create table test(c1 int unique key, c2 int, c3 int, c4 int);

greatsql> insert into test values (1,1,1,1),(3,3,3,3),(5,5,5,5),(9,9,9,9);

greatsql> select * from test;
+------+------+------+------+
| c1  | c2  | c3  | c4  |
+------+------+------+------+
|   1 |   1 |   1 |   1 |
|   3 |   3 |   3 |   3 |
|   5 |   5 |   5 |   5 |
|   9 |   9 |   9 |   9 |
+------+------+------+------+
4 rows in set (0.01 sec)

2.2 事務執行順序

按以下事務執行順序,如果要刪除的資料存在,則不會發生死鎖;如果要刪除的資料不存,並且要刪除的資料在同一個GAP鎖的區間內則會發生死鎖;

時間 事務1 事務2
T1 BEGIN; BEGIN;
T2 delete from test where c1=?;
T3 delete from test where c1=?;
T4 insert into test value(?,?,?,?);
T5 insert into test value(?,?,?,?);

2.3 當delete的資料存在時

事務1:delete

greatsql> begin;
Query OK, 0 rows affected (0.00 sec)

greatsql> delete from test where c1=3;
Query OK, 1 row affected (0.00 sec)

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
|     59 | test      | test     | NULL     | TABLE   | IX       | GRANTED   | NULL        |
|     59 | test      | test     | c1       | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
3 rows in set (0.00 sec)

此時事務1給 3, 0x000000000201 這條資料加了 記錄鎖 X,REC_NOT_GAP

事務2:delete

greatsql> begin;
Query OK, 0 rows affected (0.00 sec)

greatsql> delete from test where c1=5;
Query OK, 1 row affected (0.00 sec)

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
|     57 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     57 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 5, 0x000000000202 |
|     57 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000202   |
|     59 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     59 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
6 rows in set (0.00 sec)

此時事務2給 5, 0x000000000202 這條資料加了 記錄鎖 X,REC_NOT_GAP

事務1:insert

greatsql> insert into test value(3,3,3,3);
Query OK, 1 row affected (0.00 sec)

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+

| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+

|     57 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |

|     57 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 5, 0x000000000202 |

|     57 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000202   |

|     59 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |

|     59 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |

|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |

|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000201 |

|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000202 |

|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000206 |

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
9 rows in set (0.00 sec)

此時事務1給 被delete刪除的資料 3, 0x000000000201 ,插入的資料 3, 0x000000000202,以及相鄰的下一條資料 5, 0x000000000202 加了 間隙鎖 S,GAP

事務2:insert

greatsql> insert into test value(5,5,5,5);
Query OK, 1 row affected (0.01 sec)

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE   | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
|     57 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     57 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 5, 0x000000000202 |
|     57 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000202   |
|     57 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000202 |
|     57 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 9, 0x000000000203 |
|     57 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000207 |
|     59 | test      | test     | NULL       | TABLE   | IX       | GRANTED   | NULL        |
|     59 | test      | test     | c1        | RECORD   | X,REC_NOT_GAP | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,REC_NOT_GAP | GRANTED   | 0x000000000201   |
|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000201 |
|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 5, 0x000000000202 |
|     59 | test      | test     | c1        | RECORD   | S,GAP     | GRANTED   | 3, 0x000000000206 |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
12 rows in set (0.00 sec)

此時事務2給 被delete刪除的資料 5, 0x000000000202 ,插入的資料 5, 0x000000000207,以及相鄰的下一條資料 9, 0x000000000203 加了 間隙鎖 S,GAP

由於GAP鎖之間是相互相容的,所以沒有發生鎖等待及死鎖,此時事務1,事務2都執行完成,可以正常提交。

2.4 當delete的資料不存在時

事務1:delete

greatsql> begin;
Query OK, 0 rows affected (0.01 sec)

greatsql> delete from test where c1=6;
Query OK, 0 rows affected (0.00 sec)

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
|     61 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        |
|     61 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
2 rows in set (0.00 sec)

此時事務1給 9, 0x000000000203 這條資料加了 間隙鎖 X,GAP

事務2:delete

greatsql> begin;
Query OK, 0 rows affected (0.00 sec)
 
greatsql> delete from test where c1=7;
Query OK, 0 rows affected (0.00 sec)

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
|     62 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        |
|     62 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 |
|     61 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        |
|     61 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
4 rows in set (0.00 sec)

此時事務2給 9, 0x000000000203 這條資料加了 間隙鎖 X,GAP 間隙鎖可以相互相容,因此沒有報錯

事務1:insert

greatsql> insert into test value(6,6,6,6);
---hang住,處於鎖等待

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE        | LOCK_STATUS | LOCK_DATA     |
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
|     62 | test      | test     | NULL    | TABLE   | IX           | GRANTED   | NULL        |
|     62 | test      | test     | c1     | RECORD   | X,GAP          | GRANTED   | 9, 0x000000000203 |
|     61 | test      | test     | NULL    | TABLE   | IX           | GRANTED   | NULL        |
|     61 | test      | test     | c1     | RECORD   | X,GAP          | GRANTED   | 9, 0x000000000203 |
|     61 | test      | test     | c1     | RECORD   | X,GAP,INSERT_INTENTION | WAITING   | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
5 rows in set (0.00 sec)

此時事務1,要申請給 9, 0x000000000203 這條資料加 間隙鎖,插入意向鎖 X,GAP,INSERT_INTENTION;由於事務2已經加了 間隙鎖 X,GAP 間隙鎖與插入意向鎖並不相容,因此事務1的insert處於鎖等待狀態

事務2:insert

greatsql> insert into test value(7,7,7,7);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

檢視鎖資訊:

greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;

+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME    | LOCK_TYPE | LOCK_MODE        | LOCK_STATUS | LOCK_DATA        |
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
|     61 | test      | test     | NULL       | TABLE   | IX           | GRANTED   | NULL          |
|     61 | test      | test     | c1        | RECORD   | X,GAP          | GRANTED   | 9, 0x000000000203    |
|     61 | test      | test     | c1        | RECORD   | X,GAP          | GRANTED   | 6, 0x000000000213    |
|     61 | test      | test     | GEN_CLUST_INDEX | RECORD   | X            | GRANTED   | supremum pseudo-record |
|     61 | test      | test     | GEN_CLUST_INDEX | RECORD   | X,GAP          | GRANTED   | 0x000000000213     |
|     61 | test      | test     | c1        | RECORD   | X,GAP,INSERT_INTENTION | GRANTED   | 9, 0x000000000203    |
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
6 rows in set (0.01 sec)

事務2回滾,只有事務1的加鎖資訊。

由於此時事務2,要申請給 9, 0x000000000203 這條資料加 間隙鎖,插入意向鎖由於事務1已經加了 間隙鎖 X,GAP 間隙鎖與插入意向鎖並不相容,因此事務2的insert處於鎖等待狀態,2個事務相互等待鎖導致死鎖,此時事務2回滾。

3.總結

此次死鎖的發生主要是GAP 鎖 和 插入意向鎖的衝突,建議讓業務修改一下邏輯,先判斷資料是否存在 select 一下,存在的話 delete 後在 insert ; 不存在的話直接 insert 不用delete了。


Enjoy GreatSQL 😃

關於 GreatSQL

GreatSQL是適用於金融級應用的國內自主開源資料庫,具備高效能、高可靠、高易用性、高安全等多個核心特性,可以作為MySQL或Percona Server的可選替換,用於線上生產環境,且完全免費併相容MySQL或Percona Server。

相關連結: GreatSQL社群 Gitee GitHub Bilibili

GreatSQL社群:

社群部落格有獎徵稿詳情:https://greatsql.cn/thread-100-1-1.html

image-20230105161905827

技術交流群:

微信:掃碼新增GreatSQL社群助手微信好友,傳送驗證資訊加群

image-20221030163217640

相關文章