MySQL死鎖案例 – Learn. Write. Repeat.

banq發表於2019-04-26

無論你是DBA還是開發人員,你都會對死鎖感到不耐煩,一些死鎖需要幾天的時間來修復,它們很難重現,其中一些只能在生產prod機器上重現。在不知道發生了什麼情況下盲目修復並不罕見,你只能假設問題出在哪裡,然後在這裡新增更多詳細日誌,最後建立一個補丁並將其投入生產,希望獲得更多資訊,這最近發生在我身上。在這篇文章中,我將嘗試分享關於如何處理mysql中的資料庫死鎖的。

拉一個mysql docker映象並啟動它:

docker pull mysql
 
docker run --name local-mysql -e MYSQL_ROOT_PASSWORD=pass -d mysql:latest


讓我們建立一個結構並導致死鎖。首先,我們需要連線到我們的mysql例項。獲取容器ID

docker ps -a


然後連線:

docker exec -ti 94300e36a1d0 /bin/bash


接下來連線到mysql例項:

mysql -p
Enter password: pass


我們建立一個資料庫:

CREATE SCHEMA test;
USE test;
 
CREATE TABLE `child` (
  `id` int(11) NOT NULL,
  `child_name` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
 
CREATE TABLE `parent` (
  `id` int(11) NOT NULL,
  `parent_name` varchar(45) DEFAULT NULL,
  `child_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `parent_child_fk_idx` (`child_id`),
  CONSTRAINT `parent_child_fk` FOREIGN KEY (`child_id`) REFERENCES `child` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
 
INSERT INTO `test`.`child` (`id`, `child_name`) VALUES ('1', 'child1');
INSERT INTO `test`.`child` (`id`, `child_name`) VALUES ('2', 'child2');
INSERT INTO `test`.`child` (`id`, `child_name`) VALUES ('3', 'child3');
 
INSERT INTO `test`.`parent` (`id`, `parent_name`, `child_id`) VALUES ('1', 'parent1', '1');


現在我們已經插入了資料表結構和一些資料,我們就可以開始了。我們需要啟動兩個資料庫連線,以便我們可以產生死鎖。

第一個:

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
 
mysql> update child set child_name='child10' where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0


第二個:

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
 
mysql> insert into parent values(4,'parent2',1);


我們回到第一個並寫入:

mysql> delete from parent;
Query OK, 3 rows affected (0.00 sec)


即看到第二階段出現死鎖:

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction


好,讓我們一步一步地進行調查。首先我們得到innodb引擎的狀態。

SHOW ENGINE INNODB STATUS;


顯然我們對輸出的死鎖部分感興趣:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-04-23 12:16:31 0x7fddcc0e7700
*** (1) TRANSACTION:
TRANSACTION 2523, ACTIVE 7 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 109, OS thread handle 140590591276800, query id 374587 localhost root update
insert into parent values(4,'parent2',1)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 4 n bits 72 index PRIMARY of table `test`.`child` trx id 2523 lock mode S locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 0000000009d5; asc       ;;
 2: len 7; hex 010000012101ca; asc     !  ;;
 3: len 1; hex 61; asc a;;
 
*** (2) TRANSACTION:
TRANSACTION 2522, ACTIVE 9 sec fetching rows
mysql tables in use 1, locked 1
5 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 3
MySQL thread id 108, OS thread handle 140590588000000, query id 374588 localhost root updating
delete from parent
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 4 n bits 72 index PRIMARY of table `test`.`child` trx id 2522 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 0000000009d5; asc       ;;
 2: len 7; hex 010000012101ca; asc     !  ;;
 3: len 1; hex 61; asc a;;
 
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 13 page no 4 n bits 72 index PRIMARY of table `test`.`parent` trx id 2522 lock_mode X waiting
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex 80000004; asc     ;;
 1: len 6; hex 0000000009db; asc       ;;
 2: len 7; hex 81000000940110; asc        ;;
 3: len 1; hex 32; asc 2;;
 4: len 4; hex 80000001; asc     ;;
 
*** WE ROLL BACK TRANSACTION (1)


我們看到事務一正在等待鎖被授權,有問題的鎖是在子表的主鍵上,它是一個共享(S)鎖。

insert into parent values(4,'parent2',1)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 4 n bits 72 index PRIMARY of table `test`.`child` trx id 2523 lock mode S locks rec but not gap waiting



再看到下一條資訊:

delete from parent
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 4 n bits 72 index PRIMARY of table `test`.`child` trx id 2522 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 13 page no 4 n bits 72 index PRIMARY of table `test`.`parent` trx id 2522 lock_mode X waiting


第二段事務操作有兩個鎖。它在子表的主鍵上有一個獨佔(X)鎖,它等待父表的主鍵上的獨佔(X)鎖。這樣,第一段事務等待第二段事務釋放鎖,而第二段等待第一段釋放,相互等待,死鎖。

回顧我們的查詢,這非常有意義。首先我們這樣做的:

update child set child_name='child10' where id=1;


這意味著它將在子表記錄上建立一個獨佔(X)鎖。我們可以透過執行驗證

mysql> SELECT object_name, lock_type, lock_mode, lock_data, lock_status FROM performance_schema.data_locks;
+-------------+-----------+---------------+-----------+-------------+
| object_name | lock_type | lock_mode     | lock_data | lock_status |
+-------------+-----------+---------------+-----------+-------------+
| child       | TABLE     | IX            | NULL      | GRANTED     |
| child       | RECORD    | X,REC_NOT_GAP | 1         | GRANTED     |
+-------------+-----------+---------------+-----------+-------------+
2 rows in set (0.00 sec)


現在,當我們嘗試插入帶有child 1的新記錄:

insert into parent values(4,'parent2',1);


我們被先前的獨佔(X)鎖定阻止。由於外來鍵約束, Insert需要子表記錄上的共享(S)鎖:

mysql> SELECT object_name, lock_type, lock_mode, lock_data, lock_status FROM performance_schema.data_locks;
+-------------+-----------+---------------+-----------+-------------+
| object_name | lock_type | lock_mode     | lock_data | lock_status |
+-------------+-----------+---------------+-----------+-------------+
| parent      | TABLE     | IX            | NULL      | GRANTED     |
| child       | TABLE     | IS            | NULL      | GRANTED     |
| child       | RECORD    | S,REC_NOT_GAP | 1         | WAITING     |
| child       | TABLE     | IX            | NULL      | GRANTED     |
| child       | RECORD    | X,REC_NOT_GAP | 1         | GRANTED     |
+-------------+-----------+---------------+-----------+-------------+
5 rows in set (0.00 sec)


當我們這樣做時,這是正常的:

mysql> delete from parent;


等待一個排它(X)鎖的父表。但插入它還在等待共享(S)鎖。此時第二段事務被選為受害者(將被回滾),因此事務會話1獲勝。

mysql> SELECT object_name, lock_type, lock_mode, lock_data, lock_status FROM performance_schema.data_locks;
+-------------+-----------+---------------+------------------------+-------------+
| object_name | lock_type | lock_mode     | lock_data              | lock_status |
+-------------+-----------+---------------+------------------------+-------------+
| child       | TABLE     | IX            | NULL                   | GRANTED     |
| child       | RECORD    | X,REC_NOT_GAP | 1                      | GRANTED     |
| parent      | TABLE     | IX            | NULL                   | GRANTED     |
| parent      | RECORD    | X             | supremum pseudo-record | GRANTED     |
| parent      | RECORD    | X             | 1                      | GRANTED     |
| parent      | RECORD    | X             | 2                      | GRANTED     |
| parent      | RECORD    | X             | 3                      | GRANTED     |
+-------------+-----------+---------------+------------------------+-------------+
7 rows in set (0.00 sec)


所以,我們從這一切中學到了什麼。外來鍵和死鎖並不總是相互理解。插入/更新將透過其外來鍵使用共享鎖(S)鎖定子實體。這意味著只能讀取。
記錄鎖是對索引記錄的鎖定,即使定義了沒有索引的表,記錄鎖也始終鎖定索引記錄。對於這種情況,InnoDB建立一個隱藏的聚簇索引並使用此索引進行記錄鎖定。從MySQL 8.0.1開始,還有一些很好的功能,我將在下面介紹。
跳過鎖:

mysql> insert into parent select 4,'parent2',id from child where id=1 for update skip locked;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0


上面修飾符可用於從表中非確定性地讀取行,同時跳過鎖定的行。我們看到我們沒有等待鎖,並且父程式沒有鎖定獨佔(X)鎖。 這是第二段事務。

mysql> SELECT object_name, lock_type, lock_mode, lock_data, lock_status FROM performance_schema.data_locks;
+-------------+-----------+---------------+-----------+-------------+
| object_name | lock_type | lock_mode     | lock_data | lock_status |
+-------------+-----------+---------------+-----------+-------------+
| child       | TABLE     | IX            | NULL      | GRANTED     |
| child       | RECORD    | X,GAP         | 2         | GRANTED     |
| child       | TABLE     | IX            | NULL      | GRANTED     |
| child       | RECORD    | X,REC_NOT_GAP | 1         | GRANTED     |
+-------------+-----------+---------------+-----------+-------------+
4 rows in set (0.00 sec)


我們在第一段事務會話中進行刪除後得到:

mysql> delete from parent;                                                                        
Query OK, 3 rows affected (0.00 sec)
 
mysql> SELECT object_name, lock_type, lock_mode, lock_data, lock_status FROM performance_schema.data_locks;
+-------------+-----------+---------------+------------------------+-------------+
| object_name | lock_type | lock_mode     | lock_data              | lock_status |
+-------------+-----------+---------------+------------------------+-------------+
| child       | TABLE     | IX            | NULL                   | GRANTED     |
| child       | RECORD    | X,GAP         | 2                      | GRANTED     |
| child       | TABLE     | IX            | NULL                   | GRANTED     |
| child       | RECORD    | X,REC_NOT_GAP | 1                      | GRANTED     |
| parent      | TABLE     | IX            | NULL                   | GRANTED     |
| parent      | RECORD    | X             | supremum pseudo-record | GRANTED     |
| parent      | RECORD    | X             | 1                      | GRANTED     |
| parent      | RECORD    | X             | 2                      | GRANTED     |
| parent      | RECORD    | X             | 3                      | GRANTED     |
+-------------+-----------+---------------+------------------------+-------------+
9 rows in set (0.00 sec)


很簡約。

此外,如果我們不想等待整個50秒(innodb_lock_wait_timeout)釋放鎖定,我們現在可以使用NOWAIT修飾符,它將立即引發錯誤。這是第二段事務會話。

mysql> insert into parent select 4,'parent2',id from child where id=1 for update nowait;
ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set.
 
mysql> SELECT object_name, lock_type, lock_mode, lock_data, lock_status FROM performance_schema.data_locks;
+-------------+-----------+---------------+-----------+-------------+
| object_name | lock_type | lock_mode     | lock_data | lock_status |
+-------------+-----------+---------------+-----------+-------------+
| child       | TABLE     | IX            | NULL      | GRANTED     |
| child       | TABLE     | IX            | NULL      | GRANTED     |
| child       | RECORD    | X,REC_NOT_GAP | 1         | GRANTED     |
+-------------+-----------+---------------+-----------+-------------+
3 rows in set (0.00 sec)


在會話中刪除後我們得到了:

mysql> delete from parent;                                                                        
Query OK, 3 rows affected (0.00 sec)
 
mysql> SELECT object_name, lock_type, lock_mode, lock_data, lock_status FROM performance_schema.data_locks;
+-------------+-----------+---------------+------------------------+-------------+
| object_name | lock_type | lock_mode     | lock_data              | lock_status |
+-------------+-----------+---------------+------------------------+-------------+
| child       | TABLE     | IX            | NULL                   | GRANTED     |
| child       | TABLE     | IX            | NULL                   | GRANTED     |
| child       | RECORD    | X,REC_NOT_GAP | 1                      | GRANTED     |
| parent      | TABLE     | IX            | NULL                   | GRANTED     |
| parent      | RECORD    | X             | supremum pseudo-record | GRANTED     |
| parent      | RECORD    | X             | 1                      | GRANTED     |
| parent      | RECORD    | X             | 2                      | GRANTED     |
| parent      | RECORD    | X             | 3                      | GRANTED     |
+-------------+-----------+---------------+------------------------+-------------+
8 rows in set (0.00 sec)


再沒有死鎖!

相關文章