[bug]MySQL 雜湊掃描錯誤地應用日誌導致主從中斷

資料庫工作筆記發表於2023-10-31

來源:DBA爛筆頭

資料庫版本:5.7.21,遇見一個資料同步的bug,主庫呼叫一個函式操作dml主從異常中斷。

復現如下:




















































































































































































1、建表插入資料
CREATE TABLE `s1` (
`id` int(11) NOT NULL DEFAULT '0' COMMENT '',
`max` int(11) NOT NULL COMMENT ''
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;

insert into s1 values(1,999999);

2、建立函式
DELIMITER $$
CREATE DEFINER=`root`@`%` FUNCTION `nextTrxSerno`() RETURNS varchar(30) CHARSET utf8
BEGIN
DECLARE v_id INT (11) ;
DECLARE v_max INT (11) ;
DECLARE EXIT HANDLER FOR SQLEXCEPTION RETURN '' ;


UPDATE
s1 s SET
s.id = s.id + 1 ;


SELECT
s.id,
s.max INTO v_id,
v_max
FROM
s1 s ;


IF v_id >= v_max - 10 THEN SET v_id = 1 ; END IF ;


UPDATE
s1 s SET
s.id = v_id + 1 ;


RETURN CONCAT(DATE_FORMAT(SYSDATE(), '%Y%m%d%H%i%S'), LPAD(v_id, 8, '0')) ;


END $$
DELIMITER ;

3、呼叫函式
select nextTrxSerno();

4、檢視結果:
主庫:
mysql> select * from s1;
+----+--------+
| id | max    |
+----+--------+
|  3 | 999999 |
+----+--------+
1 row in set (0.00 sec)


備庫:
mysql> select * from s1;
+----+--------+
| id | max    |
+----+--------+
|  1 | 999999 |
+----+--------+
1 row in set (0.00 sec)

備庫:mysql> show slave status\G;*************************** 1. row ***************************               Slave_IO_State: Waiting for master to send event                  Master_Host: 172.17.134.50                  Master_User: repl                  Master_Port: 3305                Connect_Retry: 60              Master_Log_File: mysql_bin.000305          Read_Master_Log_Pos: 8857               Relay_Log_File: relay.000002                Relay_Log_Pos: 6992        Relay_Master_Log_File: mysql_bin.000305             Slave_IO_Running: Yes            Slave_SQL_Running: No              Replicate_Do_DB:          Replicate_Ignore_DB:           Replicate_Do_Table:       Replicate_Ignore_Table:      Replicate_Wild_Do_Table:  Replicate_Wild_Ignore_Table: sys.%,performance_schema.%,information_schema.%                   Last_Errno: 1032                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'b034d235-6773-11ee-afe7-00163e04944a:3984450' at master log mysql_bin.000305, end_log_pos 8826. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.                 Skip_Counter: 0          Exec_Master_Log_Pos: 8520              Relay_Log_Space: 7526              Until_Condition: None               Until_Log_File:                Until_Log_Pos: 0           Master_SSL_Allowed: No           Master_SSL_CA_File:           Master_SSL_CA_Path:              Master_SSL_Cert:            Master_SSL_Cipher:               Master_SSL_Key:        Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: No                Last_IO_Errno: 0                Last_IO_Error:               Last_SQL_Errno: 1032               Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'b034d235-6773-11ee-afe7-00163e04944a:3984450' at master log mysql_bin.000305, end_log_pos 8826. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.  Replicate_Ignore_Server_Ids:             Master_Server_Id: 16708                  Master_UUID: b034d235-6773-11ee-afe7-00163e04944a             Master_Info_File: mysql.slave_master_info                    SQL_Delay: 0          SQL_Remaining_Delay: NULL      Slave_SQL_Running_State:           Master_Retry_Count: 86400                  Master_Bind:      Last_IO_Error_Timestamp:     Last_SQL_Error_Timestamp: 231027 21:13:43               Master_SSL_Crl:           Master_SSL_Crlpath:           Retrieved_Gtid_Set: b034d235-6773-11ee-afe7-00163e04944a:3984428-3984450            Executed_Gtid_Set: b034d235-6773-11ee-afe7-00163e04944a:1-3984449                Auto_Position: 1         Replicate_Rewrite_DB:                 Channel_Name:           Master_TLS_Version: 1 row in set (0.00 sec)
ERROR: No query specified


經分析,這是一個 MySQL bug。 見:

[bug]MySQL 雜湊掃描錯誤地應用日誌導致主從中斷

報錯原理:

假設沒有主鍵和唯一索引的表t1有{r[1],r[2],r[3],...r[x],...,r[y],...},當crc32(r[x])=crc32(r[y])時滿足,對r[y]的更新將應用於從機上的r[x]。這會導致主機和從機的r[x]和r[y]值不一致然後在主伺服器上,執行刪除/更新r[x]或r[y]將導致HA_ERR_END_OF_FILE(1032)錯誤

(重要)規避方法: 沒有主鍵的表新增主鍵 (函式操作的表沒有主鍵)

避演示:(見截圖)

主庫:[bug]MySQL 雜湊掃描錯誤地應用日誌導致主從中斷

從庫:[bug]MySQL 雜湊掃描錯誤地應用日誌導致主從中斷

在MySQL 8.0.25中測試無該bug,尚不清楚該bug在官方什麼版本被修復的。如果有知道的也可以告知

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/70027826/viewspace-2991990/,如需轉載,請註明出處,否則將追究法律責任。

相關文章