MySQL案例01:Last_SQL_Errno: 1755 Cannot execute the current event group in the parallel mode

Rangle發表於2018-03-18

週五同事監控報警,有個MySQL從庫複製狀態異常,讓我幫忙排查下,經過排查發現是MySQL5.6並行複製的一個Bug所致,具體處理過程如下:

一、錯誤資訊

登入mysql從庫伺服器,檢查複製狀態

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: xx.xx.xx.xx
                  Master_User: repuser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.001689
          Read_Master_Log_Pos: 145325388
               Relay_Log_File: mysql2-relay-bin.004390
                Relay_Log_Pos: 147850306
        Relay_Master_Log_File: mysql-bin.001673
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1755
                   Last_Error: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 147850096
              Relay_Log_Space: 4440639819
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1755
               Last_SQL_Error: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master.
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 46
                  Master_UUID: f6d575b8-6a5b-11e5-8313-0050568d23c9
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 180314 01:01:17
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: f6d575b8-6a5b-11e5-8313-0050568d23c9:9393159-70288025
            Executed_Gtid_Set: 33a835bb-6a5a-11e5-8308-0050568d685b:1-146,
f6d575b8-6a5b-11e5-8313-0050568d23c9:1-69584557
                Auto_Position: 1
1 row in set (0.01 sec)

mysql錯誤日誌內容如下

018-03-14 01:01:16 26349 [Note] Event Scheduler: Loaded 0 events
2018-03-14 01:01:16 26349 [Note] /data/mysql/bin/mysqld: ready for connections.
Version: '5.6.16-log'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution
2018-03-14 01:01:17 26349 [ERROR] Slave SQL: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master. Error_code: 1755
2018-03-14 01:01:17 26349 [Warning] Slave: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master. Error_code: 1755
2018-03-14 01:01:17 26349 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.001673' position 147850096

二、錯誤原因

從錯誤日誌字面意思理解,出現這個錯誤是因為資料庫開啟了並行複製,一些event group在並行複製下,不支援從而導致複製失敗,主從狀態異常。

檢查資料庫版本:

mysql>  select version();
+------------+
| version()  |
+------------+
| 5.6.16-log |
+------------+
1 row in set (0.07 sec)

檢查資料庫並行複製資訊:

mysql> show variables like '%parall%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| slave_parallel_workers | 1     |
+------------------------+-------+
1 row in set (0.05 sec)

官方搜尋錯誤資訊,發現此錯誤是一bug,具體bug編號為Bug #71495、Bug #72537

https://bugs.mysql.com/bug.php?id=71495

https://bugs.mysql.com/bug.php?id=72537

三、解決方案

1、既然是bug,通過分析binlog內容,然後跳過的方式跳過事物

(非gtid模式,通過set sql_slave_skip_counter=1;跳過事務)

主庫分析binlog:

mysqlbinlog -vv --base64-output='decode-rows'   --start-position=147850096  --stop-position=147850096   mysql-bin.001674  >a.sql

SHOW BINLOG EVENTS in 'mysql-bin.001673' from 147850096  limit 10;
從庫跳過事物(gtid模式)

stop slave;

SET @@SESSION.GTID_NEXT= 'f6d575b8-6a5b-11e5-8313-0050568d23c9:9393159-70288025';

begin;

commit;

SET SESSION GTID_NEXT = AUTOMATIC;

start slave;

2、通過關閉並行複製來恢復(最佳方法)

stop slave;

set global slave_parallel_workers = 0 ;

start slave;

(  更改my.cnf中並行複製引數slave_parallel_workers = 0)

3、如果上述方法都不行,可以使用重新部署從庫的方式來恢復,恢復後關閉並行複製(5.6並行複製存在bug,不推薦使用)

如果使用mysqldump備份,最好指定引數--set-gtid-purged=off,否則需要充值gtid

主庫備份資料庫:

innobackupexp方式:innobackupex  --socket=/tmp/mysql.sock  /20180316/

從庫恢復資料庫:

innobackupex  --redo-only --apply-log /soft/2018-03-16_17-45-30
innobackupex  --copy-back /soft/2018-03-16_17-45-30

檢視事物應用binlog資訊:cat xtrabackup_info

binlog_pos = filename 'mysql-bin.001690', position 46997344, GTID of the last change 'f6d575b8-6a5b-11e5-8313-0050568d23c9:1-70316241'

innodb_from_lsn = 0
innodb_to_lsn = 84446557453

設定從庫gtid資訊

set global gtid_purged='f6d575b8-6a5b-11e5-8313-0050568d23c9:1-70316241';  

重新部署從庫:

change master to master_host='xx.xx.xx.xx',master_user='usernamer',master_password='password',MASTER_AUTO_POSITION=84446557453;

(上述使用了reset master,否則可以直接指定從MASTER_AUTO_POSITION=1開始)

相關文章