Resolving the MySQL database error server_errno=2013

Posted by junsansi on 2011-01-11

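The master in a MySQL replication setup lost power unexpectedly. After the restart the master ran normally, but the slaves in the same setup reported the following errors in their error logs: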

    110110 15:21:25 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
    110110 15:21:25 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'forummysql01-bin.002937' position 243387731
    110110 15:21:25 [Note] Slave: connected to master 'repl@192.168.1.31:3306',replication resumed in log 'forummysql01-bin.002937' at position 243387731
    110110 15:21:25 [ERROR] Error reading packet from server: Client requested master to start replication from impossible position ( server_errno=1236)
    110110 15:21:25 [ERROR] Got fatal error 1236: 'Client requested master to start replication from impossible position' from master when reading data from binary log
    110110 15:21:25 [Note] Slave I/O thread exiting, read up to log 'forummysql01-bin.002937', position 243387731
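
When the error log is long, the error numbers and the binlog coordinates the slave keeps requesting can be pulled out with a couple of regular expressions. The following is only a rough sketch: the name `summarize` and both patterns are my own assumptions, written against the message formats in the log above, and they may need adjusting for other MySQL versions.

```python
import re

# Sample lines copied from the slave error log shown above.
LOG = """\
110110 15:21:25 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
110110 15:21:25 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'forummysql01-bin.002937' position 243387731
110110 15:21:25 [ERROR] Error reading packet from server: Client requested master to start replication from impossible position ( server_errno=1236)
110110 15:21:25 [Note] Slave I/O thread exiting, read up to log 'forummysql01-bin.002937', position 243387731
"""

# Assumed patterns, matched against the message formats in this particular log.
ERRNO_RE = re.compile(r"server_errno=(\d+)")
COORD_RE = re.compile(r"log '([^']+)'(?:,| at)? position (\d+)")


def summarize(log_text):
    """Return (error numbers in order, set of (binlog file, position) pairs)."""
    errnos = [int(n) for n in ERRNO_RE.findall(log_text)]
    coords = {(name, int(pos)) for name, pos in COORD_RE.findall(log_text)}
    return errnos, coords


errnos, coords = summarize(LOG)
print(errnos)
print(coords)
```

Here every message points at the same file and position, which suggests the slave is retrying one fixed coordinate rather than drifting.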

Connect to the slave with the mysql command-line client and run show slave status to check the replication state:

    mysql> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State:
    Master_Host: 192.168.1.31
    Master_User: repl
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: forummysql01-bin.002937
    Read_Master_Log_Pos: 243387731
    Relay_Log_File: phpmysql02-relay-bin.33417576
    Relay_Log_Pos: 243387875
    Relay_Master_Log_File: forummysql01-bin.002937
    Slave_IO_Running: No
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB: mysql
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 243387731
    Relay_Log_Space: 243387875
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: NULL
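
Reading the vertical output by eye works for one slave, but with several slaves it can help to turn it into a dictionary and test the two thread flags. This is a minimal sketch under my own assumptions: `parse_vertical` and `io_thread_stopped` are hypothetical helper names, and the sample holds only a few of the fields shown above.

```python
# A few fields from the vertical (\G) output shown above.
SAMPLE = """\
Slave_IO_State:
Master_Log_File: forummysql01-bin.002937
Read_Master_Log_Pos: 243387731
Slave_IO_Running: No
Slave_SQL_Running: Yes
Exec_Master_Log_Pos: 243387731
Seconds_Behind_Master: NULL
"""


def parse_vertical(text):
    # Build a dict of field name -> value from "Name: value" lines.
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line or line.startswith("*"):
            continue  # skip the row separator, prompts and blank lines
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def io_thread_stopped(status):
    # True when only the I/O thread is down, as in the output above.
    return (status.get("Slave_IO_Running") == "No"
            and status.get("Slave_SQL_Running") == "Yes")


status = parse_vertical(SAMPLE)
print(io_thread_stopped(status), status["Read_Master_Log_Pos"])
```

In this output Read_Master_Log_Pos and Exec_Master_Log_Pos are equal, so the SQL thread appears to have applied everything the I/O thread had fetched before it stopped.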

The slave's I/O thread is not running, so the problem seems to be in receiving the log. Try starting that thread:

    mysql> start slave io_thread;
    Query OK, 0 rows affected (0.00 sec)

    mysql> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State:
    Master_Host: 192.168.1.31
    Master_User: repl
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: forummysql01-bin.002937
    Read_Master_Log_Pos: 243387731
    Relay_Log_File: phpmysql02-relay-bin.33417576
    Relay_Log_Pos: 243387875
    Relay_Master_Log_File: forummysql01-bin.002937
    Slave_IO_Running: No
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB: mysql
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 243387731
    Relay_Log_Space: 243387875
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: NULL
    1 row in set (0.00 sec)

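It looks as if nothing happened, but something did: after the command to start the I/O thread was run, the error log again reported the abnormal log file position. So the next step is to go to the master and check what operation was executed at the reported position in that log file, and whether the position exists at all.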

The mysqlbinlog command shows the contents of a binary log file. Run it on the master as follows:

    [root@forummysql01 data]# mysqlbinlog --start-position=243387732 forummysql01-bin.002937
    /*!40019 SET @@session.max_insert_delayed_threads=0*/;
    /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
    DELIMITER /*!*/;
    DELIMITER ;
    # End of log file
    ROLLBACK /* added by mysqlbinlog */;
    /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

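Sure enough, nothing appears to happen at this position. To be safe, I extracted the entire contents of forummysql01-bin.002937 for inspection by running mysqlbinlog again, this time without specifying a position: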

    [root@forummysql01 data]# mysqlbinlog ./forummysql01-bin.002937 > /home/jss/bin-002937.log

Only the last few lines of that file need checking, for example:

    [root@forummysql01 data]# tail -50 /home/jss/bin-002937.log
    .............................
    # at 243297123
    #110110 15:02:19 server id 1 end_log_pos 243297459 Query thread_id=1773644066 exec_time=0 error_code=0
    SET TIMESTAMP=1294642939/*!*/;
    INSERT INTO cdb_sessions (sid, ip1, ip2, ip3, ip4, uid, username, groupid, styleid, invisible, action, lastactivity, lastolupdate, seccode, fid, tid)
    VALUES ('HQFzjy', '202', '160', '180', '187', '0', '', '7', '1', '0', '3', '1294642939', '0', '232485', '27', '4583')
    /*!*/;
    ................
    ................
    ................

    # at 243308840
    #110110 15:02:20 server id 1 end_log_pos 243315309 Query thread_id=1773638971 exec_time=0 error_code=0
    SET TIMESTAMP=1294642940/*!*/;
    update group_topic set TOPIC_TIT.............................
    /*!*/;
    DELIMITER ;
    # End of log file
    ROLLBACK /* added by mysqlbinlog */;
    /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
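
The comparison between the end of the binlog and the position the slave asks for can be made explicit. The sketch below scans mysqlbinlog text output for `end_log_pos` values and reports how far the requested position lies past the last event. The helper names `last_end_log_pos` and `overshoot` are hypothetical, and the sample holds only the event header lines from the output above.

```python
import re

# Event headers copied from the mysqlbinlog output above (statement bodies omitted).
DUMP = """\
# at 243297123
#110110 15:02:19 server id 1 end_log_pos 243297459 Query thread_id=1773644066 exec_time=0 error_code=0
# at 243308840
#110110 15:02:20 server id 1 end_log_pos 243315309 Query thread_id=1773638971 exec_time=0 error_code=0
# End of log file
"""

END_POS_RE = re.compile(r"end_log_pos (\d+)")


def last_end_log_pos(dump_text):
    # Largest end_log_pos found in the dump, or None if there is none.
    positions = [int(p) for p in END_POS_RE.findall(dump_text)]
    return max(positions) if positions else None


def overshoot(requested_pos, dump_text):
    # Bytes by which the requested position lies past the last event;
    # zero or negative means the position is still inside the file.
    last = last_end_log_pos(dump_text)
    if last is None:
        raise ValueError("no end_log_pos found in dump")
    return requested_pos - last


print(last_end_log_pos(DUMP))
print(overshoot(243387731, DUMP))
```

A positive result means the slave is asking for bytes that are not in the master's file.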

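The last position in this binlog file is 243315309, which is 72,422 bytes short of the position in the error log ("'forummysql01-bin.002937', position 243387731"), so the position reported in the error really does not exist in the binary log file. I take this to be a logical error: most likely the master lost power unexpectedly and automatically flushed the binlog when it restarted (switching to a new log file), and the slave never received that information. The fix is therefore fairly simple: resetting the master position that the slave replicates from should be enough. Here I chose to advance the log file sequence number (moving the position number back would be another option). The commands are: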

    mysql> stop slave;
    Query OK, 0 rows affected (0.00 sec)
    mysql> CHANGE MASTER TO MASTER_HOST='192.168.1.31',
    -> MASTER_PORT=3306,
    -> MASTER_USER='repl',
    -> MASTER_PASSWORD='******',
    -> MASTER_LOG_FILE='forummysql01-bin.002938',
    -> MASTER_LOG_POS=0;

    Query OK, 0 rows affected (0.01 sec)
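
When several slaves need the same fix, the step of advancing the log file sequence number can be scripted. The sketch below derives the next binlog file name from the one in the error message and renders a CHANGE MASTER TO statement that changes only the log coordinates. `next_binlog_name` and `change_master_sql` are hypothetical helpers, and the code assumes the usual `basename.NNNNNN` naming. Before using the result it is worth confirming on the master, for example with SHOW BINARY LOGS, that the next file actually exists.

```python
def next_binlog_name(name):
    # Split "forummysql01-bin.002937" into base name and numeric suffix,
    # then increment the suffix while keeping its zero-padded width.
    base, dot, seq = name.rpartition(".")
    if not dot or not seq.isdigit():
        raise ValueError(f"unexpected binlog name: {name!r}")
    return f"{base}.{int(seq) + 1:0{len(seq)}d}"


def change_master_sql(log_file, log_pos):
    # Only the log coordinates are set; host, user and password are left
    # out, so the slave keeps the values it already has.
    return (f"CHANGE MASTER TO MASTER_LOG_FILE='{log_file}', "
            f"MASTER_LOG_POS={int(log_pos)};")


target = next_binlog_name("forummysql01-bin.002937")
print(change_master_sql(target, 0))
```

The statement still has to be run between stop slave and start slave, as in the session above.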

Then start the slave again and check its status:

    mysql> start slave;
    Query OK, 0 rows affected (0.00 sec)
    mysql> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 192.168.1.31
    Master_User: repl
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: forummysql01-bin.002938
    Read_Master_Log_Pos: 35910271
    Relay_Log_File: phpmysql02-relay-bin.000003
    Relay_Log_Pos: 21407790
    Relay_Master_Log_File: forummysql01-bin.002938
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB: mysql
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 21407646
    Relay_Log_Space: 35910415
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: 2215
    1 row in set (0.00 sec)

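The slave threads are running and the error log shows no further errors. Wait a while for the slave to catch up with the master, then repeat these steps on the other slaves, and the whole replication setup is restored.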
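
Whether a slave has caught up can be judged from three fields of show slave status. The following is a small sketch, assuming the status has already been loaded into a dictionary of strings. `caught_up` and `max_lag_seconds` are hypothetical names, and the threshold is an arbitrary choice.

```python
def caught_up(status, max_lag_seconds=0):
    # Both replication threads must be running.
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False
    lag = status.get("Seconds_Behind_Master")
    if lag is None or not str(lag).isdigit():
        return False  # NULL (or missing) means the lag is unknown
    return int(lag) <= max_lag_seconds


after_fix = {
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "Yes",
    "Seconds_Behind_Master": "2215",
}
print(caught_up(after_fix))
```

With the values from the status output taken right after the fix (2215 seconds behind), the check is still false, which matches the advice to wait.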

From "ITPUB Blog", link: http://blog.itpub.net/7607759/viewspace-683607/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
