mysql複製報錯案例處理

longqidong發表於2014-10-17
上班收到報警郵件,線上一臺slave複製報錯導致備份失敗,檢視複製狀態:
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: 192.xx.xxx.xxx
                  Master_User: xxx
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: xxxx-bin.000554
          Read_Master_Log_Pos: 184008708
               Relay_Log_File: relay-bin.000577
                Relay_Log_Pos: 164421592
        Relay_Master_Log_File:xxx-bin.000538
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1594
                   Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 164421444
              Relay_Log_Space: 17364819187
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1594
               Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2018128
1 row in set (0.00 sec)

根據報錯提示,slave sql 程式解析relay log異常。根據提示使用mysqlbinlog 工具解析對應的relay log,同樣報異常,很顯然relay log檔案損壞。
透過複製重新配置予以解決,但是這個slave本身落後master有一段距離,從show slave status看有幾個log_file,和pos。重新配置複製到底從哪個
log_file和pos開始配置呢,先看相應引數的含義:
Master_Log_File
The name of the master binary log file from which the I/O thread is currently reading.
slave的IO執行緒當前正在讀取的master二進位制日誌檔名。
Relay_Master_Log_File
The name of the master binary log file containing the most recent event executed by the SQL thread.
slave的Sql執行緒最近執行的master二進位制日誌檔名。(該檔案有可能是滯後於IO執行緒正在讀取的二進位制日誌檔案)
Read_Master_Log_Pos
The position in the current master binary log file up to which the I/O thread has read.
Exec_Master_Log_Pos
The position in the current master binary log file to which the SQL thread has read and executed, marking the start of the next transaction or event to be processed. You can use this value with the CHANGE MASTER TO statement's MASTER_LOG_POS option when starting a new slave from an existing slave, so that the new slave reads from this point. The coordinates given by (Relay_Master_Log_File, Exec_Master_Log_Pos) in the master's binary log correspond to the coordinates given by (Relay_Log_File, Relay_Log_Pos) in the relay log.
slave的Sql執行緒已經讀並且執行的master二進位制日誌檔案的位置,標記下一個被執行的事務或事件的開始位置。


原則上,如果slave不落後master的話,Master_Log_File和Relay_Master_Log_File 應該是一致的,而且Read_Master_Log_Pos 和Exec_Master_Log_Pos 也應該是一致的。不落後的情況下,我們使用Master_Log_File,Read_Master_Log_Pos
重新配置複製即可。
但是這裡已經落後,很顯然不能使用上面兩個值來配置複製了。那我們就可以用目前slave SQL執行緒執行到的log和pos也就是(Relay_Master_Log_File、Exec_Master_Log_Pos)來進行配置,
從這個位置重新去主庫拉取日誌。當然之前傳過來的日誌會自動被清理掉。


因此這裡就使用下面的語句進行復制重新配置:
change master to master_host='192.xxx.xxx.xxx',master_port=3306, master_user='xxx',master_password='xxxx',master_log_file='xxx-bin.000538',master_log_pos=164421444;


start slave後,依舊落後很多。慢慢追吧。故障得以處理。


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/22418990/viewspace-1302120/,如需轉載,請註明出處,否則將追究法律責任。

相關文章