【Mysql】Slave_IO_Running: No---Got fatal error 1236 from master

小亮520cl發表於2015-09-24

本地MySQL環境,是兩臺MySQL做M-M複製。今天發現錯誤資訊:

mysql 5.5.28-log> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 88.88.88.88
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: testdbbinlog.000005
          Read_Master_Log_Pos: 98359687
               Relay_Log_File: mysql-relay-bin.000020
                Relay_Log_Pos: 4
        Relay_Master_Log_File: testdbbinlog.000005
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 98359687
              Relay_Log_Space: 107
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)

mysql 5.5.28-log>


參考網上文件: http://blog.csdn.net/billfanggs/article/details/8905991

在source那邊,執行:

flush logs;
show master status;        

記下File, Position。             ---記錄一個新的binlog

在target端,執行:

CHANGE MASTER TO MASTER_LOG_FILE='testdbbinlog.000008',MASTER_LOG_POS=107;   ---從新的binlog開始複製,跳過舊的binlog。
slave start;
show slave status \G

一切正常。 但是會有資料丟失,可以檢查後重新搭建從庫






如果使用了GTID新的特新
(原英文地https://www.percona.com/blog/2013/02/08/how-to-createrestore-a-slave-using-gtid-replication-in-mysql-5-6/
How to restore a slave in a bad and fast way(快速但不好的方法)

Let’s imagine that our slave has been down for several days and the binary logs from the master have been purged. This is the error we are going to 

點選(此處)摺疊或開啟

  1. Slave_IO_Running: No
  2. Slave_SQL_Running: Yes
  3. Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'



  1. master > show global variables like 'GTID%';
  2. +---------------+-------------------------------------------+
  3. | Variable_name | Value |
  4. +---------------+-------------------------------------------+
  5. | gtid_executed | 9a511b7b-7059-11e2-9a24-08002762b8af:1-14 |
  6. +---------------+-------------------------------------------+
 |gtid_purged | 9a511b7b-7059-11e2-9a24-08002762b8af:2 |


And we set it on the slave:

點選(此處)摺疊或開啟

  1. slave> set global GTID_EXECUTED="9a511b7b-7059-11e2-9a24-08002762b8af:2"
  2. ERROR 1238 (HY000): Variable 'gtid_executed' is a read only variable


Error! Remember, we get the GTID_EXECUTED from the master and set is as GTID_PURGED on the slave

點選(此處)摺疊或開啟

  1. slave1 > set global GTID_PURGED="9a511b7b-7059-11e2-9a24-08002762b8af:2";
  2. ERROR 1840 (HY000): GTID_PURGED can only be set when GTID_EXECUTED is empty.


Error again, GTID_EXECUTED should be empty before changing GTID_PURGED manually but we can’t change it with SET because is a read only variable. The only way to change it is with reset master (yes, on a slave server):

  1. slave1> reset master;
  2. slave1 > show global variables like 'GTID_EXECUTED';
  3. +---------------+-------+
  4. | Variable_name | Value |
  5. +---------------+-------+
  6. | gtid_executed | |
  7. +---------------+-------+
  8. slave1 > set global GTID_PURGED="9a511b7b-7059-11e2-9a24-08002762b8af:2";   可透過這篇文章來熟悉一下 http://www.cnblogs.com/cenalulu/p/4309009.html  
  9. 思路:將丟失的binlog放進purge,表示已刪除,跳過報錯的,英文原版的有點特殊:剛做個slave就報錯,slave的purge已經到13了,所以使用的是execute14
  10. 生產上一般是執行過程中報錯,因為主庫的binarylog已經被purge 但是還沒應用到slave,可以我們可以檢視一下 master與slave的purge,很明顯slave的purge絕壁落後著


  11. slave1> start slave io_thread;
  12. slave1> show slave status\G
  13. [...]
  14. Slave_IO_Running: Yes
  15. Slave_SQL_Running: Yes
  16. [...]


How to restore a slave in a good and slow way(慢但是比較好的方法)

The good way is mysqldump again. We take a dump from the master like we saw before and try to restore it on the slave

  1. slave1 [localhost] {msandbox} ((none)) > source test.sql; ---匯入搭建時的全備
  2. [...]
  3. ERROR 1840 (HY000): GTID_PURGED can only be set when GTID_EXECUTED is empty.
  4. [...]

Wop! It is important to mention that these kind of error messages can dissapear on the shell buffer because the restore of the dump will continue. Be cautious.

Same problem again so same solution too:

點選(此處)摺疊或開啟

  1. slave1> reset master;
  2. slave1> source test.sql; ----重新到slave庫導搭建是的那個全備一遍。。相當於搭建master-slave中的那一步
  3. slave1> start slave;
  4. slave1> show slave status\G
  5. [...]
  6. Slave_IO_Running: Yes
  7. Slave_SQL_Running: Yes
  8. [...]


模擬實驗

  1. slave :stop slave,此時Master_Log_File: mysqlbin.000011 Relay_Master_Log_File: mysqlbin.000011 都停留在binlog 11這個日誌上

  2. master:
  3. (root@127.0.0.1) [test]> insert into t1 select * from t1;                                                           --------11 binlog裡面記錄了insert8條的記錄
    Query OK, 8 rows affected (0.00 sec)
    Records: 8  Duplicates: 0  Warnings: 0


    (root@127.0.0.1) [test]> flush logs;
    Query OK, 0 rows affected (0.02 sec)


    (root@127.0.0.1) [test]> insert into t1 select * from t1;
    Query OK, 16 rows affected (0.00 sec)
    Records: 16  Duplicates: 0  Warnings: 0


    (root@127.0.0.1) [test]> show master status;                                                                          -------12 binlog 裡面記錄了insert 16條的記錄
    +-----------------+----------+--------------+------------------+------------------------------------------+
    | File            | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
    +-----------------+----------+--------------+------------------+------------------------------------------+
    | mysqlbin.000012 |      567 |              |                  | 8a0d06d7-91cb-11e5-843b-00163ec09859:1-7 |
    +-----------------+----------+--------------+------------------+------------------------------------------+
    1 row in set (0.00 sec)


    (root@127.0.0.1) [test]> flush logs;
    Query OK, 0 rows affected (0.01 sec)


    (root@127.0.0.1) [test]> insert into t1 select * from t1;                                                              ------13 binlog裡面記錄了insert 32條的記錄
    Query OK, 32 rows affected (0.01 sec)
    Records: 32  Duplicates: 0  Warnings: 0


    (root@127.0.0.1) [test]> flush logs;
    Query OK, 0 rows affected (0.02 sec)


    (root@127.0.0.1) [test]> show master status;
    +-----------------+----------+--------------+------------------+------------------------------------------+
    | File            | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
    +-----------------+----------+--------------+------------------+------------------------------------------+
    | mysqlbin.000014 |      191 |              |                  | 8a0d06d7-91cb-11e5-843b-00163ec09859:1-8 |
    +-----------------+----------+--------------+------------------+------------------------------------------+            
    1 row in set (0.00 sec)


    (root@127.0.0.1) [test]> purge master logs to 'mysqlbin.000012';     ---記錄8條的binlog沒了
    Query OK, 0 rows affected (0.01 sec)


 slave:
 (root@127.0.0.1) [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)


(root@127.0.0.1) [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 218.78.186.162
                  Master_User: repluser
                  Master_Port: 3308
                Connect_Retry: 60
              Master_Log_File: mysqlbin.000011
          Read_Master_Log_Pos: 191
               Relay_Log_File: mysql-relay-bin.000014
                Relay_Log_Pos: 399
        Relay_Master_Log_File: mysqlbin.000011
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 191
              Relay_Log_Space: 689
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 13308
                  Master_UUID: 8a0d06d7-91cb-11e5-843b-00163ec09859
             Master_Info_File: /data/mydata/3309/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 160519 15:06:15
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-5
            Executed_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-5
                Auto_Position: 1
1 row in set (0.00 sec)


ERROR: 
No query specified


解決辦法:
master:
(root@127.0.0.1) [test]> show variables like 'gtid%';
+---------------+------------------------------------------+
| Variable_name | Value                                    |
+---------------+------------------------------------------+
| gtid_executed |                                          |
| gtid_mode     | ON                                       |
| gtid_next     | AUTOMATIC                                |
| gtid_owned    |                                          |
| gtid_purged   | 8a0d06d7-91cb-11e5-843b-00163ec09859:1-6 |
+---------------+------------------------------------------+

slave:
(root@127.0.0.1) [(none)]> reset master;
Query OK, 0 rows affected (0.02 sec)
(root@127.0.0.1) [(none)]> set global gtid_purged='8a0d06d7-91cb-11e5-843b-00163ec09859:1-6';
Query OK, 0 rows affected (0.02 sec)


(root@127.0.0.1) [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
(root@127.0.0.1) [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 218.78.186.162
                  Master_User: repluser
                  Master_Port: 3308
                Connect_Retry: 60
              Master_Log_File: mysqlbin.000014
          Read_Master_Log_Pos: 191
               Relay_Log_File: mysql-relay-bin.000019
                Relay_Log_Pos: 415
        Relay_Master_Log_File: mysqlbin.000014
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 191
              Relay_Log_Space: 721
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 13308
                  Master_UUID: 8a0d06d7-91cb-11e5-843b-00163ec09859
             Master_Info_File: /data/mydata/3309/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-5:7-8     ---出現斷層了,8a0d06d7-91cb-11e5-843b-00163ec09859:6 被我們purge了
            Executed_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-8
                Auto_Position: 1
1 row in set (0.00 sec)


ERROR: 
No query specified

(root@127.0.0.1) [test]> select count(*) from t1;
+----------+
| count(*) |
+----------+
|       56 |                                           -----只有56條,剛好記錄的那8條被purge了,資料丟失了!
+----------+
1 row in set (0.00 sec)



當線上出現1236的錯誤後可以這樣解決,但是解決完後要檢查一下主從資料的一致性,不然會留下隱患!可能後面會導致1023等一些列錯誤

各種導致1236的原因!:http://blog.itpub.net/22664653/viewspace-1714269/




來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29096438/viewspace-1808376/,如需轉載,請註明出處,否則將追究法律責任。

相關文章