Preface
Classic replication is commonly used in previous version of MySQL.It`s really tough in managing them when our replications get into failures.Many new features are also depend on GTID.So it`s urgent to use GTID replication as soon as possible.I`m gonna to demenstrate how to change classic replication to GTID replication online with two servers.Here we go.
Framework
Hostname | IP/Port | Identity | OS Version | MySQL Version | GTID Mode | Binlog Format |
zlm2 | 192.168.1.101/3306 | master | CentOS 7.0 | 5.7.21 | off | row |
zlm3 | 192.168.1.102/3306 | slave | CentOS 7.0 | 5.7.21 | off | row |
Procedure
Check parameter “gtid_mode” and is “OFF” on both master and slave in the replication group.
1 //Master 2 (root@localhost mysql3306.sock)[(none)]>show variables like `gtid_mode`; 3 +---------------+-------+ 4 | Variable_name | Value | 5 +---------------+-------+ 6 | gtid_mode | OFF | 7 +---------------+-------+ 8 1 row in set (0.01 sec 9 10 //Slave 11 (root@localhost mysql3306.sock)[(none)]>show variables like `gtid_mode`; 12 +---------------+-------+ 13 | Variable_name | Value | 14 +---------------+-------+ 15 | gtid_mode | OFF | 16 +---------------+-------+ 17 1 row in set (0.00 sec)
Execute sysbench to generate some transactions continuously on master.
1 [root@zlm2 07:22:53 ~/sysbench-1.0/src/lua] 2 #sysbench oltp_insert.lua --mysql-host=192.168.1.101 --mysql-port=3306 --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --tables=10 --table-size=100000 --mysql-storage-engine=innodb prepare 3 sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2) 4 5 Creating table `sbtest1`... 6 Inserting 100000 records into `sbtest1` 7 Creating a secondary index on `sbtest1`... 8 Creating table `sbtest2`... 9 Inserting 100000 records into `sbtest2` 10 Creating a secondary index on `sbtest2`... 11 ... 12 13 [root@zlm2 07:26:30 ~/sysbench-1.0/src/lua] 14 #sysbench oltp_insert.lua --mysql-host=192.168.1.101 --mysql-port=3306 --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --threads=3 --time=7200 --report-interval=60 --rand-type=uniform run 15 sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2) 16 17 Running the test with following options: 18 Number of threads: 3 19 Report intermediate results every 60 second(s) 20 Initializing random number generator from current time 21 22 23 Initializing worker threads... 24 25 Threads started! 26 27 [ 60s ] thds: 3 tps: 1623.71 qps: 1623.71 (r/w/o: 0.00/1623.71/0.00) lat (ms,95%): 2.97 err/s: 0.00 reconn/s: 0.00 28 [ 120s ] thds: 3 tps: 1844.96 qps: 1844.96 (r/w/o: 0.00/1844.96/0.00) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00 29 [ 180s ] thds: 3 tps: 1894.37 qps: 1894.37 (r/w/o: 0.00/1894.37/0.00) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00 30 ... 31 32 //Check the output of processlist. 33 (root@localhost mysql3306.sock)[(none)]>show processlist; 34 +----+------+------------+----------+-------------+------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+ 35 | Id | User | Host | db | Command | Time | State | Info | 36 +----+------+------------+----------+-------------+------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+ 37 | 41 | root | localhost | NULL | Query | 0 | starting | show processlist | 38 | 43 | repl | zlm3:44252 | NULL | Binlog Dump | 379 | Master has sent all binlog to slave; waiting for more updates | NULL | 39 | 44 | zlm | zlm2:56708 | sysbench | Query | 0 | update | INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 8106, `57837919367-24452778030-14591605115-8049012633 | 40 | 45 | zlm | zlm2:56709 | sysbench | Query | 0 | update | INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5602, `45087463438-93604980565-67881991526-9944080034 | 41 | 46 | zlm | zlm2:56710 | sysbench | Query | 0 | update | INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 3497, `01822437471-94427682076-39418270545-9867829936 | 42 +----+------+------------+----------+-------------+------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+ 43 5 rows in set (0.00 sec)
Make sure that the classic replication is working normally on slave.
1 (root@localhost mysql3306.sock)[(none)]>show slave statusG 2 *************************** 1. row *************************** 3 Slave_IO_State: Waiting for master to send event 4 Master_Host: 192.168.1.101 5 Master_User: repl 6 Master_Port: 3306 7 Connect_Retry: 60 8 Master_Log_File: mysql-bin.000006 9 Read_Master_Log_Pos: 191183208 10 Relay_Log_File: relay-bin.000023 11 Relay_Log_Pos: 41556833 12 Relay_Master_Log_File: mysql-bin.000006 13 Slave_IO_Running: Yes 14 Slave_SQL_Running: Yes 15 Replicate_Do_DB: 16 Replicate_Ignore_DB: 17 Replicate_Do_Table: 18 Replicate_Ignore_Table: 19 Replicate_Wild_Do_Table: 20 Replicate_Wild_Ignore_Table: 21 Last_Errno: 0 22 Last_Error: 23 Skip_Counter: 0 24 Exec_Master_Log_Pos: 175774368 25 Relay_Log_Space: 191183725 26 Until_Condition: None 27 Until_Log_File: 28 Until_Log_Pos: 0 29 Master_SSL_Allowed: No 30 Master_SSL_CA_File: 31 Master_SSL_CA_Path: 32 Master_SSL_Cert: 33 Master_SSL_Cipher: 34 Master_SSL_Key: 35 Seconds_Behind_Master: 20 36 Master_SSL_Verify_Server_Cert: No 37 Last_IO_Errno: 0 38 Last_IO_Error: 39 Last_SQL_Errno: 0 40 Last_SQL_Error: 41 Replicate_Ignore_Server_Ids: 42 Master_Server_Id: 1013306 43 Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e 44 Master_Info_File: mysql.slave_master_info 45 SQL_Delay: 0 46 SQL_Remaining_Delay: NULL 47 Slave_SQL_Running_State: System lock 48 Master_Retry_Count: 86400 49 Master_Bind: 50 Last_IO_Error_Timestamp: 51 Last_SQL_Error_Timestamp: 52 Master_SSL_Crl: 53 Master_SSL_Crlpath: 54 Retrieved_Gtid_Set: 55 Executed_Gtid_Set: 56 Auto_Position: 0 //This means we are using the classic replication now. 57 Replicate_Rewrite_DB: 58 Channel_Name: 59 Master_TLS_Version: 60 1 row in set (0.00 sec)
Change the parameter “enforce_gitd_consistency” to “warn” on both master and slave.
1 //Master 2 (root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=warn; 3 Query OK, 0 rows affected (0.13 sec) 4 5 (root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency; 6 +-----------------------------------+ 7 | @@global.enforce_gtid_consistency | 8 +-----------------------------------+ 9 | WARN | 10 +-----------------------------------+ 11 1 row in set (0.06 sec) 12 13 //Error log of master 14 2018-07-13T07:37:56.877416+01:00 47 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN. 15 2018-07-13T07:39:15.748645+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8825ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.) 16 17 //Slave 18 (root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=warn; 19 Query OK, 0 rows affected (0.49 sec) 20 21 (root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency; 22 +-----------------------------------+ 23 | @@global.enforce_gtid_consistency | 24 +-----------------------------------+ 25 | WARN | 26 +-----------------------------------+ 27 1 row in set (1.35 sec) 28 29 //Error log of slave 30 2018-07-13T07:38:02.556232+01:00 27 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN. 31 32 //Make sure there`s no warning messages on both master and slave.
Change the parameter “enforce_gitd_consistency” to “on” on both master and slave.
1 //Master 2 (root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=on; 3 Query OK, 0 rows affected (0.00 sec) 4 5 (root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency; 6 +-----------------------------------+ 7 | @@global.enforce_gtid_consistency | 8 +-----------------------------------+ 9 | ON | 10 +-----------------------------------+ 11 1 row in set (0.00 sec) 12 13 //Slave 14 (root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=on; 15 Query OK, 0 rows affected (0.03 sec) 16 17 (root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency; 18 +-----------------------------------+ 19 | @@global.enforce_gtid_consistency | 20 +-----------------------------------+ 21 | ON | 22 +-----------------------------------+ 23 1 row in set (0.00 sec)
Change the parameter “gtid_mode” to “off_permissive” on both master and slave.
1 //Master 2 (root@localhost mysql3306.sock)[(none)]>set @@globa.gtid_mode=off_permissive; 3 Query OK, 0 rows affected (0.72 sec) 4 5 (root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode; 6 +--------------------+ 7 | @@global.gtid_mode | 8 +--------------------+ 9 | OFF_PERMISSIVE | 10 +--------------------+ 11 1 row in set (0.01 sec) 12 13 //Error log of master 14 2018-07-13T07:37:56.877416+01:00 47 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN. 15 2018-07-13T07:39:15.748645+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8825ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.) 16 2018-07-13T07:42:38.472436+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8569ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.) 17 2018-07-13T07:44:03.886312+01:00 47 [Note] Changed ENFORCE_GTID_CONSISTENCY from WARN to ON. 18 2018-07-13T07:48:04.137251+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5067ms. The settings might not be optimal. (flushed=713 and evicted=0, during the time.) 19 2018-07-13T07:48:39.586306+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5394ms. The settings might not be optimal. (flushed=704 and evicted=0, during the time.) 20 2018-07-13T07:49:38.441594+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4927ms. The settings might not be optimal. (flushed=709 and evicted=0, during the time.) 21 2018-07-13T07:50:19.070954+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4539ms. The settings might not be optimal. (flushed=721 and evicted=0, during the time.) 22 2018-07-13T07:50:20.930564+01:00 47 [Note] Changed GTID_MODE from OFF to OFF_PERMISSIVE. 23 2018-07-13T07:50:36.490470+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4602ms. The settings might not be optimal. (flushed=705 and evicted=0, during the time.) 24 25 26 //Slave 27 (root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=off_permissive; 28 Query OK, 0 rows affected (3.02 sec) 29 30 (root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode; 31 +--------------------+ 32 | @@global.gtid_mode | 33 +--------------------+ 34 | OFF_PERMISSIVE | 35 +--------------------+ 36 1 row in set (0.00 sec) 37 38 //Error log of slave 39 2018-07-13T07:38:02.556232+01:00 27 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN. 40 2018-07-13T07:44:22.628014+01:00 27 [Note] Changed ENFORCE_GTID_CONSISTENCY from WARN to ON. 41 2018-07-13T07:49:33.136288+01:00 27 [Note] Aborted connection 27 to db: `unconnected` user: `root` host: `localhost` (Got timeout reading communication packets) 42 2018-07-13T07:50:27.360767+01:00 28 [Note] Changed GTID_MODE from OFF to OFF_PERMISSIVE. 43 2018-07-13T07:50:39.972826+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 10489ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.)
Change the parameter “gtid_mode” to “on_permissive” on both master and slave.
1 //Master 2 (root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on_permissive; 3 Query OK, 0 rows affected (3.26 sec) 4 5 (root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode; 6 +--------------------+ 7 | @@global.gtid_mode | 8 +--------------------+ 9 | ON_PERMISSIVE | 10 +--------------------+ 11 1 row in set (0.00 sec) 12 13 //Error log of master 14 2018-07-13T07:57:16.796632+01:00 48 [Note] Changed GTID_MODE from OFF_PERMISSIVE to ON_PERMISSIVE. 15 2018-07-13T07:57:20.034425+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4954ms. The settings might not be optimal. (flushed=752 and evicted=0, during the time.) 16 17 //Slave 18 (root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on_permissive; 19 Query OK, 0 rows affected (2.22 sec) 20 21 (root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode; 22 +--------------------+ 23 | @@global.gtid_mode | 24 +--------------------+ 25 | ON_PERMISSIVE | 26 +--------------------+ 27 1 row in set (0.06 sec) 28 29 //Error log of slave 30 2018-07-13T07:56:57.921081+01:00 29 [Note] Changed GTID_MODE from OFF_PERMISSIVE to ON_PERMISSIVE. 31 2018-07-13T07:57:03.109628+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5853ms. The settings might not be optimal. (flushed=733 and evicted=0, during the time.) 32 33 //I`m afraid it`s better to execut "set gtid_mode=on_permissive;" on slave first for best practice even though sometimes it`s not obliged to do that.
Make sure all the binlogs generated by classic replication has been disappeared on both master and slave by checking parameter `ongoing_anonymous_transaction_count` whether it returns “0”.
1 //Master 2 (root@localhost mysql3306.sock)[(none)]>show status like `ongoing_anonymous_transaction_count`; 3 +-------------------------------------+-------+ 4 | Variable_name | Value | 5 +-------------------------------------+-------+ 6 | Ongoing_anonymous_transaction_count | 0 | 7 +-------------------------------------+-------+ 8 1 row in set (0.66 sec) 9 10 //Slave 11 (root@localhost mysql3306.sock)[(none)]>show status like `ongoing_anonymous_transaction_count`; 12 +-------------------------------------+-------+ 13 | Variable_name | Value | 14 +-------------------------------------+-------+ 15 | Ongoing_anonymous_transaction_count | 0 | 16 +-------------------------------------+-------+ 17 1 row in set (3.34 sec) 18 19 //The value of `ongoing_anonymous_transaction_count` become "0" what means there arn`t non-gtid events in binlogs anymore.Therefore,we can do the last step,that is,to change the "gtid_mode" to "on".
Change the parameter “gtid_mode” to “on” on both master and slave.
1 //Master 2 (root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode; 3 +--------------------+ 4 | @@global.gtid_mode | 5 +--------------------+ 6 | ON | 7 +--------------------+ 8 1 row in set (0.00 sec) 9 10 //Error log of master 11 2018-07-13T08:20:59.853460+01:00 50 [Note] Changed GTID_MODE from ON_PERMISSIVE to ON. 12 2018-07-13T08:21:01.804678+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6035ms. The settings might not be optimal. (flushed=745 and evicted=0, during the time.) 13 2018-07-13T08:21:56.202081+01:00 43 [Note] Aborted connection 43 to db: `unconnected` user: `repl` host: `zlm3` (Failed on my_net_write()) 14 15 //Slave 16 (root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on; 17 ERROR 2006 (HY000): MySQL server has gone away 18 No connection. Trying to reconnect... 19 Connection id: 31 20 Current database: *** NONE *** 21 22 //It`s stuck here.Oh my!!! 23 24 //Check the error log of slave see what has happened. 25 2018-07-13T08:20:49.070915+01:00 25 [ERROR] Disk is full writing `./relay-bin.000044` (Errcode: 16026912 - No space left on device). Waiting for someone to free space... 26 2018-07-13T08:20:49.070948+01:00 25 [ERROR] Retry in 60 secs. Message reprinted in 600 secs 27 2018-07-13T08:20:49.104353+01:00 26 [ERROR] Disk is full writing `/data/mysql/mysql3306/logs/mysql-bin.000011` (Errcode: 16026912 - No space left on device). Waiting for someone to free space... 28 2018-07-13T08:20:49.104382+01:00 26 [ERROR] Retry in 60 secs. Message reprinted in 600 secs 29 2018-07-13T08:20:51.712891+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4001ms. The settings might not be optimal. (flushed=742 and evicted=0, during the time.) 30 2018-07-13T08:21:00.346384+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7634ms. The settings might not be optimal. (flushed=2000 and evicted=0, during the time.) 31 32 //It shows "[ERROR] Disk is full writing ... ".The test tables have been inserted too many data. 33 34 [root@zlm3 08:08:06 ~] 35 #df -h 36 Filesystem Size Used Avail Use% Mounted on 37 /dev/mapper/centos-root 8.4G 8.4G 20K 100% / //The root directory is full. 38 devtmpfs 488M 0 488M 0% /dev 39 tmpfs 497M 0 497M 0% /dev/shm 40 tmpfs 497M 6.6M 491M 2% /run 41 tmpfs 497M 0 497M 0% /sys/fs/cgroup 42 /dev/sda1 497M 118M 379M 24% /boot 43 none 87G 80G 7.1G 92% /vagrant 44 45 //Unfortunately,the disk on salve has been writen fully.
Notwithstanding the demonstrating was interupted accidentally but the porcedure of changing classic replication to GTID replicatioin is correct.Onlyif the slave has finished to change the “gtid_mode” to “on”,the implementing is accomplished.
One more thing need to do is to modify your “my.cnf” file to make them support GTID replication after restarting your mysqld process.Make sure these three parameters:”enforce_gtid_consistency=on”,”gtid_mode=on”,”log_slave_updates=on” are right in your configuration file “my.cnf”.
The last thing to do in this case is to stop slave,set “master_auto_position=1” and start slave again.I`m not going to do these last steps here(`cause the environment has been destroyed.oops!).
Some error masseages may occur if you don`t implement follow the sequence above.
1 //The output of "show salve statusG" 2 Last_IO_Errno: 1593 3 Last_IO_Error: The replication receiver thread cannot start because the master has GTID_MODE = ON and this server has GTID_MODE = OFF 4 5 //You cannot modify "gtid_mode" to "on" directly. 6 (root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on; 7 ERROR 1788 (HY000): The value of @@GLOBAL.GTID_MODE can only be changed one step at a time: OFF <-> OFF_PERMISSIVE <-> ON_PERMISSIVE <-> ON. Also note that this value must be stepped up or down simultaneously on all servers. See the Manual for instructions.
Summary
- GTID replication is the best practice in MySQL replicaiton now,especially in 5.7 version above.More and more new good features are relies on GTID,such as “Group Replication”,”Group Commit”,”Parallel Replication”,etc.
- We`d better replace all the classic replication to GTID replication in our product environment in order to get more benifits and work efficiently.
- Chang classic replicaiton to GTID replicaiton online should follow the order of “off -> off_permissive -> on_permissive -> on” and execute them on both master and slaves.
- Notice that change online is only support on MySQL 5.7.6 and above.