MySQL 5.7: Switching from Classic Replication to GTID Online (One Master, One Slave)

Posted by aaron8219 on 2018-07-12

 

Preface

 
    Classic replication was commonly used in earlier versions of MySQL. It is really tough to manage when replication runs into failures, and many newer features depend on GTID. So it is worth moving to GTID replication as soon as possible. I'm going to demonstrate how to change classic replication to GTID replication online with two servers. Here we go.
 
Framework
Hostname | IP/Port            | Identity | OS Version | MySQL Version | GTID Mode | Binlog Format
zlm2     | 192.168.1.101/3306 | master   | CentOS 7.0 | 5.7.21        | off       | row
zlm3     | 192.168.1.102/3306 | slave    | CentOS 7.0 | 5.7.21        | off       | row
 
Procedure
 
Check that the parameter “gtid_mode” is “OFF” on both the master and the slave in the replication setup.
//Master
(root@localhost mysql3306.sock)[(none)]>show variables like 'gtid_mode';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_mode     | OFF   |
+---------------+-------+
1 row in set (0.01 sec)

//Slave
(root@localhost mysql3306.sock)[(none)]>show variables like 'gtid_mode';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_mode     | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

 

Execute sysbench to generate some transactions continuously on master.
[root@zlm2 07:22:53 ~/sysbench-1.0/src/lua]
#sysbench oltp_insert.lua --mysql-host=192.168.1.101 --mysql-port=3306 --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --tables=10 --table-size=100000 --mysql-storage-engine=innodb prepare
sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)

Creating table 'sbtest1'...
Inserting 100000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
Creating table 'sbtest2'...
Inserting 100000 records into 'sbtest2'
Creating a secondary index on 'sbtest2'...
...

[root@zlm2 07:26:30 ~/sysbench-1.0/src/lua]
#sysbench oltp_insert.lua --mysql-host=192.168.1.101 --mysql-port=3306 --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --threads=3 --time=7200 --report-interval=60 --rand-type=uniform run
sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 3
Report intermediate results every 60 second(s)
Initializing random number generator from current time


Initializing worker threads...

Threads started!

[ 60s ] thds: 3 tps: 1623.71 qps: 1623.71 (r/w/o: 0.00/1623.71/0.00) lat (ms,95%): 2.97 err/s: 0.00 reconn/s: 0.00
[ 120s ] thds: 3 tps: 1844.96 qps: 1844.96 (r/w/o: 0.00/1844.96/0.00) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00
[ 180s ] thds: 3 tps: 1894.37 qps: 1894.37 (r/w/o: 0.00/1894.37/0.00) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00
...

//Check the output of processlist.
(root@localhost mysql3306.sock)[(none)]>show processlist;
+----+------+------------+----------+-------------+------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host       | db       | Command     | Time | State                                                         | Info                                                                                                 |
+----+------+------------+----------+-------------+------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| 41 | root | localhost  | NULL     | Query       |    0 | starting                                                      | show processlist                                                                                     |
| 43 | repl | zlm3:44252 | NULL     | Binlog Dump |  379 | Master has sent all binlog to slave; waiting for more updates | NULL                                                                                                 |
| 44 | zlm  | zlm2:56708 | sysbench | Query       |    0 | update                                                        | INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 8106, '57837919367-24452778030-14591605115-8049012633 |
| 45 | zlm  | zlm2:56709 | sysbench | Query       |    0 | update                                                        | INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5602, '45087463438-93604980565-67881991526-9944080034 |
| 46 | zlm  | zlm2:56710 | sysbench | Query       |    0 | update                                                        | INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 3497, '01822437471-94427682076-39418270545-9867829936 |
+----+------+------------+----------+-------------+------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

 

Make sure that the classic replication is working normally on slave.
(root@localhost mysql3306.sock)[(none)]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.101
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 191183208
               Relay_Log_File: relay-bin.000023
                Relay_Log_Pos: 41556833
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 175774368
              Relay_Log_Space: 191183725
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 20
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1013306
                  Master_UUID: 1b7181ee-6eaf-11e8-998e-080027de0e0e
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: System lock
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0 //This means we are using the classic replication now.
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
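
As a rough illustration of the check above, the following Python sketch parses the text printed by "show slave status\G" and confirms that both replication threads are running and that Auto_Position is still 0. The helper names (parse_vertical_status, is_healthy_classic_replication) and the trimmed SAMPLE_STATUS text are assumptions made for this example; they are not part of MySQL or of the original walkthrough.

```python
# Minimal sketch: inspect the text printed by "show slave status\G".
# parse_vertical_status, is_healthy_classic_replication and SAMPLE_STATUS
# are hypothetical helpers for illustration only; they are not part of MySQL.

SAMPLE_STATUS = """
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 20
                Auto_Position: 0
"""


def parse_vertical_status(text):
    """Turn 'Name: value' lines of vertical mysql output into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # row separators and blank lines contain no colon
            fields[name.strip()] = value.strip()
    return fields


def is_healthy_classic_replication(fields):
    """True when both threads run and GTID auto-positioning is not in use."""
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and fields.get("Auto_Position") == "0"
    )


status = parse_vertical_status(SAMPLE_STATUS)
print(is_healthy_classic_replication(status))
```

In practice the text would come from the mysql client rather than from a constant, and Seconds_Behind_Master is worth watching as well, since the slave in this test was already 20 seconds behind.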

 

Change the parameter “enforce_gtid_consistency” to “warn” on both master and slave.
//Master
(root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=warn;
Query OK, 0 rows affected (0.13 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency;
+-----------------------------------+
| @@global.enforce_gtid_consistency |
+-----------------------------------+
| WARN                              |
+-----------------------------------+
1 row in set (0.06 sec)

//Error log of master
2018-07-13T07:37:56.877416+01:00 47 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN.
2018-07-13T07:39:15.748645+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8825ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.)

//Slave
(root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=warn;
Query OK, 0 rows affected (0.49 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency;
+-----------------------------------+
| @@global.enforce_gtid_consistency |
+-----------------------------------+
| WARN                              |
+-----------------------------------+
1 row in set (1.35 sec)

//Error log of slave
2018-07-13T07:38:02.556232+01:00 27 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN.

//Make sure there are no GTID consistency warnings in the error log of either master or slave before moving on.

 

Change the parameter “enforce_gtid_consistency” to “on” on both master and slave.
//Master
(root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=on;
Query OK, 0 rows affected (0.00 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency;
+-----------------------------------+
| @@global.enforce_gtid_consistency |
+-----------------------------------+
| ON                                |
+-----------------------------------+
1 row in set (0.00 sec)

//Slave
(root@localhost mysql3306.sock)[(none)]>set @@global.enforce_gtid_consistency=on;
Query OK, 0 rows affected (0.03 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.enforce_gtid_consistency;
+-----------------------------------+
| @@global.enforce_gtid_consistency |
+-----------------------------------+
| ON                                |
+-----------------------------------+
1 row in set (0.00 sec)

 

Change the parameter “gtid_mode” to “off_permissive” on both master and slave.
//Master
(root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=off_permissive;
Query OK, 0 rows affected (0.72 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode;
+--------------------+
| @@global.gtid_mode |
+--------------------+
| OFF_PERMISSIVE     |
+--------------------+
1 row in set (0.01 sec)

//Error log of master
2018-07-13T07:37:56.877416+01:00 47 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN.
2018-07-13T07:39:15.748645+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8825ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.)
2018-07-13T07:42:38.472436+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8569ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.)
2018-07-13T07:44:03.886312+01:00 47 [Note] Changed ENFORCE_GTID_CONSISTENCY from WARN to ON.
2018-07-13T07:48:04.137251+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5067ms. The settings might not be optimal. (flushed=713 and evicted=0, during the time.)
2018-07-13T07:48:39.586306+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5394ms. The settings might not be optimal. (flushed=704 and evicted=0, during the time.)
2018-07-13T07:49:38.441594+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4927ms. The settings might not be optimal. (flushed=709 and evicted=0, during the time.)
2018-07-13T07:50:19.070954+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4539ms. The settings might not be optimal. (flushed=721 and evicted=0, during the time.)
2018-07-13T07:50:20.930564+01:00 47 [Note] Changed GTID_MODE from OFF to OFF_PERMISSIVE.
2018-07-13T07:50:36.490470+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4602ms. The settings might not be optimal. (flushed=705 and evicted=0, during the time.)


//Slave
(root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=off_permissive;
Query OK, 0 rows affected (3.02 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode;
+--------------------+
| @@global.gtid_mode |
+--------------------+
| OFF_PERMISSIVE     |
+--------------------+
1 row in set (0.00 sec)

//Error log of slave
2018-07-13T07:38:02.556232+01:00 27 [Note] Changed ENFORCE_GTID_CONSISTENCY from OFF to WARN.
2018-07-13T07:44:22.628014+01:00 27 [Note] Changed ENFORCE_GTID_CONSISTENCY from WARN to ON.
2018-07-13T07:49:33.136288+01:00 27 [Note] Aborted connection 27 to db: 'unconnected' user: 'root' host: 'localhost' (Got timeout reading communication packets)
2018-07-13T07:50:27.360767+01:00 28 [Note] Changed GTID_MODE from OFF to OFF_PERMISSIVE.
2018-07-13T07:50:39.972826+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 10489ms. The settings might not be optimal. (flushed=2001 and evicted=0, during the time.)

 

Change the parameter “gtid_mode” to “on_permissive” on both master and slave.
//Master
(root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on_permissive;
Query OK, 0 rows affected (3.26 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode;
+--------------------+
| @@global.gtid_mode |
+--------------------+
| ON_PERMISSIVE      |
+--------------------+
1 row in set (0.00 sec)

//Error log of master
2018-07-13T07:57:16.796632+01:00 48 [Note] Changed GTID_MODE from OFF_PERMISSIVE to ON_PERMISSIVE.
2018-07-13T07:57:20.034425+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4954ms. The settings might not be optimal. (flushed=752 and evicted=0, during the time.)

//Slave
(root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on_permissive;
Query OK, 0 rows affected (2.22 sec)

(root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode;
+--------------------+
| @@global.gtid_mode |
+--------------------+
| ON_PERMISSIVE      |
+--------------------+
1 row in set (0.06 sec)

//Error log of slave
2018-07-13T07:56:57.921081+01:00 29 [Note] Changed GTID_MODE from OFF_PERMISSIVE to ON_PERMISSIVE.
2018-07-13T07:57:03.109628+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5853ms. The settings might not be optimal. (flushed=733 and evicted=0, during the time.)

//I think it is better practice to execute "set @@global.gtid_mode=on_permissive;" on the slave first, even though this is not strictly required.

 

Make sure no anonymous (non-GTID) transactions are still in progress on either master or slave by checking that the status variable “Ongoing_anonymous_transaction_count” returns “0”.
//Master
(root@localhost mysql3306.sock)[(none)]>show status like 'ongoing_anonymous_transaction_count';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| Ongoing_anonymous_transaction_count | 0     |
+-------------------------------------+-------+
1 row in set (0.66 sec)

//Slave
(root@localhost mysql3306.sock)[(none)]>show status like 'ongoing_anonymous_transaction_count';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| Ongoing_anonymous_transaction_count | 0     |
+-------------------------------------+-------+
1 row in set (3.34 sec)

//The value of 'Ongoing_anonymous_transaction_count' is "0", which means no anonymous (non-GTID) transactions are in progress anymore. Once the slave has also applied every anonymous transaction already written to the master's binlog, we can do the last step, that is, change "gtid_mode" to "on".
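
On a busy server the counter may not be 0 at the first look, so the check usually has to be repeated. The Python sketch below shows one way to poll until it reaches 0. The function name wait_until_zero and its read_count argument are hypothetical: read_count stands for any callable that runs "show status like 'ongoing_anonymous_transaction_count'" through whatever client library is in use and returns the value as an integer. No real connection is made here.

```python
import time


def wait_until_zero(read_count, attempts=30, delay=1.0):
    """Poll read_count() until it returns 0.

    read_count is a hypothetical callable supplied by the caller; it is
    assumed to return the current Ongoing_anonymous_transaction_count.
    Returns True as soon as the counter is 0, or False if it is still
    non-zero after the given number of attempts.
    """
    for attempt in range(attempts):
        if read_count() == 0:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next reading
    return False


# Simulated readings standing in for a live server: 3, then 1, then 0.
readings = iter([3, 1, 0])
print(wait_until_zero(lambda: next(readings), attempts=5, delay=0))
```

The same loop would need to be run against both the master and the slave, since the counter is maintained per server.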

 

Change the parameter “gtid_mode” to “on” on both master and slave.
//Master
//The "set @@global.gtid_mode=on;" statement itself was not captured for the master; only the check after it is shown.
(root@localhost mysql3306.sock)[(none)]>select @@global.gtid_mode;
+--------------------+
| @@global.gtid_mode |
+--------------------+
| ON                 |
+--------------------+
1 row in set (0.00 sec)

//Error log of master
2018-07-13T08:20:59.853460+01:00 50 [Note] Changed GTID_MODE from ON_PERMISSIVE to ON.
2018-07-13T08:21:01.804678+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6035ms. The settings might not be optimal. (flushed=745 and evicted=0, during the time.)
2018-07-13T08:21:56.202081+01:00 43 [Note] Aborted connection 43 to db: 'unconnected' user: 'repl' host: 'zlm3' (Failed on my_net_write())

//Slave
(root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    31
Current database: *** NONE ***

//It's stuck here. Oh my!

//Check the error log of the slave to see what happened.
2018-07-13T08:20:49.070915+01:00 25 [ERROR] Disk is full writing './relay-bin.000044' (Errcode: 16026912 - No space left on device). Waiting for someone to free space...
2018-07-13T08:20:49.070948+01:00 25 [ERROR] Retry in 60 secs. Message reprinted in 600 secs
2018-07-13T08:20:49.104353+01:00 26 [ERROR] Disk is full writing '/data/mysql/mysql3306/logs/mysql-bin.000011' (Errcode: 16026912 - No space left on device). Waiting for someone to free space...
2018-07-13T08:20:49.104382+01:00 26 [ERROR] Retry in 60 secs. Message reprinted in 600 secs
2018-07-13T08:20:51.712891+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4001ms. The settings might not be optimal. (flushed=742 and evicted=0, during the time.)
2018-07-13T08:21:00.346384+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7634ms. The settings might not be optimal. (flushed=2000 and evicted=0, during the time.)

//It shows "[ERROR] Disk is full writing ...". Too much data has been inserted into the test tables.

[root@zlm3 08:08:06 ~]
#df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  8.4G  8.4G   20K 100% /  //The root directory is full.
devtmpfs                 488M     0  488M   0% /dev
tmpfs                    497M     0  497M   0% /dev/shm
tmpfs                    497M  6.6M  491M   2% /run
tmpfs                    497M     0  497M   0% /sys/fs/cgroup
/dev/sda1                497M  118M  379M  24% /boot
none                      87G   80G  7.1G  92% /vagrant

//Unfortunately, the disk on the slave is completely full.

 

    Although the demonstration was interrupted by accident, the procedure for changing classic replication to GTID replication is correct. The switch is complete only once the slave has also changed “gtid_mode” to “on”.
    One more thing to do is to modify your “my.cnf” file so that GTID replication is still in effect after the mysqld process restarts. Make sure these three parameters are set correctly in “my.cnf”: “enforce_gtid_consistency=on”, “gtid_mode=on”, “log_slave_updates=on”.
    The last thing to do in this case is to stop the slave, set “master_auto_position=1” and start the slave again. I'm not going to do these last steps here ('cause the environment has been destroyed. Oops!).
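
Since those last steps were not run in this environment, here is a small Python sketch that only prints what they would look like. The function names (my_cnf_gtid_fragment, auto_position_statements) are hypothetical. The three settings are the ones listed above, and the SQL uses the MySQL 5.7 statements STOP SLAVE, CHANGE MASTER TO with the MASTER_AUTO_POSITION option, and START SLAVE. Nothing here was executed against the servers in this article, so treat it as an outline rather than a verified result.

```python
# Sketch only: prints the persistent settings and the slave-side statements
# described above. Function names are hypothetical; nothing is sent to MySQL.


def my_cnf_gtid_fragment():
    """Return a [mysqld] fragment with the three settings named in the text."""
    settings = {
        "gtid_mode": "ON",
        "enforce_gtid_consistency": "ON",
        "log_slave_updates": "ON",
    }
    lines = ["[mysqld]"]
    lines += ["{} = {}".format(name, value) for name, value in settings.items()]
    return "\n".join(lines)


def auto_position_statements():
    """Return the statements to run on the slave, in order."""
    return [
        "STOP SLAVE;",
        "CHANGE MASTER TO MASTER_AUTO_POSITION = 1;",
        "START SLAVE;",
    ]


print(my_cnf_gtid_fragment())
for statement in auto_position_statements():
    print(statement)
```

After the slave restarts replication this way, the “Auto_Position” field in the slave status should read 1 instead of 0.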
 
Some error messages may occur if you don't follow the sequence above.
//The output of "show slave status\G"
                Last_IO_Errno: 1593
                Last_IO_Error: The replication receiver thread cannot start because the master has GTID_MODE = ON and this server has GTID_MODE = OFF

//You cannot modify "gtid_mode" to "on" directly.
(root@localhost mysql3306.sock)[(none)]>set @@global.gtid_mode=on;
ERROR 1788 (HY000): The value of @@GLOBAL.GTID_MODE can only be changed one step at a time: OFF <-> OFF_PERMISSIVE <-> ON_PERMISSIVE <-> ON. Also note that this value must be stepped up or down simultaneously on all servers. See the Manual for instructions.
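
The one-step-at-a-time rule in ERROR 1788 can be pictured as walking along a fixed list of modes. The Python sketch below computes the intermediate values between a current and a target mode and renders the matching statements. The names GTID_MODES, gtid_mode_steps and gtid_mode_statements are made up for this illustration; the code builds strings only and does not talk to a server.

```python
# The four gtid_mode values in the order given by the ERROR 1788 message.
GTID_MODES = ["OFF", "OFF_PERMISSIVE", "ON_PERMISSIVE", "ON"]


def gtid_mode_steps(current, target):
    """List the values to set, one step at a time, to go from current to target."""
    start = GTID_MODES.index(current.upper())
    end = GTID_MODES.index(target.upper())
    if start <= end:
        return GTID_MODES[start + 1:end + 1]  # stepping up
    return GTID_MODES[end:start][::-1]        # stepping down


def gtid_mode_statements(current, target):
    """Render one SET statement per step (hypothetical helper)."""
    return [
        "SET @@GLOBAL.gtid_mode = {};".format(mode)
        for mode in gtid_mode_steps(current, target)
    ]


for statement in gtid_mode_statements("OFF", "ON"):
    print(statement)
```

Each printed statement has to be applied on every server in the topology before the next one is issued, which is what the error message means by stepping all servers together.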

 

Summary
  • GTID replication is now the recommended way to run MySQL replication, especially on 5.7 and above. More and more newer features rely on GTID or are typically used alongside it, such as “Group Replication”, “Group Commit”, “Parallel Replication”, etc.
  • We had better replace all classic replication with GTID replication in our production environments in order to get more benefits and work more efficiently.
  • Changing classic replication to GTID replication online should follow the order “off -> off_permissive -> on_permissive -> on”, with each step executed on both master and slaves.
  • Note that the online change is only supported on MySQL 5.7.6 and above.
 
