MySQL 8.0 Reference Manual（讀書筆記91節--Replication(2)）

1.Seconds_Behind_Source

The SHOW REPLICA STATUS statement, which you must execute on each replica, provides information about the configuration and status of the connection between the replica server and the source server. From MySQL 8.0.22, SHOW SLAVE STATUS is deprecated, and SHOW REPLICA STATUS is available to use instead. The Performance Schema has replication tables that provide this information in a more accessible form.

Seconds_Behind_Source: The number of seconds that the replication SQL (applier) thread is behind processing the source binary log. A high number (or an increasing one) can indicate that the replica is unable to handle events from the source in a timely fashion.

A value of 0 for Seconds_Behind_Source can usually be interpreted as meaning that the replica has caught up with the source, but there are some cases where this is not strictly true. For example, this can occur if the network connection between source and replica is broken but the replication I/O (receiver) thread has not yet noticed this; that is, the time period set by replica_net_timeout or slave_net_timeout has not yet elapsed【ɪˈlæpst (時間)消逝，流逝;】.

It is also possible that transient【ˈtrænʃnt 短暫的;臨時的;轉瞬即逝的;暫住的;過往的;倏忽;】 values for Seconds_Behind_Source may not reflect the situation accurately. When the replication SQL (applier) thread has caught up on I/O, Seconds_Behind_Source displays 0; but when the replication I/O (receiver) thread is still queuing up a new event, Seconds_Behind_Source may show a large value until the replication applier thread finishes executing the new event. This is especially likely when the events have old timestamps; in such cases, if you execute SHOW REPLICA STATUS several times in a relatively short period, you may see this value change back and forth repeatedly between 0 and a relatively large value.

2.透過SHOW PROCESSLIST檢視

On the source【主節點】, you can check the status of connected replicas using SHOW PROCESSLIST to examine the list of running processes. Replica connections have Binlog Dump in the Command field:

mysql> SHOW PROCESSLIST \G;
*************************** 4. row ***************************
 Id: 10
 User: root
 Host: replica1:58371
 db: NULL
Command: Binlog Dump
 Time: 777
 State: Has sent all binlog to slave; waiting for binlog to be updated
 Info: NULL

Because it is the replica that drives the replication process, very little information is available in this report.

3.Pausing Replication on the Replica

關閉

mysql> STOP SLAVE;
Or from MySQL 8.0.22:
mysql> STOP REPLICA;

You can pause the I/O (receiver) or SQL (applier) thread individually by specifying the thread type:

mysql> STOP SLAVE IO_THREAD;
mysql> STOP SLAVE SQL_THREAD;
Or from MySQL 8.0.22:
mysql> STOP REPLICA IO_THREAD;
mysql> STOP REPLICA SQL_THREAD;

開啟

mysql> START SLAVE;
Or from MySQL 8.0.22:
mysql> START REPLICA;

To start a particular thread, specify the thread type:

mysql> START SLAVE IO_THREAD;
mysql> START SLAVE SQL_THREAD;
Or from MySQL 8.0.22:
mysql> START REPLICA IO_THREAD;
mysql> START REPLICA SQL_THREAD;

4.Skipping Transactions的準備工作

If replication stops due to an issue with an event in a replicated transaction, you can resume replication by skipping the failed transaction on the replica. Before skipping a transaction, ensure that the replication I/O (receiver) thread is stopped as well as the SQL (applier) thread.

First you need to identify the replicated event that caused the error. Details of the error and the last successfully applied transaction are recorded in the Performance Schema table replication_applier_status_by_worker. You can use mysqlbinlog to retrieve and display the events that were logged around the time of the error.Alternatively, you can issue SHOW RELAYLOG EVENTS on the replica or SHOW BINLOG EVENTS on the source.

Before skipping the transaction and restarting the replica, check these points:

• Is the transaction that stopped replication from an unknown or untrusted source? If so, investigate the cause in case there are any security considerations that indicate the replica should not be restarted.

• Does the transaction that stopped replication need to be applied on the replica? If so, either make the appropriate corrections and reapply the transaction, or manually reconcile the data on the replica.

• Did the transaction that stopped replication need to be applied on the source? If not, undo the transaction manually on the server where it originally took place.

5.Replication Threads

MySQL replication capabilities are implemented using three main threads, one on the source server and two on the replica:

• Binary log dump thread. The source creates a thread to send the binary log contents to a replica when the replica connects. This thread can be identified in the output of SHOW PROCESSLIST on the source as the Binlog Dump thread.

The binary log dump thread acquires a lock on the source's binary log for reading each event that is to be sent to the replica. As soon as the event has been read, the lock is released, even before the event is sent to the replica.

• Replication I/O receiver thread. When a START REPLICA statement is issued on a replica server, the replica creates an I/O (receiver) thread, which connects to the source and asks it to send the updates recorded in its binary logs.

The replication receiver thread reads the updates that the source's Binlog Dump thread sends (see previous item) and copies them to local files that comprise the replica's relay log.

The state of this thread is shown as Slave_IO_running in the output of SHOW REPLICA STATUS.

• Replication SQL applier thread. The replica creates an SQL (applier) thread to read the relay log that is written by the replication receiver thread and execute the transactions contained in it.

There are three main threads for each connection between a source and a replica. A source that has multiple【ˈmʌltɪpl 數量多的;多種多樣的;】 replicas creates one binary log dump thread for each replica currently connected; each replica has its own replication receiver and applier threads.

A replica uses two threads to separate reading updates from the source and executing them into independent【ˌɪndɪˈpendənt 獨立的;自主的;公正的;自治的;自立的;私營的;不相關的;無黨派的;不受影響的;無關聯的;無偏見的;不相干的人所做的(或提供的);】 tasks. Thus, the task of reading transactions is not slowed down if the process of applying them is slow. For example, if the replica server has not been running for a while, its receiver thread can quickly fetch all the binary log contents from the source when the replica starts, even if the applier thread lags far behind. If the replica stops before the SQL thread has executed all the fetched statements, the receiver thread has at least fetched everything so that a safe copy of the transactions is stored locally in the replica's relay logs, ready for execution the next time that the replica starts.

You can enable further parallelization for tasks on a replica by setting the system variable replica_parallel_workers (MySQL 8.0.26 or later) or slave_parallel_workers (prior to MySQL 8.0.26) to a value greater than 0 (the default). When this system variable is set, the replica creates the specified number of worker threads to apply transactions, plus a coordinator thread to manage them. If you are using multiple replication channels, each channel has this number of threads. A replica with replica_parallel_workers or slave_parallel_workers set to a value greater than 0 is called a multithreaded replica. With this setup, transactions that fail can be retried.

6.Monitoring Replication Main Threads

The SHOW PROCESSLIST statement provides information that tells you what is happening on the source and on the replica regarding replication.

The following example illustrates【ˈɪləstreɪts (用示例、圖畫等)說明，解釋;顯示…存在;加插圖於;給(書等)做圖表;表明…真實;】 how the three main replication threads, the binary log dump thread, replication I/O (receiver) thread, and replication SQL (applier) thread, show up in the output from SHOW PROCESSLIST.

On the source server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
 Id: 2
 User: root
 Host: localhost:32931
 db: NULL
Command: Binlog Dump
 Time: 94
 State: Has sent all binlog to slave; waiting for binlog to
 be updated
 Info: NULL

Here, thread 2 is a Binlog Dump thread that services a connected replica. The State information indicates that all outstanding updates have been sent to the replica and that the source is waiting for more updates to occur. If you see no Binlog Dump threads on a source server, this means that replication is not running; that is, no replicas are currently connected.

On a replica server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
 Id: 10
 User: system user
 Host:
 db: NULL
Command: Connect
 Time: 11
 State: Waiting for master to send event
 Info: NULL
*************************** 2. row ***************************
 Id: 11
 User: system user
 Host:
 db: NULL
Command: Connect
 Time: 11
 State: Has read all relay log; waiting for the slave I/O
 thread to update it
 Info: NULL

The State information indicates that thread 10 is the replication I/O (receiver) thread that is communicating with the source server, and thread 11 is the replication SQL (applier) thread that is processing the updates stored in the relay logs. At the time that SHOW PROCESSLIST was run, both threads were idle, waiting for further updates.

The value in the Time column can show how late the replica is compared to the source.If sufficient time elapses on the source side without activity on the Binlog Dump thread, the source determines that the replica is no longer connected. As for any other client connection, the timeouts for this depend on the values of net_write_timeout and net_retry_count.

7.Monitoring Replication Applier Worker Threads

On a multithreaded replica, the Performance Schema tables replication_applier_status_by_coordinator and replication_applier_status_by_worker show status information for the replica's coordinator thread and applier worker threads respectively. For a replica with multiple channels, the threads for each channel are identified.

A multithreaded replica's coordinator thread also prints statistics to the replica's error log on a regular basis if the verbosity setting is set to display informational messages. The statistics are printed depending on the volume of events that the coordinator thread has assigned to applier worker threads, with a maximum frequency of once every 120 seconds. The message lists the following statistics for the relevant replication channel, or the default replication channel (which is not named):

statistics	描述
Seconds elapsed	The difference in seconds between the current time and the last time this information was printed to the error log.
Events assigned	The total number of events that the coordinator thread has queued to all applier worker threads since the coordinator thread was started.
Worker queues filled over overrun level	The current number of events that are queued to any of the applier worker threads in excess of the overrun level, which is set at 90% of the maximum queue length of 16384 events. If this value is zero, no applier worker threads are operating at the upper limit of their capacity.
Waited due to worker queue full	The number of times that the coordinator thread had to wait to schedule an event because an applier worker thread's queue was full. If this value is zero, no applier worker threads exhausted their capacity.
Waited due to the total size	The number of times that the coordinator thread had to wait to schedule an event because the replica_pending_jobs_size_max or slave_pending_jobs_size_max limit had been reached. This system variable sets the maximum amount of memory (in bytes) available to applier worker thread queues holding events not yet applied. If an unusually large event exceeds this size, the transaction is held until all the applier worker threads have empty queues, and then processed. All subsequent transactions are held until the large transaction has been completed.
Waited at clock conflicts	The number of nanoseconds that the coordinator thread had to wait to schedule an event because a transaction that the event depended on had not yet been committed. If replica_parallel_type or slave_parallel_type is set to DATABASE (rather than LOGICAL_CLOCK), this value is always zero.
Waited (count) when workers occupied	The number of times that the coordinator thread slept for a short period, which it might do in two situations. The first situation is where the coordinator thread assigns an event and finds the applier worker thread's queue is filled beyond the underrun level of 10% of the maximum queue length, in which case it sleeps for a maximum of 1 millisecond. The second situation is where replica_parallel_type or slave_parallel_type is set to LOGICAL_CLOCK and the coordinator thread needs to assign the first event of a transaction to an applier worker thread's queue, it only does this to a worker with an empty queue, so if no queues are empty, the coordinator thread sleeps until one becomes empty.
Waited when workers occupied	The number of nanoseconds that the coordinator thread slept while waiting for an empty applier worker thread queue (that is, in the second situation described above, where replica_parallel_type or slave_parallel_type is set to LOGICAL_CLOCK and the first event of a transaction needs to be assigned).

8.The Relay Log

The replica's relay log, which is written by the replication I/O (receiver) thread, contains the transactions read from the replication source server's binary log. The transactions in the relay log are applied on the replica by the replication SQL (applier) thread.

The relay log, like the binary log, consists of a set of numbered files containing events that describe database changes, and an index file that contains the names of all used relay log files. The default location for relay log files is the data directory.

The term “relay log file” generally denotes an individual numbered file containing database events. The term “relay log” collectively denotes the set of numbered relay log files plus the index file.

Relay log files have the same format as binary log files and can be read using mysqlbinlog【可被解析讀取】.If binary log transaction compression (available as of MySQL 8.0.20) is in use, transaction payloads written to the relay log are compressed in the same way as for the binary log.

For the default replication channel, relay log file names have the default form host_name-relaybin.nnnnnn, where host_name is the name of the replica server host and nnnnnn is a sequence number. Successive relay log files are created using successive sequence numbers, beginning with 000001. For non-default replication channels, the default base name is host_name-relaybin-channel, where channel is the name of the replication channel recorded in the relay log.--【檔案命名格式】

The replica uses an index file to track the relay log files currently in use. The default relay log index file name is host_name-relay-bin.index for the default channel, and host_name-relaybin-channel.index for non-default replication channels.

If a replica uses the default host-based relay log file names, changing a replica's host name after replication has been set up can cause replication to fail with the errors Failed to open the relay log and Could not find target log during relay log initialization. This is a known issue (see Bug #2122). If you anticipate that a replica's host name might change in the future (for example, if networking is set up on the replica such that its host name can be modified using DHCP), you can avoid this issue entirely by using the relay_log and relay_log_index system variables to specify relay log file names explicitly when you initially set up the replica. This causes the names to be independent of server host name changes.

9.Replication Metadata Repositories【riˈpɑzəˌtɔriz 倉庫;存放處;貯藏室;智囊;知識寶典;學識淵博的人;】

概述

• The replica's connection metadata repository contains information that the replication receiver thread needs to connect to the replication source server and retrieve transactions from the source's binary log. The connection metadata repository is written to the mysql.slave_master_info table.

• The replica's applier metadata repository contains information that the replication applier thread needs to read and apply transactions from the replica's relay log. The applier metadata repository is written to the mysql.slave_relay_log_info table.

The replica's connection metadata repository and applier metadata repository are collectively known as the replication metadata repositories.

Making replication resilient to unexpected halts. The mysql.slave_master_info and mysql.slave_relay_log_info tables are created using the transactional storage engine InnoDB. Updates to the replica's applier metadata repository table are committed together with the transactions, meaning that the replica's progress information recorded in that repository is always consistent with what has been applied to the database, even in the event of an unexpected server halt.

詳說

A replica server creates two replication metadata repositories, the connection metadata repository and the applier metadata repository. The replication metadata repositories survive a replica server's shutdown. If binary log file position based replication is in use, when the replica restarts, it reads the two repositories to determine how far it previously proceeded in reading the binary log from the source and in processing its own relay log. If GTID-based replication is in use, the replica does not use the replication metadata repositories for that purpose, but does need them for the other metadata that they contain.

• The replica's connection metadata repository contains information that the replication I/O (receiver) thread needs to connect to the replication source server and retrieve transactions from the source's binary log. The metadata in this repository includes the connection configuration, the replication user account details, the SSL settings for the connection, and the file name and position where the replication receiver thread is currently reading from the source's binary log.

• The replica's applier metadata repository contains information that the replication SQL (applier) thread needs to read and apply transactions from the replica's relay log. The metadata in this repository includes the file name and position up to which the replication applier thread has executed the transactions in the relay log, and the equivalent position in the source's binary log. It also includes metadata for the process of applying transactions, such as the number of worker threads and the PRIVILEGE_CHECKS_USER account for the channel.

The connection metadata repository is written to the slave_master_info table in the mysql system schema, and the applier metadata repository is written to the slave_relay_log_info table in the mysql system schema. A warning message is issued if mysqld is unable to initialize the tables for the replication metadata repositories, but the replica is allowed to continue starting. This situation is most likely to occur when upgrading from a version of MySQL that does not support the use of tables for the repositories to one in which they are supported.

RESET REPLICA clears the data in the replication metadata repositories, with the exception of the replication connection parameters (depending on the MySQL Server release).

Before MySQL 8.0, to create the replication metadata repositories as tables, it was necessary to specify master_info_repository=TABLE and relay_log_info_repository=TABLE at server startup. Otherwise, the repositories were created as files in the data directory named master.info and relaylog.info, or with alternative names and locations specified by the --master-info-file option and relay_log_info_file system variable. From MySQL 8.0, creating the replication metadata repositories as tables is the default, and the use of all these system variables is deprecated.

The mysql.slave_master_info and mysql.slave_relay_log_info tables are created using the InnoDB transactional storage engine. Updates to the applier metadata repository table are committed together with the transactions, meaning that the replica's progress information recorded in that repository is always consistent with what has been applied to the database, even in the event of an unexpected server halt.

補充說明

1. Do not attempt to update or insert rows in the mysql.slave_master_info or mysql.slave_relay_log_info tables manually. Doing so can cause undefined behavior, and is not supported. Execution of any statement requiring a write lock on either or both of the slave_master_info and slave_relay_log_info tables is disallowed while replication is ongoing (although statements that perform only reads are permitted at any time).

2. Access privileges for the connection metadata repository table mysql.slave_master_info should be restricted to the database administrator, because it contains the replication user account name and password for connecting to the source. Use a restricted access mode to protect database backups that include this table. From MySQL 8.0.21, you can clear the replication user account credentials from the connection metadata repository, and instead always provide them using the START REPLICA statement or START GROUP_REPLICATION statement that starts the replication channel. This approach means that the replication channel always needs operator intervention to restart, but the account name and password are not recorded in the replication metadata repositories.

備份時

When you back up the replica's data or transfer a snapshot of its data to create a new replica, ensure that you include the mysql.slave_master_info and mysql.slave_relay_log_info tables containing the replication metadata repositories. For cloning operations, note that when the replication metadata repositories are created as tables, they are copied to the recipient during a cloning operation, but when they are created as files, they are not copied. When binary log file position based replication is in use, the replication metadata repositories are needed to resume replication after restarting the restored, copied, or cloned replica. If you do not have the relay log files, but still have the applier metadata repository, you can check it to determine how far the replication SQL thread has executed in the source's binary log. Then you can use a CHANGE REPLICATION SOURCE TO statement (from MySQL 8.0.23) or CHANGE MASTER TO statement (before MySQL 8.0.23) with the SOURCE_LOG_FILE | MASTER_LOG_FILE and SOURCE_LOG_POS | MASTER_LOG_POS options to tell the replica to re-read the binary logs from the source from that point (provided that the required binary logs still exist on the source).

One additional repository, the applier worker metadata repository, is created primarily for internal use, and holds status information about worker threads on a multithreaded replica. The applier worker metadata repository includes the names and positions for the relay log file and the source's binary log file for each worker thread. If the applier metadata repository is created as a table, which is the default, the applier worker metadata repository is written to the mysql.slave_worker_info table. If the applier metadata repository is written to a file, the applier worker metadata repository is written to the worker-relaylog.info file. For external use, status information for worker threads is presented in the Performance Schema replication_applier_status_by_worker table.