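After enabling parallel replication in MySQL, I noticed that the reported replication delay (Seconds_Behind_Master) never drops to 0, even though there are no large transactions. Many articles have already summarized the cause; I am recording it here.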

In both parallel and non-parallel mode, the replication delay is computed by the following code:

{
  long time_diff= ((long)(time(0) - mi->rli->last_master_timestamp)
                   - mi->clock_diff_with_master);
  protocol->store((longlong)(mi->rli->last_master_timestamp ? max(0L, time_diff) : 0));
}
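To make the arithmetic easier to check by hand, here is a minimal standalone sketch of the same expression. The function name and parameters are hypothetical; only the formula is taken from the excerpt above.

```cpp
#include <algorithm>
#include <ctime>

// Hypothetical helper mirroring the expression quoted above.
//   now                     stands in for time(0) on the replica
//   last_master_timestamp   stands in for mi->rli->last_master_timestamp
//   clock_diff_with_master  stands in for mi->clock_diff_with_master
long long seconds_behind_master(std::time_t now,
                                std::time_t last_master_timestamp,
                                long clock_diff_with_master) {
  // A zero timestamp is always reported as a delay of 0.
  if (last_master_timestamp == 0) return 0;
  long time_diff =
      static_cast<long>(now - last_master_timestamp) - clock_diff_with_master;
  // Negative differences are clamped to 0.
  return std::max(0L, time_diff);
}
```

For example, with `now = 1000`, `last_master_timestamp = 990` and a clock difference of 3 seconds, the reported delay is 7. The result depends entirely on how fresh `last_master_timestamp` is.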

The difference lies in how last_master_timestamp is set. In non-parallel mode, or in parallel mode when last_master_timestamp == 0 (which happens when the GAQ is empty), the value is set as follows while the relay log event is being executed:

rli->last_master_timestamp= ev->common_header->when.tv_sec +
(time_t) ev->exec_time;

That is, the timestamp in the binlog event header plus the event's execution time.
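As a small illustration of that sum, the sketch below uses a hypothetical struct in place of the two event fields. It is not the real Log_event type.

```cpp
#include <ctime>

// Hypothetical stand-in for the two event fields used in the assignment above.
struct EventTimes {
  std::time_t when_sec;     // corresponds to ev->common_header->when.tv_sec
  unsigned long exec_time;  // corresponds to ev->exec_time
};

// Header timestamp plus execution time, as in the quoted assignment.
std::time_t last_master_timestamp_for(const EventTimes& ev) {
  return ev.when_sec + static_cast<std::time_t>(ev.exec_time);
}
```

An event stamped at 1700000000 that took 5 seconds to execute on the master yields 1700000005.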

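In parallel replication, last_master_timestamp is set in the function mts_checkpoint_routine. This function performs the checkpoint: it processes the jobs at the head of the GAQ and obtains the low-water mark (LWM).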

/*
    Update the rli->last_master_timestamp for reporting correct Seconds_behind_master.
    If GAQ is empty, set it to zero.
    Else, update it with the timestamp of the first job of the Slave_job_queue
    which was assigned in the Log_event::get_slave_worker() function.
  */
  ts= rli->gaq->empty()
    ? 0
    : reinterpret_cast<Slave_job_group*>(rli->gaq->head_queue())->ts;
  rli->reset_notified_checkpoint(cnt, ts, need_data_lock, true);
  /* end-of "Coordinator::"commit_positions" */

When the GAQ is empty the value is set to 0; otherwise it is set to the timestamp of the first job in the Slave_job_queue.
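The rule can be sketched in isolation as follows. A `std::deque` stands in for the GAQ and `JobGroup` for Slave_job_group; both are simplifications assumed for illustration, not the real data structures.

```cpp
#include <ctime>
#include <deque>

// Hypothetical simplification of Slave_job_group: only the timestamp matters here.
struct JobGroup {
  std::time_t ts;
};

// Timestamp a checkpoint would publish:
// 0 for an empty queue, otherwise the timestamp of the group at the head.
std::time_t checkpoint_timestamp(const std::deque<JobGroup>& gaq) {
  return gaq.empty() ? 0 : gaq.front().ts;
}
```

Note that the head of the queue is the oldest group still tracked, so the published timestamp reflects the oldest outstanding work rather than the newest.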

mts_checkpoint_routine is called from next_event, which decides whether to run it based on checkpoint_group and the mts_checkpoint_period parameter:

bool force= (rli->checkpoint_seqno > (rli->checkpoint_group - 1));
if (rli->is_parallel_exec() && (opt_mts_checkpoint_period != 0 || force))
{
  ulonglong period= static_cast<ulonglong>(opt_mts_checkpoint_period * 1000000ULL);
  mysql_mutex_unlock(&rli->data_lock);
  /*
    At this point the coordinator has is delegating jobs to workers and
    the checkpoint routine must be periodically invoked.
  */
  (void) mts_checkpoint_routine(rli, period, force, true/*need_data_lock=true*/); // TODO: ALFRANIO ERROR
  DBUG_ASSERT(!force ||
              (force && (rli->checkpoint_seqno <= (rli->checkpoint_group - 1))) ||
              sql_slave_killed(thd, rli));
  mysql_mutex_lock(&rli->data_lock);
}

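If the time elapsed since the last checkpoint is shorter than the period, the checkpoint is not executed and last_master_timestamp is not updated: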

if (!force && diff < period)
{
  /*
    We do not need to execute the checkpoint now because
    the time elapsed is not enough.
  */
  DBUG_RETURN(FALSE);
}
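The two conditions (a forced checkpoint, or enough elapsed time) can be sketched as plain functions. The function names are hypothetical. The sketch also assumes that the period option is given in milliseconds, as slave_checkpoint_period is documented, so the multiplication by 1000000 produces nanoseconds, and that `diff` is in the same unit. The excerpts above do not show how `diff` is computed.

```cpp
#include <cstdint>

// Forced once the sequence number reaches the group size.
// Equivalent to `seqno > group - 1` in the excerpt for group >= 1.
bool checkpoint_forced(std::uint64_t checkpoint_seqno,
                       std::uint64_t checkpoint_group) {
  return checkpoint_seqno >= checkpoint_group;
}

// Same scaling as in the excerpt: period option times 1000000
// (assumed here to convert milliseconds to nanoseconds).
std::uint64_t period_from_option(std::uint64_t period_ms) {
  return period_ms * 1000000ULL;
}

// The checkpoint runs only when forced or when at least `period` has elapsed.
bool should_run_checkpoint(bool force, std::uint64_t diff, std::uint64_t period) {
  return force || diff >= period;
}
```

With the default values of 300 for the period and 512 for the group size, a checkpoint is skipped while fewer than 300 ms have passed and fewer than 512 groups have been assigned since the last one.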

If the checkpoint is skipped or delayed, finished events are not accounted for in time, so last_master_timestamp stays relatively old and a nonzero delay is reported.
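The effect can be reproduced with a toy model. Everything below is a hypothetical simplification: times are whole seconds, the GAQ is a deque of master timestamps, and clock difference is ignored. It only illustrates how a skipped checkpoint leaves a stale timestamp behind.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>

// Toy model of the coordinator state; all names are hypothetical.
struct ToyCoordinator {
  std::deque<long> gaq;            // master timestamps of queued groups, oldest first
  long last_master_timestamp = 0;  // value read by the delay calculation
  long last_checkpoint_time = 0;   // when the last checkpoint actually ran
  long period = 0;                 // minimum spacing between checkpoints

  // Mimics the gate: skip unless forced or enough time has elapsed.
  // `done` is how many groups at the head the workers have finished.
  bool maybe_checkpoint(long now, std::size_t done, bool force = false) {
    if (!force && now - last_checkpoint_time < period) return false;
    done = std::min(done, gaq.size());
    gaq.erase(gaq.begin(), gaq.begin() + static_cast<std::ptrdiff_t>(done));
    last_master_timestamp = gaq.empty() ? 0 : gaq.front();
    last_checkpoint_time = now;
    return true;
  }

  // Same shape as the delay formula, without the clock difference.
  long reported_delay(long now) const {
    return last_master_timestamp ? std::max(0L, now - last_master_timestamp) : 0;
  }
};
```

Suppose two groups stamped 100 are queued and a checkpoint runs at time 100 with a period of 3. If the workers finish both groups at time 101, the checkpoint attempted then is skipped, so at time 102 the model still reports a delay of 2 even though nothing is pending. The delay returns to 0 only after the checkpoint at time 103.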

The delay reported under MySQL parallel replication is not accurate.
