MySQL concurrent thread control: limiting the thread_running count

Posted by myownstars on 2015-03-08

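The previous two posts summarized how innodb_thread_concurrency and the thread pool work:

The former limits the number of concurrently executing threads at the storage engine layer. This sits too late in the code path, because by then the query has already been parsed in the server layer.

The latter creates several groups of resident threads in the server layer that receive the queries sent over client connections and execute them on the connections' behalf, instead of creating a dedicated thread for each connection.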

 

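Besides these two solutions, the number of running threads can also be checked in the server layer: once it reaches a threshold, the query either fails immediately with an error or sleeps.

Below is an overview of how this works, together with the patch source code. Source: http://www.gpfeng.com/?p=434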

 

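What thread_running means

The thread_running status variable records the number of statements/commands currently executing concurrently: it is incremented by 1 before execution and decremented by 1 afterward.

Code logic

do_command

--> dispatch_command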

    ...

    inc_thread_running

    ...

    mysql_execute_command or execute_some_command

    ...

    dec_thread_running

    ...
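The call chain above can be sketched roughly as follows. This is only an illustration, not the server's code: the std::atomic counter and the templated dispatch_command are stand-ins assumed for this sketch, and only the names inc_thread_running / dec_thread_running come from the patch.

```cpp
#include <atomic>

// Hypothetical stand-in for the server's global thread_running counter.
static std::atomic<int> thread_running{0};

// Increment the counter and return the new value.
inline int inc_thread_running() { return thread_running.fetch_add(1) + 1; }

// Decrement the counter once the statement has finished.
inline void dec_thread_running() { thread_running.fetch_sub(1); }

// Simplified stand-in for dispatch_command(): the statement is counted
// as "running" only while execute() is in progress.
template <typename Fn>
int dispatch_command(Fn execute) {
  // Value of the counter before this statement was counted.
  int running_before = inc_thread_running() - 1;
  execute();
  dec_thread_running();
  return running_before;
}
```

The value returned by dispatch_command here plays the same role as the tr variable that the patch in Note 1 passes to thread_running_control.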

 

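Causes of a sudden spike in thread_running:

1 A surge in client connections;

2 A system performance bottleneck, such as CPU, IO, or memory swapping;

3 Abnormal SQL.

In these situations the MySQL server often appears to hang, even though it has not.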

 

 

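Solution

Temporarily block new SQL from executing. To do this, introduce two thresholds, low_watermark and high_watermark, and a variable threads_running_ctl_mode (SELECTS or ALL).

Before executing a query, check thread_running:

1  If it has reached the high_watermark threshold, refuse to execute and return the error: mysql server is too busy

2  If it is between low and high, sleep 5ms and then retry; once the query has waited 100ms in total, it is executed

3  Sessions that already have an open transaction, and super users, are not restricted

4  threads_running_ctl_mode controls which query types are affected: SELECTS/ALL. The default is SELECTS, meaning only SELECT statements are affected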
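The rules above boil down to a small decision function. The following is a minimal sketch under assumptions: Decision, Limits, and decide are names invented for this illustration, and the caller is expected to do the actual 5ms sleep and re-read thread_running before calling again. The 20-retry cap corresponds to 20 x 5ms = 100ms.

```cpp
// Outcome of one admission check (hypothetical type for this sketch).
enum class Decision { Admit, SleepAndRetry, Reject };

// Snapshot of the tunables; low == 0 disables low-watermark throttling.
struct Limits {
  unsigned long low;
  unsigned long high;
  bool selects_only;  // true: SELECTS mode, false: ALL mode
};

// tr: thread_running as seen before this statement was counted.
// exempt: super user, slave thread, or a transaction that is already open.
// slept_cnt: how many 5ms sleeps this statement has already done.
Decision decide(unsigned long tr, const Limits& lim, bool is_select,
                bool exempt, int slept_cnt) {
  if (exempt) return Decision::Admit;

  if (tr >= lim.high) {
    // In SELECTS mode only SELECT statements may be rejected.
    if (lim.selects_only && !is_select) return Decision::Admit;
    return Decision::Reject;
  }

  if (lim.low != 0 && tr >= lim.low) {
    // After 20 sleeps (100ms in total) the statement is let in.
    return slept_cnt >= 20 ? Decision::Admit : Decision::SleepAndRetry;
  }

  return Decision::Admit;
}
```

With (low, high) = (20, 80) in SELECTS mode, a SELECT arriving at tr = 50 would sleep and retry, while the same SELECT at tr = 80 would be rejected.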

See Note 1 for part of the patch source.

 

 

Further improvement

http://www.gpfeng.com/?p=499

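The low-watermark throttling is optimized from sleep-retry to FIFO-based cond-wait/signal (implemented with 8 FIFOs).

1 High-watermark throttling (unchanged).

2 Low-watermark optimization. Other solutions exist: MariaDB developed a thread pool, and Percona implemented a priority queue on top of it.

Advantages of this patch: the idea is the same as the thread pool, but the code is more compact (fewer than 1000 lines), and it adds filtering for specific queries.

See Note 2 for part of the patch code.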

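Low-watermark optimization details

1 A new thread_active counter records the number of concurrent threads. It is updated in mysql_execute_command (after the SQL has been parsed), whereas the high watermark is checked before the query is parsed.

thread_active only counts SELECT/DML statements; COMMIT/ROLLBACK are let through.

2 A FIFO is used: when thread_active >= thread_running_low_watermark, the thread enters the FIFO and waits; other threads wake up the FIFO after they finish executing their SQL.

This keeps the number of concurrent threads within thread_running_low_watermark. threads_running_wait_timeout is also introduced to cap how long a thread may wait in the FIFO; on timeout the query fails with an error.

3 Eight FIFOs are used, which reduces lock contention when entering and leaving a FIFO. Threads are assigned to the FIFOs round-robin, and each queue limits its concurrently executing threads to threads_running_low_watermark/8.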
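To make point 3 concrete, here is a single-threaded sketch of how statements might be spread over eight queues, each with its own share of the low watermark. ShardedLimiter and ConcQueue are names assumed for this illustration; locking and the cond-wait itself are left out, and only the query_id-modulo-queue-count selection and the "limit of 0 means no throttling" rule are taken from the patch in Note 2.

```cpp
#include <array>
#include <cstdint>

constexpr unsigned kQueueCount = 8;  // the patch uses 8 FIFOs

// Per-queue counters (hypothetical, simplified).
struct ConcQueue {
  int active = 0;   // statements currently executing via this queue
  int waiting = 0;  // statements that would be parked in the FIFO
};

class ShardedLimiter {
 public:
  // Each queue gets low_watermark / 8 slots. In this sketch a watermark
  // below 8 yields a per-queue limit of 0, which disables throttling.
  explicit ShardedLimiter(int low_watermark)
      : per_queue_limit_(low_watermark / static_cast<int>(kQueueCount)) {}

  // Returns true if the statement may run now, false if it would wait.
  bool try_enter(std::uint64_t query_id) {
    ConcQueue& q = queue_for(query_id);
    if (per_queue_limit_ == 0 || q.active < per_queue_limit_) {
      ++q.active;
      return true;
    }
    ++q.waiting;
    return false;
  }

  // Called when an admitted statement finishes.
  void leave(std::uint64_t query_id) { --queue_for(query_id).active; }

  const ConcQueue& queue(unsigned i) const { return queues_[i]; }

 private:
  // Consecutive query ids land on consecutive queues (round-robin).
  ConcQueue& queue_for(std::uint64_t query_id) {
    return queues_[query_id % kQueueCount];
  }

  std::array<ConcQueue, kQueueCount> queues_{};
  int per_queue_limit_;
};
```

Because every queue has its own counters, a real implementation can protect each queue with its own lock, so threads mapped to different queues do not contend with each other.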

 

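A thread that has passed the high-watermark check starts executing its query [after parsing, the low-watermark check is performed; if it passes, the query is executed]. Once the current SQL finishes, the thread may issue a new query, in which case the bracketed process repeats.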

 

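New system variable

threads_running_wait_timeout: the maximum time a query may queue in the FIFO; once the wait times out the SQL is rejected. The default is 100, in milliseconds (ms).

New status variables

threads_active: the number of threads currently executing SELECT/INSERT/UPDATE/DELETE concurrently;

threads_wait: the number of threads currently waiting in the FIFO;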

 

Test results

./sysbench --test=tests/db/select.lua --max-requests=0 --mysql-host=myxxxx.cm3 --mysql-user=test --mysql-table-engine=innodb --oltp-table-size=5000000 --oltp-tables-count=32

normal mysql-0 : unpatched version, with innodb_thread_concurrency=0

normal mysql-1 : unpatched version, with innodb_thread_concurrency=32

patched mysql : version with the low-watermark throttling patch (no more than 64 active threads)

 

 

 

Note 1

http://www.gpfeng.com/wp-content/uploads/2013/09/threads_running_control.txt

+static my_bool thread_running_control(THD *thd, ulong tr)
+{
+  int slept_cnt= 0;
+  ulong tr_low, tr_high;
+  DBUG_ENTER("thread_running_control");
+  
+  /* 
+    Super user/slave thread will not be affected at any time,
+    transactions that have already started will continue.
+  */
+  if ( thd->security_ctx->master_access & SUPER_ACL|| // no limit for users with the SUPER privilege or for transactions that are already open
+      thd->in_active_multi_stmt_transaction() ||
+      thd->slave_thread)  
+    DBUG_RETURN(FALSE);
+
+  /* 
+    To promise that tr_low will never be greater than tr_high, 
+    as values may be changed between these two statements.
+    eg. 
+        (low, high) = (200, 500)
+        1. read low = 200
+        2. other sessions: set low = 20; set high = 80
+        3. read high = 80
+    Don't take a lock here to avoid lock contention.
+  */
+  do 
+  {
+    tr_low= thread_running_low_watermark;
+    tr_high= thread_running_high_watermark;
+
+  } while (tr_low > tr_high);
+
+check_buzy:

+  /* tr_high is promised to be non-zero.*/ 
+  if ((tr_low == 0 && tr < tr_high) || (tr_low != 0 && tr < tr_low))
+    DBUG_RETURN(FALSE);
+  
+  if (tr >= tr_high)
+  { 
+    int can_reject= 1;
+
+    /* thread_running_ctl_mode: 0 -> SELECTS, 1 -> ALL. */
+    if (thread_running_ctl_mode == 0)
+    {
+      int query_is_select= 0;
+      if (thd->query_length() >= 8)
+      {
+        char *p= thd->query();  // read the first 6 characters of the query text to decide whether it is a SELECT
+        if (my_toupper(system_charset_info, p[0]) == 'S' &&
+            my_toupper(system_charset_info, p[1]) == 'E' &&
+            my_toupper(system_charset_info, p[2]) == 'L' &&
+            my_toupper(system_charset_info, p[3]) == 'E' &&
+            my_toupper(system_charset_info, p[4]) == 'C' &&
+            my_toupper(system_charset_info, p[5]) == 'T')
+
+          query_is_select= 1;
+      }
+
+      if (!query_is_select)
+        can_reject= 0;
+    }
+
+    if (can_reject)
+    {
+      inc_thread_rejected();
+      DBUG_RETURN(TRUE);
+    }
+    else
+      DBUG_RETURN(FALSE);
+  }
+    
+  if (tr_low != 0 && tr >= tr_low)
+  {
+    /* 
+      If total slept time exceed 100ms and thread running does not
+      reach high watermark, let it in.
+    */
+    if (slept_cnt >= 20)
+      DBUG_RETURN(FALSE);
+    
+    dec_thread_running();
+    
+    /* wait for 5ms. */
+    my_sleep(5000UL); 
+
+    slept_cnt++;
+    tr= inc_thread_running() - 1;
+    
+    goto check_buzy;
+  }
+
+  DBUG_RETURN(FALSE);
+}
+
+/**
   Perform one connection-level (COM_XXXX) command.
   @param command         type of command to perform
@@ -1016,7 +1126,8 @@
   thd->set_query_id(get_query_id());
   if (!(server_command_flags[command] & CF_SKIP_QUERY_ID))
     next_query_id();
-  inc_thread_running();
+  /* remember old value of thread_running for *thread_running_control*. */
+  int32 tr= inc_thread_running() - 1;
   if (!(server_command_flags[command] & CF_SKIP_QUESTIONS))
     statistic_increment(thd->status_var.questions, &LOCK_status);


@@ -1129,6 +1240,13 @@
   {
     if (alloc_query(thd, packet, packet_length))
       break;                                 // fatal error is set
+
+    if (thread_running_control(thd, (ulong)tr))
+    {
+      my_error(ER_SERVER_THREAD_RUNNING_TOO_HIGH, MYF(0));
+      break;
+    }
+
     MYSQL_QUERY_START(thd->query(), thd->thread_id, (char *) (thd->db ? thd->db : ""),  &thd->security_ctx->priv_user[0])



Note 2
http://www.gpfeng.com/wp-content/uploads/2014/01/tr-control.diff_.txt  
+/**
   Perform one connection-level (COM_XXXX) command.
 
   @param command         type of command to perform
@@ -1177,7 +1401,7 @@
     command= COM_SHUTDOWN;
   }
   thd->set_query_id(next_query_id());
-  inc_thread_running();
+  int32 tr= inc_thread_running();
 
   if (!(server_command_flags[command] & CF_SKIP_QUESTIONS))
     statistic_increment(thd->status_var.questions, &LOCK_status);
@@ -1209,6 +1433,15 @@
     goto done;
   }
 
+  if (command == COM_QUERY && alloc_query(thd, packet, packet_length))
+    goto endof_case;                 // fatal error is set
+
+  if (thread_running_control_high(thd, tr))
+  {
+    my_error(ER_SERVER_THREAD_RUNNING_TOO_HIGH, MYF(0));
+    goto endof_case;
+  }
+
   switch (command) {
   case COM_INIT_DB:
   {
@@ -1311,8 +1544,6 @@
   }
   case COM_QUERY:
   {
-    if (alloc_query(thd, packet, packet_length))
-      break;                                 // fatal error is set
     MYSQL_QUERY_START(thd->query(), thd->thread_id,
                       (char *) (thd->db ? thd->db : ""),
                       &thd->security_ctx->priv_user[0],
@@ -1751,6 +1982,7 @@
     my_message(ER_UNKNOWN_COM_ERROR, ER(ER_UNKNOWN_COM_ERROR), MYF(0));
     break;
   }
+endof_case:
 
 done:
   DBUG_ASSERT(thd->derived_tables == NULL &&
@@ -2502,12 +2734,37 @@
   Opt_trace_array trace_command_steps(&thd->opt_trace, "steps");
 
   DBUG_ASSERT(thd->transaction.stmt.cannot_safely_rollback() == FALSE);
+  bool count_active= false;
 
   if (need_traffic_control(thd, lex->sql_command))
   {
     thd->killed = THD::KILL_QUERY;
     goto error;
   }
+
+  switch (lex->sql_command) {
+
+  case SQLCOM_SELECT:
+  case SQLCOM_UPDATE:
+  case SQLCOM_UPDATE_MULTI:
+  case SQLCOM_DELETE:
+  case SQLCOM_DELETE_MULTI:
+  case SQLCOM_INSERT:
+  case SQLCOM_INSERT_SELECT:
+  case SQLCOM_REPLACE:
+  case SQLCOM_REPLACE_SELECT:
+    count_active= true;
+    break;
+  default:
+    break;
+  }
+
+  if (count_active && thread_running_control_low_enter(thd))
+  {
+    my_error(ER_SERVER_THREAD_RUNNING_TOO_HIGH, myf(0));
+    goto error;
+  }
+
   status_var_increment(thd->status_var.com_stat[lex->sql_command]);
 
   switch (gtid_pre_statement_checks(thd))
@@ -4990,6 +5247,9 @@
 
 finish:
 
+  if (count_active)
+    thread_running_control_low_exit(thd);
+
   DBUG_ASSERT(!thd->in_active_multi_stmt_transaction() ||
                thd->in_multi_stmt_transaction_mode());
 
 


+static my_bool thread_running_control_high(THD *thd, int32 tr)
+{
+  int32 tr_high;
+  DBUG_ENTER("thread_running_control_high");
+
+  tr_high= (int32)thread_running_high_watermark;
+
+  /* thread_running_ctl_mode: 0 -> SELECTS, 1 -> ALL. */
+  if ((!tr_high || tr <= tr_high) ||
+      thd->transaction.is_active() ||
+      thd->get_command() != COM_QUERY ||
+      thd->security_ctx->master_access & SUPER_ACL ||
+      thd->slave_thread)
+    DBUG_RETURN(FALSE);
+
+  const char *query= thd->query();
+  uint32 len= thd->query_length();
+
+  if ((!has_prefix(query, len, "SELECT", 6) && thread_running_ctl_mode == 0) || // no longer compared character by character
+      has_prefix(query, len, "COMMIT", 6) ||
+      has_prefix(query, len, "ROLLBACK", 8))
+    DBUG_RETURN(FALSE);
+
+  /* confirm again*/
+  if (tr > tr_high && get_thread_running() > tr_high)
+  {
+    __sync_add_and_fetch(&thread_rejected, 1);
+    DBUG_RETURN(TRUE);
+  }
+
+  DBUG_RETURN(FALSE);
+}
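
The excerpt calls has_prefix but does not show its definition. A helper with that call shape could look like the sketch below. It is an assumption, not the patch's code: in particular, whether the real helper ignores case or skips leading whitespace is not visible here, and this version is case-insensitive and does not skip whitespace.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical helper: true if the first prefix_len bytes of str match
// prefix, ignoring ASCII case. str does not need to be NUL-terminated.
static bool has_prefix(const char* str, std::size_t len,
                       const char* prefix, std::size_t prefix_len) {
  if (len < prefix_len) return false;
  for (std::size_t i = 0; i < prefix_len; ++i) {
    int a = std::toupper(static_cast<unsigned char>(str[i]));
    int b = std::toupper(static_cast<unsigned char>(prefix[i]));
    if (a != b) return false;
  }
  return true;
}
```

A prefix test like this is cheap enough to run before parsing, but it will not recognize a SELECT that is preceded by whitespace, a comment, or an opening parenthesis.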
+
 


+static my_bool thread_running_control_low_enter(THD *thd)
+{
+  int res= 0;
+  int32 tr_low;
+  my_bool ret= FALSE;
+  my_bool slept= FALSE;
+  struct timespec timeout;
+  Thread_conc_queue *queue;
+  DBUG_ENTER("thread_running_control_low_enter");
+
+  /* update global status */
+  __sync_add_and_fetch(&thread_active, 1);
+
+  tr_low= (int32)queue_tr_low_watermark;
+  queue= thread_conc_queues + thd->query_id % N_THREAD_CONC_QUEUE;
+
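+  queue->lock(); // Q1: the FIFO is locked before the low-watermark check, so that a thread failing the check is not left unable to acquire the FIFO lock and enqueue itself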
+
+retry:
+
+  if ((!tr_low || queue->thread_active < tr_low) ||
+      (thd->lex->sql_command != SQLCOM_SELECT && thread_running_ctl_mode == 0) ||
+      (!slept && (thd->transaction.is_active() ||
+        thd->security_ctx->master_access & SUPER_ACL || thd->slave_thread)))
+  {
+    queue->thread_active++; // the thread does not need to enter the FIFO: bump thread_active right away, unlock the queue and return
+    queue->unlock();
+    DBUG_RETURN(ret);
+  }
+
+  if (!slept)
+  {
+    queue->unlock();
+
+    /* sleep for 500 us */
+    my_sleep(500);
+    slept= TRUE;
+    queue->lock();
+
+    goto retry;
+  }
+
+  /* get a free wait-slot */
+  Thread_wait_slot *slot= queue->pop_free();
+
+  /* can't find a free wait slot, must let the query enter */
+  if (!slot) // the FIFO is full, so this thread cannot be enqueued and the SQL must be let through to execute normally
+  {
+    queue->thread_active++;
+    queue->unlock();
+    DBUG_RETURN(ret);
+  }
+
+  slot->signaled= false;
+  slot->wait_ended= false;
+
+  /* put slot into waiting queue. */
+  queue->push_back_wait(slot);
+  queue->thread_wait++;
+
+  queue->unlock();
+
+  /* update global status */
+  thd_proc_info(thd, "waiting in server fifo");
+  __sync_sub_and_fetch(&thread_active, 1);
+  __sync_add_and_fetch(&thread_wait, 1);
+
+  /* cond-wait for at most thread_running_wait_timeout(ms). */
+  set_timespec_nsec(timeout, thread_running_wait_timeout_ns);
+
+  mysql_mutex_lock(&slot->mutex);
+  while (!slot->signaled)
+  {
+    res= mysql_cond_timedwait(&slot->cond, &slot->mutex, &timeout);
+    /* no need to signal if cond-wait timedout */
+    slot->signaled= true;
+  }
+  mysql_mutex_unlock(&slot->mutex);
+
+  queue->lock();
+  queue->thread_wait--;
+  queue->thread_active++;
+
+  /* remove slot from waiting queue. */
+  queue->remove_wait(slot);
+  /* put slot into the free queue for reuse. */
+  queue->push_back_free(slot);
+
+  queue->unlock();
+
+  /* update global status */
+  __sync_sub_and_fetch(&thread_wait, 1);
+  __sync_add_and_fetch(&thread_active, 1);
+  thd_proc_info(thd, 0);
+
+  if (res == ETIMEDOUT || res == ETIME)
+  {
+    ret= TRUE; // indicate that query is rejected.
+    __sync_add_and_fetch(&thread_rejected, 1);
+  }
+
+  DBUG_RETURN(ret);
+}


From the "ITPUB blog", link: http://blog.itpub.net/15480802/viewspace-1452265/. If you repost this article, please credit the source; otherwise legal action may be pursued.
