InnoDB IO路徑原始碼分析

jesselyu發表於2017-04-07

InnoDB實現IO Flush透過“os_file_flush”宏收斂,macro展開後為”os_file_flush_func”。

接下來,我們重點看一下,還有其它哪些場景會呼叫到這個os_file_flush_func函式:

1. buf_dblwr_init_or_load_pages中,在double buffer 做crash恢復時,如果設定了reset_space_ids為”true”

2. fil_create_new_single_table_tablespace中,建立表資料檔案時

3. fil_user_tablespace_restore_page中,從double writer buffer中copy page時

4. fil_tablespace_iterate中,如果iterate page失敗時,會做一次sync

5. os_file_set_size中,在檔案新建立時初始化檔案大小時

從上面可以看出,直接呼叫os_file_flush_func還是非常少的。那麼系統中還有一些是呼叫os_file_flush_func的上層函式fil_flush:

1. buf_dblwr_flush_buffered_writes中,flush double write buffer到disk

2. buf_dblwr_write_single_page中,單頁flush到disk中

3. buf_flush_write_block_low中,flush block到disk中,一般在調double write buffer“buf_dblwr_flush_buffered_writes“之後呼叫

4. buf_LRU_remove_pages中,delete某個tablespace中的page時,做sync呼叫

5. fil_rename_tablespace中,rename一個tablespace時呼叫

6. fil_extend_space_to_desired_size中,extend space時

7. fil_flush_file_spaces中,flush 批次tablespace的page時,被ha_innodb的force checkpoint時呼叫,shutdown DB時呼叫等

8. log_io_complete中,log io完成時,checkpoint write以及其它,是否呼叫受sync方式影響,非commit依賴

9. log_write_up_to中,commit事務時,這個是重點,innodb 的log file都會呼叫,而且是同步

10. create_log_files_rename中,rename log file時

11. innobase_start_or_create_for_mysql中,建立新的檔案時

在fil_flush函式中,所有檔案的flush cache行為受fil_system->mutex保護。因此不管是data file還是log file,檔案級別的flush是串化的。那麼具體是怎麼來控制的呢?

1. 檢查file node中的n_pending_flushes,如果大於“0”就一直retry

2. 如果為“0”,進入flush階段,在正式開始flush之前,先將n_pending_flushe加“1”,這個操作受上文提到的“fil->mutex“保護

3. 呼叫“os_file_flush“ flush完成之後,n_pending_flushes減“1”,也同樣由“fil->mutex“保護

下面是MySQL中retry的程式碼:

retry:

if (node->n_pending_flushes > 0) {

/* We want to avoid calling os_file_flush() on

            the file twice at the same time,   because we do

            not know what bugs OS's may   contain in file

            i/o */

ib_int64_t sig_count =

                os_event_reset(node->sync_event);

            mutex_exit(&fil_system->mutex);

            os_event_wait_low(node->sync_event, sig_count);

            mutex_enter(&fil_system->mutex);

if (node->flush_counter >= old_mod_counter) {

goto skip_flush;

            }

goto retry;

        }

下面是MySQL中flush的程式碼:

        ut_a(node->open);

        file = node->handle;

        node->n_pending_flushes++;

        mutex_exit(&fil_system->mutex);

        os_file_flush(file);

        mutex_enter(&fil_system->mutex);

        os_event_set(node->sync_event);

        node->n_pending_flushes--;

        node->flush_size = node->size;

上面提到的邏輯,對於log file也同樣處理。Log file的tablespace為log space。對於log file,還受到log_sys->mutex的保護,在log_write_up_to函式中。

Log file的flush行為收斂到“log_write_up_to”函式體中,再呼叫fil_flush,最終走到os_file_flush_func。

MySQL在commit的時候,如果”innodb_flush_log_at_trx_commit=1”時,呼叫兩次同步的log_write_up_to,一次是innobase log file 的flush,一次是bin log的flush。

如果關掉bin log,則在ordered_commit函式中,不會走sync_blog分支。以下是關掉和不關掉時,innobase flush log file的執行路徑。

不帶binlog時的Innobase 的pstack如下:

fsync,os_file_fsync,os_file_flush_func,fil_flush,log_write_up_to,trx_flush_log_if_needed_low,trx_flush_log_if_needed,trx_commit_complete_for_mysql,innobase_commit,ha_commit_low,TC_LOG_DUMMY::commit,ha_commit_trans,trans_commit_stmt,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone

帶Binlog的flush pstack如下:

fsync,os_file_fsync,os_file_flush_func,fil_flush,log_write_up_to,innobase_flush_logs,flush_handlerton,plugin_foreach_with_mask,ha_flush_logs,MYSQL_BIN_LOG::process_flush_stage_queue,MYSQL_BIN_LOG::ordered_commit,MYSQL_BIN_LOG::commit,ha_commit_trans,trans_commit_stmt,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone

從上面大致的呼叫來看,log_write_up_to()和log_io_complete()將是重點,因為這些flush在file級別是序列的,commit時的rt主要由這些串化帶來。

接下來我們看一下log_write_up_to的呼叫者都有那些:

1. buf_flush_write_block_low,在force flush中呼叫,保證日誌必須先於資料落地,刷髒頁時,由page_cleaner_do_flush_batch()發起呼叫到此函式

2. innobase_flush_logs,在commit時呼叫,主要由flush binlog分支呼叫

3. trx_flush_log_if_needed_low,在commit時呼叫,主要由flush innodb log file時呼叫

4. log_buffer_flush_to_disk,log buffer刷日誌到disk

5. log_buffer_sync_in_background,後臺執行緒同步log buffer到disk,由srv_master_thread 執行緒調srv_sync_log_buffer_in_background()呼叫到,每秒一次

6. log_flush_margin,主要為騰挪log buffer空間時呼叫

7. log_checkpoint,主要在做checkpoint時呼叫到,由srv_master執行緒呼叫srv_master_do_idle_tasks(),每秒做一次

log_io_complete()函式的呼叫情況:

1. fil_aio_wait,aio wait中如果是log io將會呼叫此方法

從上面分析可以看到,主要影響RT比較嚴重的還是因為刷髒頁導致的log_sys->mutex爭用。另外,log_buffer_sync_in_background和log_checkpoint,這兩個都是由後臺srv_master_thread執行緒每隔一秒呼叫到。

但是這兩個方法不一定會執行fil_flush,所以不是影響的主因。gdb掛上去後,大致會走到fil_flush爭用的pstack如下:


Breakpoint 13, fil_flush   (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000c890e5 in log_write_up_to (lsn=,   wait=, flush_to_disk=1, caller=) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1674

#2    0x0000000000c45330 in innobase_flush_logs (hton=,   target_lsn=) at /u01/mysql-5.6/storage/innobase/handler/ha_innodb.cc:4210

#3    0x00000000006a80b9 in flush_handlerton (thd=thd@entry=0x0,   plugin=, arg=arg@entry=0x7fc6b802f760) at   /u01/mysql-5.6/sql/handler.cc:2385

#4    0x0000000000819312 in plugin_foreach_with_mask (thd=0x0,   func=func@entry=0x6a8050 ,   type=type@entry=1, state_mask=4294967287, state_mask@entry=8,   arg=0x7fc6b802f760)

      at /u01/mysql-5.6/sql/sql_plugin.cc:2168

#5    0x00000000006aba87 in ha_flush_logs (db_type=db_type@entry=0x0, engine_map=) at /u01/mysql-5.6/sql/handler.cc:2395

#6    0x00000000009cd7c4 in MYSQL_BIN_LOG::process_flush_stage_queue   (this=this@entry=0x16c3100 ,   total_bytes_var=total_bytes_var@entry=0x7fc6e40cfac0,   rotate_var=rotate_var@entry=0x7fc6e40cfab0,

      out_queue_var=out_queue_var@entry=0x7fc6e40cfad0,   async=async@entry=false) at /u01/mysql-5.6/sql/binlog.cc:7120

#7    0x00000000009cd967 in MYSQL_BIN_LOG::ordered_commit   (this=this@entry=0x16c3100 , thd=thd@entry=0x2165100, all=all@entry=false,   skip_commit=skip_commit@entry=false, async=async@entry=false)

      at /u01/mysql-5.6/sql/binlog.cc:7587

#8    0x00000000009ce69d in MYSQL_BIN_LOG::commit (this=0x16c3100   , thd=0x2165100, all=,   async=) at /u01/mysql-5.6/sql/binlog.cc:7021

#9    0x00000000006b1889 in ha_commit_trans (thd=thd@entry=0x2165100,   all=all@entry=false, async=,   ignore_global_read_lock=ignore_global_read_lock@entry=false) at   /u01/mysql-5.6/sql/handler.cc:1539

#10 0x0000000000895ecd in   trans_commit_stmt (thd=thd@entry=0x2165100, async=) at   /u01/mysql-5.6/sql/transaction.cc:452

#11 0x0000000000808670 in   mysql_execute_command (thd=thd@entry=0x2165100,   statement_start_time=statement_start_time@entry=0x7fc6e40d1950,   post_parse=post_parse@entry=0x7fc6e40d1b10) at   /u01/mysql-5.6/sql/sql_parse.cc:5922

#12 0x000000000080f6be in mysql_parse   (thd=thd@entry=0x2165100, rawbuf=, length=, parser_state=parser_state@entry=0x7fc6e40d1c40,   last_timer=last_timer@entry=0x7fc6e40d1b10,

      async_commit=async_commit@entry=0x7fc6e40d1af0 "",   db="",   exit_admission_control=exit_admission_control@entry=0x7fc6e40d1b00   "") at /u01/mysql-5.6/sql/sql_parse.cc:7677

#13 0x0000000000811136 in   dispatch_command (command=COM_QUERY, thd=0x2165100, packet=0x2178b71   "update sbtest.sbtest1 set k=k+1 where id=1",   packet_length=) at /u01/mysql-5.6/sql/sql_parse.cc:1691

#14 0x0000000000813003 in do_command   (thd=) at /u01/mysql-5.6/sql/sql_parse.cc:1133

#15 0x00000000007d705f in   do_handle_one_connection (thd_arg=) at   /u01/mysql-5.6/sql/sql_connect.cc:1108

#16 0x00000000007d71d9 in   handle_one_connection (arg=) at   /u01/mysql-5.6/sql/sql_connect.cc:929

#17 0x00007fd950ef9dc5 in start_thread ()   from /lib64/libpthread.so.0

#18 0x00007fd94ef4a28d in clone () from   /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6e5957700 (LWP   39002)]

Breakpoint 15,   log_buffer_sync_in_background (flush=flush@entry=1) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1749

1749      {

(gdb) bt

#0    log_buffer_sync_in_background (flush=flush@entry=1) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1749

#1    0x0000000000d06c33 in srv_sync_log_buffer_in_background () at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:2757

#2    srv_master_do_active_tasks () at /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:2887

#3    srv_master_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:3145

#4    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#5    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

Breakpoint 14, log_checkpoint   (sync=sync@entry=1, write_always=write_always@entry=0) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:2154

2154      {

(gdb) bt

#0    log_checkpoint (sync=sync@entry=1, write_always=write_always@entry=0)   at /u01/mysql-5.6/storage/innobase/log/log0log.cc:2154

#1    0x0000000000d07199 in srv_master_do_active_tasks () at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:2929

#2    srv_master_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:3145

#3    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#4    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6dd7fa700 (LWP   39007)]

Breakpoint 13, fil_flush   (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000c890e5 in log_write_up_to (lsn=,   wait=, flush_to_disk=1, caller=) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1674

#2    0x0000000000d79d26 in buf_flush_write_block_low (sync=false,   flush_type=BUF_FLUSH_LIST, bpage=0x7fd706332330) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:902

#3    buf_flush_page (buf_pool=, bpage=0x7fd706332330,   flush_type=BUF_FLUSH_LIST, sync=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1061

#4    0x0000000000d7a43e in buf_flush_try_neighbors (space=0,   offset=offset@entry=5, flush_type=flush_type@entry=BUF_FLUSH_LIST,   n_flushed=n_flushed@entry=1, n_to_flush=n_to_flush@entry=250)

      at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1271

#5    0x0000000000d7b1b1 in buf_flush_page_and_try_neighbors   (flush_type=BUF_FLUSH_LIST, count=, n_to_flush=250,   bpage=) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1355

#6    buf_do_flush_list_batch (buf_pool=buf_pool@entry=0x1f817d8,   min_n=min_n@entry=250, lsn_limit=lsn_limit@entry=18446744073709551615) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1623

#7    0x0000000000d7b308 in buf_flush_batch (flush_type=BUF_FLUSH_LIST,   lsn_limit=18446744073709551615, min_n=,   buf_pool=0x1f817d8) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1693

#8    buf_flush_list (min_n=,   n_processed=n_processed@entry=0x7fc6dd7f9bd8, lsn_limit=18446744073709551615)   at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1939

#9    0x0000000000d7c79b in page_cleaner_do_flush_batch   (lsn_limit=18446744073709551615, n_to_flush=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2216

#10 buf_flush_page_cleaner_thread   (arg=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2588

#11 0x00007fd950ef9dc5 in start_thread ()   from /lib64/libpthread.so.0

#12 0x00007fd94ef4a28d in clone () from   /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6dcff9700 (LWP   39008)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000d7636c in buf_dblwr_flush_buffered_writes () at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:1135

#2    0x0000000000d7c419 in buf_flush_end (flush_type=BUF_FLUSH_LRU,   buf_pool=0x1f817d8) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1794

#3    buf_flush_LRU_tail () at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2150

#4    0x0000000000d7d446 in buf_flush_lru_manager_thread (arg=) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2712

#5    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#6    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6e995f700 (LWP   38990)]

Breakpoint 13, fil_flush (space_id=104,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=104, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000db8970 in fil_flush_file_spaces   (purpose=purpose@entry=501, from=from@entry=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6668

#2    0x0000000000d75f0b in buf_dblwr_update (bpage=bpage@entry=0x7fd70665d550,   flush_type=) at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:803

#3    0x0000000000d79699 in buf_flush_write_complete   (bpage=bpage@entry=0x7fd70665d550) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:691

#4    0x0000000000d71e16 in buf_page_io_complete (bpage=0x7fd70665d550,   evict=evict@entry=false) at   /u01/mysql-5.6/storage/innobase/buf/buf0buf.cc:6157

#5    0x0000000000db5ac0 in fil_aio_wait (segment=segment@entry=10) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6460

#6    0x0000000000d09ba0 in io_handler_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498

#7    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#8    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6dcff9700 (LWP   39008)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000d7636c in buf_dblwr_flush_buffered_writes () at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:1135

#2    0x0000000000d7c41e in buf_flush_common (page_count=, flush_type=BUF_FLUSH_LRU) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1723

#3    buf_flush_LRU_tail () at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2152

#4    0x0000000000d7d446 in buf_flush_lru_manager_thread (arg=) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2712

#5    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#6    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6e995f700 (LWP   38990)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000db8970 in fil_flush_file_spaces   (purpose=purpose@entry=501, from=from@entry=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6668

#2    0x0000000000d75f0b in buf_dblwr_update (bpage=bpage@entry=0x7fd706332330,   flush_type=) at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:803

#3    0x0000000000d79699 in buf_flush_write_complete   (bpage=bpage@entry=0x7fd706332330) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:691

#4    0x0000000000d71e16 in buf_page_io_complete (bpage=0x7fd706332330,   evict=evict@entry=false) at   /u01/mysql-5.6/storage/innobase/buf/buf0buf.cc:6157

#5    0x0000000000db5ac0 in fil_aio_wait (segment=segment@entry=10) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6460

#6    0x0000000000d09ba0 in io_handler_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498

#7    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#8    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6dd7fa700 (LWP   39007)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000d7636c in buf_dblwr_flush_buffered_writes () at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:1135

#2    0x0000000000d7b384 in buf_flush_end (flush_type=BUF_FLUSH_LIST,   buf_pool=0x1f81e08) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1794

#3    buf_flush_list (min_n=,   n_processed=n_processed@entry=0x7fc6dd7f9bd8, lsn_limit=18446744073709551615)   at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1942

#4    0x0000000000d7c79b in page_cleaner_do_flush_batch   (lsn_limit=18446744073709551615, n_to_flush=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2216

#5    buf_flush_page_cleaner_thread (arg=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2588

#6    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#7    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6e915e700 (LWP   38991)]

Breakpoint 13, fil_flush (space_id=104,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=104, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000db8970 in fil_flush_file_spaces   (purpose=purpose@entry=501, from=from@entry=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6668

#2    0x0000000000d75f0b in buf_dblwr_update   (bpage=bpage@entry=0x7fd4c76da870, flush_type=) at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:803

#3    0x0000000000d79699 in buf_flush_write_complete   (bpage=bpage@entry=0x7fd4c76da870) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:691

#4    0x0000000000d71e16 in buf_page_io_complete (bpage=0x7fd4c76da870,   evict=evict@entry=false) at   /u01/mysql-5.6/storage/innobase/buf/buf0buf.cc:6157

#5    0x0000000000db5ac0 in fil_aio_wait (segment=segment@entry=11) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6460

#6    0x0000000000d09ba0 in io_handler_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498

#7    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#8    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6dd7fa700 (LWP   39007)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000d7636c in buf_dblwr_flush_buffered_writes () at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:1135

#2    0x0000000000d7b384 in buf_flush_end (flush_type=BUF_FLUSH_LIST,   buf_pool=0x1f82a68) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1794

#3    buf_flush_list (min_n=,   n_processed=n_processed@entry=0x7fc6dd7f9bd8, lsn_limit=18446744073709551615)   at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1942

#4    0x0000000000d7c79b in page_cleaner_do_flush_batch   (lsn_limit=18446744073709551615, n_to_flush=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2216

#5    buf_flush_page_cleaner_thread (arg=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2588

#6    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#7    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6e815c700 (LWP   38993)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000db8970 in fil_flush_file_spaces (purpose=purpose@entry=501,   from=from@entry=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6668

#2    0x0000000000d75f0b in buf_dblwr_update   (bpage=bpage@entry=0x7fd049dfe2a0, flush_type=) at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:803

#3    0x0000000000d79699 in buf_flush_write_complete   (bpage=bpage@entry=0x7fd049dfe2a0) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:691

#4    0x0000000000d71e16 in buf_page_io_complete (bpage=0x7fd049dfe2a0,   evict=evict@entry=false) at /u01/mysql-5.6/storage/innobase/buf/buf0buf.cc:6157

#5    0x0000000000db5ac0 in fil_aio_wait (segment=segment@entry=13) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6460

#6    0x0000000000d09ba0 in io_handler_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498

#7    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#8    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6dd7fa700 (LWP   39007)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000d7636c in buf_dblwr_flush_buffered_writes () at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:1135

#2    0x0000000000d7b384 in buf_flush_end (flush_type=BUF_FLUSH_LIST,   buf_pool=0x1f83098) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1794

#3    buf_flush_list (min_n=,   n_processed=n_processed@entry=0x7fc6dd7f9bd8, lsn_limit=18446744073709551615)   at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1942

#4    0x0000000000d7c79b in page_cleaner_do_flush_batch   (lsn_limit=18446744073709551615, n_to_flush=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2216

#5    buf_flush_page_cleaner_thread (arg=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2588

#6    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#7    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6e795b700 (LWP   38994)]

Breakpoint 13, fil_flush (space_id=0,   from=FLUSH_FROM_DOUBLEWRITE) at /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=0, from=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000db8970 in fil_flush_file_spaces   (purpose=purpose@entry=501, from=from@entry=FLUSH_FROM_DOUBLEWRITE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6668

#2    0x0000000000d75f0b in buf_dblwr_update   (bpage=bpage@entry=0x7fce0b251e10, flush_type=) at   /u01/mysql-5.6/storage/innobase/buf/buf0dblwr.cc:803

#3    0x0000000000d79699 in buf_flush_write_complete   (bpage=bpage@entry=0x7fce0b251e10) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:691

#4    0x0000000000d71e16 in buf_page_io_complete (bpage=0x7fce0b251e10,   evict=evict@entry=false) at   /u01/mysql-5.6/storage/innobase/buf/buf0buf.cc:6157

#5    0x0000000000db5ac0 in fil_aio_wait (segment=segment@entry=14) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6460

#6    0x0000000000d09ba0 in io_handler_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498

#7    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#8    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6e5957700 (LWP   39002)]

Breakpoint 15,   log_buffer_sync_in_background (flush=flush@entry=1) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1749

1749      {

(gdb) bt

#0    log_buffer_sync_in_background (flush=flush@entry=1) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1749

#1    0x0000000000d07043 in srv_sync_log_buffer_in_background () at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:2757

#2    srv_master_do_idle_tasks () at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:2992

#3    srv_master_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:3147

#4    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#5    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

Breakpoint 14, log_checkpoint   (sync=sync@entry=1, write_always=write_always@entry=0) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:2154

2154      {

(gdb) bt

#0    log_checkpoint (sync=sync@entry=1, write_always=write_always@entry=0)   at /u01/mysql-5.6/storage/innobase/log/log0log.cc:2154

#1    0x0000000000d06f56 in srv_master_do_idle_tasks () at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:3002

#2    srv_master_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:3147

#3    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#4    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

[Switching to Thread 0x7fc6ee168700 (LWP   38981)]

Breakpoint 13, fil_flush   (space_id=4294967280, from=FLUSH_FROM_LOG_IO_COMPLETE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_IO_COMPLETE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000c86c78 in log_io_complete (group=) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1239

#2    0x0000000000db5a4b in fil_aio_wait (segment=segment@entry=1) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6463

#3    0x0000000000d09ba0 in io_handler_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498

#4    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#5    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.

bt[Switching to Thread 0x7fc6e5957700   (LWP 39002)]

Breakpoint 14, log_checkpoint   (sync=sync@entry=1, write_always=write_always@entry=0) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:2154

2154      {

(gdb) bt

#0    log_checkpoint (sync=sync@entry=1, write_always=write_always@entry=0)   at /u01/mysql-5.6/storage/innobase/log/log0log.cc:2154

#1    0x0000000000d06f56 in srv_master_do_idle_tasks () at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:3002

#2    srv_master_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0srv.cc:3147

#3    0x00007fd950ef9dc5 in start_thread () from /lib64/libpthread.so.0

#4    0x00007fd94ef4a28d in clone () from /lib64/libc.so.6

(gdb) c

Continuing.


大部分都是double write buffer的刷出,但是這個不會與log_sys->mutex爭用,我們這裡不關心。下面這個是刷髒頁導致的log flush,這個會跟commit時的log flush爭用。

Breakpoint 13, fil_flush   (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_WRITE_UP_TO) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000c890e5 in log_write_up_to (lsn=,   wait=, flush_to_disk=1, caller=) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1674

#2    0x0000000000d79d26 in buf_flush_write_block_low (sync=false,   flush_type=BUF_FLUSH_LIST, bpage=0x7fd706332330) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:902

#3    buf_flush_page (buf_pool=, bpage=0x7fd706332330,   flush_type=BUF_FLUSH_LIST, sync=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1061

#4    0x0000000000d7a43e in buf_flush_try_neighbors (space=0,   offset=offset@entry=5, flush_type=flush_type@entry=BUF_FLUSH_LIST,   n_flushed=n_flushed@entry=1, n_to_flush=n_to_flush@entry=250)

      at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1271

#5    0x0000000000d7b1b1 in buf_flush_page_and_try_neighbors   (flush_type=BUF_FLUSH_LIST, count=, n_to_flush=250,   bpage=) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1355

#6    buf_do_flush_list_batch (buf_pool=buf_pool@entry=0x1f817d8,   min_n=min_n@entry=250, lsn_limit=lsn_limit@entry=18446744073709551615) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1623

#7    0x0000000000d7b308 in buf_flush_batch (flush_type=BUF_FLUSH_LIST,   lsn_limit=18446744073709551615, min_n=,   buf_pool=0x1f817d8) at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1693

#8    buf_flush_list (min_n=,   n_processed=n_processed@entry=0x7fc6dd7f9bd8, lsn_limit=18446744073709551615)   at /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:1939

#9    0x0000000000d7c79b in page_cleaner_do_flush_batch   (lsn_limit=18446744073709551615, n_to_flush=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2216

#10 buf_flush_page_cleaner_thread   (arg=) at   /u01/mysql-5.6/storage/innobase/buf/buf0flu.cc:2588

#11 0x00007fd950ef9dc5 in start_thread ()   from /lib64/libpthread.so.0

#12 0x00007fd94ef4a28d in clone () from   /lib64/libc.so.6

Breakpoint 5, fil_flush   (space_id=4294967280, from=FLUSH_FROM_LOG_IO_COMPLETE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

6478      {

(gdb) bt

#0    fil_flush (space_id=4294967280, from=FLUSH_FROM_LOG_IO_COMPLETE) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6478

#1    0x0000000000c86c78 in log_io_complete (group=) at   /u01/mysql-5.6/storage/innobase/log/log0log.cc:1239

#2    0x0000000000db5a4b in fil_aio_wait (segment=segment@entry=1) at   /u01/mysql-5.6/storage/innobase/fil/fil0fil.cc:6463

#3    0x0000000000d09ba0 in io_handler_thread (arg=) at   /u01/mysql-5.6/storage/innobase/srv/srv0start.cc:498

#4    0x00007fb818efddc5 in start_thread () from /lib64/libpthread.so.0

#5    0x00007fb816f4e28d in clone () from /lib64/libc.so.6

1: *node = {space = 0x30aa218, name =   0x30aa948 "/mnt/mysql-redo/my3308/data/ib_logfile0", open = 1,   handle = 10, sync_event = 0x30aa980, is_raw_disk = 0, size = 262144,   n_pending = 0, n_pending_flushes = 0,

    being_extended = 0, modification_counter = 49, flush_counter = 41,   flush_size = 262144, chain = {prev = 0x0, next = 0x30aaa78}, LRU = {prev =   0x0, next = 0x0}, magic_n = 89389}


因此,總結起來,應該是commit做的兩次fsync加上一次page cleaner做的log_write_up_to()。另外,還有一個fil_aio_wait完成時,如果是log io,就會做一次log_io_complete()。

這四次fsync都會對使用者的rt有影響,commit的兩次無可避免,後面兩次最多也就是調整頻率。另外是否可以改變fsync的方式?這個讀者可以思考。

比oracle實現差,oracle不會作強制的,我記得是給一個標記,策略還是按redo自己的策略來做。oracle的實現應該是考慮到這一點了.

附:facebook的最佳化,http://yoshinorimatsunobu.blogspot.com/2009/05/overwriting-is-much-faster-than_28.html

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/30088583/viewspace-2136828/,如需轉載,請註明出處,否則將追究法律責任。

相關文章