MySQL:Innodb恢復的學習筆記
本文只是記錄無可讀性,供自己參考
MySQL · 引擎特性 · InnoDB 崩潰恢復過程
enum { SRV_FORCE_IGNORE_CORRUPT = 1, /*!< let the server run even if it detects a corrupt page */ SRV_FORCE_NO_BACKGROUND = 2, /*!< prevent the main thread from running: if a crash would occur in purge, this prevents it */ SRV_FORCE_NO_TRX_UNDO = 3, /*!< do not run trx rollback after recovery */ SRV_FORCE_NO_IBUF_MERGE = 4, /*!< prevent also ibuf operations: if they would cause a crash, better not do them */ SRV_FORCE_NO_UNDO_LOG_SCAN = 5, /*!< do not look at undo logs when starting the database: InnoDB will treat even incomplete transactions as committed */ SRV_FORCE_NO_LOG_REDO = 6 /*!< do not do the log roll-forward in connection with recovery */};
innodb中的 3個lsn
innodb的lsn和oracle的scn一樣,是一個重要的概念。比如
-
在flush list中正是是使用low lsn作為連結串列的條件
參考buf_page_t中的lsn_t oldest_modification; -
在checkpoint中記錄的也是lsn
參考宏
#define LOG_CHECKPOINT_NO 0#define LOG_CHECKPOINT_LSN 8#define LOG_CHECKPOINT_OFFSET 16#define LOG_CHECKPOINT_LOG_BUF_SIZE 24
-
在物理檔案中每個塊最後的重新整理lsn
參考宏FIL_PAGE_LSN -
在寫日誌落盤的時候也是以lsn為標準的
參考函式log_write_up_to
實際上lsn就是表示的日誌量的位元組數,是一個累加的值,在5.7中表現為:
/* Type used for all log sequence number storage and arithmetics */typedef ib_uint64_t lsn_t;
及一個8位元組非負的整數。最大值及2的64次方。有了這種物理上概念,lsn很容易換算為當前日誌的偏移量。
下面描述一下和檢查點相關的幾個lsn
-
ibdata第一個塊FIL中的lsn(flush lsn):ibdata的26後面8位元組是在innodb 乾淨關閉的時候進行更新的,如果不正常關閉不會進行寫入(FIL_PAGE_FILE_FLUSH_LSN)
-
redolog中MLOG_CHECKPOINT的lsn: redolog MLOG_CHECKPOINT lsn的寫入是在每次checkpoint的時候同步寫入的.乾淨關閉會更新。
-
redolog header中的lsn:是在每次checkpoint的時候非同步寫入的在MLOG_CHECKPOINT寫入之後.乾淨關閉會更新。
我們表示為lsn1/lsn2/lsn3
正常關閉3個lsn是相等的,如果非正常關閉innodb,lsn1不會更新,因此lsn3必然不和lsn1相等,則判定需要進行carsh recovery。
if (checkpoint_lsn != flush_lsn) { ...... if (!recv_needed_recovery) { ib::info() << "The log sequence number " << flush_lsn << " in the system tablespace does not match" " the log sequence number " << checkpoint_lsn << " in the ib_logfiles!"; //出現這個警告說明需要恢復了,因為沒有乾淨的關閉資料庫,那麼flush_lsn一定比checkpoint_lsn小 if (srv_read_only_mode) { ib::error() << "Can't initiate database" " recovery, running in read-only-mode."; log_mutex_exit(); return(DB_READ_ONLY); } recv_init_crash_recovery(); //初始化 } }
MLOG_CHECKPOINT
-
log_checkpoint 函式由master執行緒呼叫,以及關閉資料庫的時候呼叫,不斷的向redo log中寫入MLOG_CHECKPOINT和MLOG_FILE_NAME
log_group_checkpoint 寫入checkpoint資訊到log header
#0 mtr_t::commit_checkpoint (this=0x7fff761fb830, checkpoint_lsn=697558445) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/mtr/mtr0mtr.cc:592#1 0x0000000001ceb4b9 in fil_names_clear (lsn=697558445, do_write=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/fil/fil0fil.cc:7067#2 0x0000000001a521cc in log_checkpoint (sync=true, write_always=false) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1927#3 0x0000000001b856f2 in srv_master_do_idle_tasks () at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0srv.cc:2596#4 0x0000000001b85b6b in srv_master_thread (arg=0x0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0srv.cc:2744#5 0x0000003f74807aa1 in start_thread () from /lib64/libpthread.so.0#6 0x0000003f740e8bcd in clone () from /lib64/libc.so.6
正常innodb執行的情況下 checkpoint是由master執行緒觸發。我們知道髒資料透過page clean執行緒和lru manager執行緒是在不斷寫盤的,那麼在進行異常重啟的的時候我們必須要知道一個恢復的起點,但是這個起點是不能記錄在記憶體中必要固化到磁碟,恢復的時候讀取這個點以後的redo進行恢復,而checkpoint就是完成這個事情下面是checkpoint的執行流程。
正常情況下master會每秒進行檢查點其作用有:(參考log_checkpoint函式)
1、檢查是否有自上次檢查點以來的髒資料寫盤了
2、如果有則在redo裡面會為每個修改過的檔案寫入MLOG_FILE_NAME,完成後寫入一個總的MLOG_CHECKPOINT(參考fil_names_clear函式)
MLOG_FILE_NAME主要記錄至上次檢查點以來更改過的資料檔案
MLOG_CHECKPOINT主要記錄檢查點的lsn
3、如果有則在redo header中寫入相應的檢查點資訊包含(非同步寫)(參考log_group_checkpoint函式)
Log sequence number 697794162
Log flushed up to 697794162
Pages flushed up to 697794162
Last checkpoint at 697794153
697794162-697794153 = 9 剛好是MLOG_CHECKPOINT的長度
oldest_lsn <= log_sys->last_checkpoint_lsn + SIZE_OF_MLOG_CHECKPOINT
SIZE_OF_MLOG_CHECKPOINT=9
redo的寫入有6個途徑:
1、master 執行緒每秒呼叫 棧幀(可能是idle可能是active 和檢測是否需要插入快取合併有關)
2、master 執行緒每秒checkpoint呼叫 (可能是idle可能是active 和檢測是否需要插入快取合併有關)
3、page clean 執行緒呼叫 棧幀 /* Force the log to the disk before writing the modified block */
4、主執行緒commit 呼叫 棧幀
5、innodb shutdown
6、redo buffer不足
1、master執行緒呼叫 棧幀(可能是idle可能是active 和檢測是否需要插入快取合併有關)
#0 log_group_write_buf (group=0x33f29f8, buf=0x7fffa5b38000 "\200\024", len=512, pad_len=0, start_lsn=697764864, new_data_offset=166) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1145#1 0x0000000001a50f95 in log_write_up_to (lsn=697765068, flush_to_disk=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1493#2 0x0000000001a51163 in log_buffer_sync_in_background (flush=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1553#3 0x0000000001b84bd1 in srv_sync_log_buffer_in_background () at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0srv.cc:2312#4 0x0000000001b85666 in srv_master_do_idle_tasks () at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0srv.cc:2586#5 0x0000000001b85b6b in srv_master_thread (arg=0x0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0srv.cc:2744
2、master 執行緒checkpoint呼叫 (可能是idle可能是active 和檢測是否需要插入快取合併有關)
#0 log_group_write_buf (group=0x33f29f8, buf=0x7fffa5a38000 "\200\024\002", len=1024, pad_len=0, start_lsn=697789952, new_data_offset=139) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1145#1 0x0000000001a50f95 in log_write_up_to (lsn=697790725, flush_to_disk=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1493#2 0x0000000001a52247 in log_checkpoint (sync=true, write_always=false) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1934#3 0x0000000001b856f2 in srv_master_do_idle_tasks () at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0srv.cc:2596#4 0x0000000001b85b6b in srv_master_thread (arg=0x0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0srv.cc:2744
3、page clean 執行緒呼叫 棧幀 /* Force the log to the disk before writing the modified block */
#0 log_group_write_buf (group=0x33f29f8, buf=0x7fffa5a38000 "\200\024\002", len=13312, pad_len=1024, start_lsn=697778176, new_data_offset=468) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1145#1 0x0000000001a50f95 in log_write_up_to (lsn=697790015, flush_to_disk=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1493#2 0x0000000001c704c7 in buf_flush_write_block_low (bpage=0x7fffc0cae940, flush_type=BUF_FLUSH_LIST, sync=false) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:1035#3 0x0000000001c70cea in buf_flush_page (buf_pool=0x33247d8, bpage=0x7fffc0cae940, flush_type=BUF_FLUSH_LIST, sync=false) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:1237#4 0x0000000001c717f4 in buf_flush_try_neighbors (page_id=..., flush_type=BUF_FLUSH_LIST, n_flushed=0, n_to_flush=25) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:1466#5 0x0000000001c71b57 in buf_flush_page_and_try_neighbors (bpage=0x7fffc0cae940, flush_type=BUF_FLUSH_LIST, n_to_flush=25, count=0x7fffa02867c0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:1558#6 0x0000000001c72862 in buf_do_flush_list_batch (buf_pool=0x33247d8, min_n=25, lsn_limit=18446744073709551615) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:1846#7 0x0000000001c72cb6 in buf_flush_batch (buf_pool=0x33247d8, flush_type=BUF_FLUSH_LIST, min_n=25, lsn_limit=18446744073709551615) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:1926#8 0x0000000001c73104 in buf_flush_do_batch (buf_pool=0x33247d8, type=BUF_FLUSH_LIST, min_n=25, lsn_limit=18446744073709551615, n_processed=0x7fffa0286938) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:2071#9 0x0000000001c734ee in buf_flush_lists (min_n=25, lsn_limit=18446744073709551615, n_processed=0x7fffa02869c8) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:2202#10 0x0000000001c76a97 in buf_flush_page_cleaner_coordinator (arg=0x0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/buf/buf0flu.cc:3362
4、主執行緒commit 呼叫 棧幀
#0 log_group_write_buf (group=0x33f29f8, buf=0x7fffa5a38000 "\200\024\002", len=2560, pad_len=0, start_lsn=697762816, new_data_offset=230) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1145#1 0x0000000001a50f95 in log_write_up_to (lsn=697765030, flush_to_disk=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1493#2 0x0000000001a51087 in log_buffer_flush_to_disk (sync=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:1524#3 0x00000000019a9157 in innobase_flush_logs (hton=0x2e9fdd0, binlog_group_flush=true) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:4407#4 0x0000000000f65893 in flush_handlerton (thd=0x0, plugin=0x7ffff03588e8, arg=0x7ffff0358944) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:2606#5 0x00000000015d7716 in plugin_foreach_with_mask (thd=0x0, func=0xf65835 <flush_handlerton(THD*, plugin_ref, void*)>, type=1, state_mask=4294967287, arg=0x7ffff0358944) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_plugin.cc:2318#6 0x0000000000f658ef in ha_flush_logs (db_type=0x0, binlog_group_flush=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:2617#7 0x000000000185733d in MYSQL_BIN_LOG::process_flush_stage_queue (this=0x2e02c80, total_bytes_var=0x7ffff0358a88, rotate_var=0x7ffff0358a87, out_queue_var=0x7ffff0358a78) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/binlog.cc:8541#8 0x000000000185899f in MYSQL_BIN_LOG::ordered_commit (this=0x2e02c80, thd=0x7fff2c000b70, all=false, skip_commit=false) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/binlog.cc:9189#9 0x000000000185700c in MYSQL_BIN_LOG::commit (this=0x2e02c80, thd=0x7fff2c000b70, all=false) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/binlog.cc:8440#10 0x0000000000f63df8 in ha_commit_trans (thd=0x7fff2c000b70, all=false, ignore_global_read_lock=false)
入口函式
innobase_start_or_create_for_mysql
recv_recovery_from_checkpoint_start
1、此函式需要輸入flush的lsn及髒資料寫入到的位置
這個資料從FIL_PAGE_FILE_FLUSH_LSN中讀取,這個值
只在ibdata的第一個page有效其他均為0,他的讀取來自於
函式Datafile::validate_first_page,
其中
*flush_lsn = mach_read_from_8(
m_first_page + FIL_PAGE_FILE_FLUSH_LSN);
其寫入由函式 正常shutdown才會寫入,非正常關閉不會寫入,正常執行檢查點也不會寫入 fil_write_flushed_lsn 寫入 mach_write_to_8(buf + FIL_PAGE_FILE_FLUSH_LSN, lsn); 棧幀
#0 fil_write_flushed_lsn (lsn=696973727) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/fil/fil0fil.cc:1946 #1 0x0000000001a538a7 in logs_empty_and_mark_files_at_shutdown () at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/log/log0log.cc:2464 #2 0x0000000001b915a9 in innobase_shutdown_for_mysql () at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/srv/srv0start.cc:2803 #3 0x00000000019a8ffc in innobase_end (hton=0x2e9edd0, type=HA_PANIC_CLOSE) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:4360 #4 0x0000000000f62621 in ha_finalize_handlerton (plugin=0x3015cd0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:813 #5 0x00000000015d3d25 in plugin_deinitialize (plugin=0x3015cd0, ref_check=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_plugin.cc:995 #6 0x00000000015d410e in reap_plugins () at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_plugin.cc:1077 #7 0x00000000015d6073 in plugin_shutdown () at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_plugin.cc:1845 #8 0x0000000000ebf7eb in clean_up (print_message=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/mysqld.cc:1336 #9 0x0000000000ec701b in mysqld_main (argc=98, argv=0x2e9cf08) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/mysqld.cc:5386 #10 0x0000000000ebd604 in main (argc=10, argv=0x7fffffffe458) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/main.cc:25
2、此函式首先建立紅黑樹用於恢復並且做 force recovery判斷
(srv_force_recovery >= SRV_FORCE_NO_LOG_REDO)
然後呼叫
recv_find_max_checkpoint 此函式就是找到最大的checkpoint lsn、redo檔案,以及field 就是相對於一個日誌而講所在的offset set (512
1 or 512
3)_
讀取如下:
group->lsn = mach_read_from_8( buf + LOG_CHECKPOINT_LSN); group->lsn_offset = mach_read_from_8( buf + LOG_CHECKPOINT_OFFSET); checkpoint_no = mach_read_from_8( buf + LOG_CHECKPOINT_NO);
3、recv_group_scan_log_recs
迴圈讀取64K(RECV_SCAN_SIZE)日誌到redo buffer(log_group_read_log_seg)
recv_scan_log_recs
本函式將64K 的redo 透過每個512 bytes block大小迴圈加入到(recv_sys_add_to_parsing_buf) 掃描到parse buffer
程式碼片段複製
ut_memcpy(recv_sys->buf + recv_sys->len, log_block + start_offset, end_offset - start_offset); 當每次掃描了64K*80的日誌量的時候會輸出,級5M ib::info() << "Doing recovery: scanned up to" " log sequence number " << scanned_lsn;
並且進行分析對這加入到 parse buffer的日誌進行分析(recv_parse_log_recs) /其中parse buffer 為2M(宏RECV_PARSING_BUF_SIZE大小) recv_parse_log_recs函式對每次加入 parse buffer的64k進行分析主要分析是MLOG_SINGLE_REC_FLAG還是MLOG_MULTI_REC_END 同時確認MLOG_MULTI_REC_END 是否完整,完成後加入到hash table,函式是呼叫recv_parse_log_rec進行 (type spaceid page_no data)的剝離。
那麼recv_group_scan_log_recs為第一層迴圈,迴圈掃描64K日誌到redo buffer
然後呼叫recv_scan_log_recs以block單位(512b)大小迴圈加入到parse buffer,這是透過函式recv_sys_add_to_parsing_buf完成
等到64k加入到parse buffer後呼叫recv_parse_log_recs函式對每次加入 parse buffer的64k進行分析,分析是迴圈以record為單位的,主要分析MLOG_SINGLE_REC_FLAG還是MLOG_MULTI_REC_END,同時確保MLOG_MULTI_REC_END記錄是完整的。完成後recv_parse_log_recs函式 還要將其加入到hash table函式為recv_add_to_hash_table,本函式是透過recv_parse_log_rec進行 (type spaceid page_no data)的剝離。
recv_apply_hashed_log_recs 最後完成分析完成後的日誌應用,從hash table中
作者微信:
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/7728585/viewspace-2212904/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- mysql學習筆記之備份與恢復MySql筆記
- Mysql 企業級備份與恢復(學習筆記七)MySql筆記
- 《MySQL實戰45講》學習筆記4——MySQL中InnoDB的索引MySql筆記索引
- MySQL密碼恢復筆記MySql密碼筆記
- 【MySQL】Innodb 恢復工具介紹MySql
- innodb學習筆記(一) aio的使用筆記AI
- 恢復MySQL InnoDB表結構的方法MySql
- rman資料備份恢復學習筆記筆記
- 【MySQL】二、Innodb 恢復工具介紹MySql
- MySQL 引擎特性:InnoDB崩潰恢復MySql
- INNOBACKUPEX的全備和增量備份恢復學習筆記筆記
- Mysql技術內幕InnoDB儲存引擎讀書筆記--《八》備份與恢復MySql儲存引擎筆記
- MySql學習筆記MySql筆記
- 【記錄】MySQL 學習筆記MySql筆記
- [記錄] MySQL 學習筆記MySql筆記
- MySQL學習筆記2MySql筆記
- MySQL學習筆記:鎖MySql筆記
- MySql學習筆記06MySql筆記
- 學習Mysql筆記(一)MySql筆記
- mysql學習筆記-1MySql筆記
- mysql學習筆記-2MySql筆記
- MySQL Events學習筆記MySql筆記
- mysql學習筆記3MySql筆記
- 學習筆記:InnoDB表和索引結構(二)筆記索引
- 學習筆記:InnoDB表和索引結構(一)筆記索引
- MySQL異常恢復之無主鍵情況下innodb資料恢復的方法MySql資料恢復
- mysql8.0學習筆記MySql筆記
- MySQL學習筆記之一MySql筆記
- MySQL學習筆記:索引失效MySql筆記索引
- MySql 學習筆記二:索引MySql筆記索引
- MySQL學習筆記--基本操作MySql筆記
- mysql優化學習筆記MySql優化筆記
- MySql 學習筆記四:表的拆分MySql筆記
- 學習筆記:MySQL Cluster 的binlog筆記MySql
- numpy的學習筆記\pandas學習筆記筆記
- MySQL學習筆記【基礎篇】MySql筆記
- 一千行 MySQL 學習筆記MySql筆記
- Mysql學習筆記(安裝篇)MySql筆記