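InnoDB undo: initialization of the undo physical structures
My knowledge is limited; if anything here is wrong, please point it out.
For a long time I never studied InnoDB undo properly. I recently had some spare time, so this article summarizes what I learned from the Alibaba kernel monthly report and from reading the code myself. Environment used in this article:
- Code version: percona 5.7.22
- Parameter innodb_undo_tablespaces = 4, i.e. 4 undo tablespaces are used
- Parameter innodb_rollback_segments = 128
Everything below assumes these settings.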
I. Creating the physical files of the undo tablespaces
This step is performed by the function srv_undo_tablespaces_init. The stack trace is:
    #0 srv_undo_tablespaces_init (create_new_db=true, n_conf_tablespaces=4, n_opened=0x2ef55b0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/srv/srv0start.cc:824
    #1 0x0000000001bbd7e0 in innobase_start_or_create_for_mysql () at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/srv/srv0start.cc:2188
    #2 0x00000000019ca74e in innobase_init (p=0x2f2a420) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:4409
    #3 0x0000000000f7ec2a in ha_initialize_handlerton (plugin=0x2fca110) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.cc:871
    #4 0x00000000015f9edf in plugin_initialize (plugin=0x2fca110) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/sql_plugin.cc:1252
This step consists of the following sub-steps:
- Based on the innodb_undo_tablespaces setting, srv_undo_tablespace_create is called once per tablespace to create the files. Each file is created with a default size of 10MB:
    /** Default undo tablespace size in UNIV_PAGEs count (10MB). */
    const ulint SRV_UNDO_TABLESPACE_SIZE_IN_PAGES =
        ((1024 * 1024) * 10) / UNIV_PAGE_SIZE_DEF;
    ...
    /* n_conf_tablespaces is the configured value of innodb_undo_tablespaces */
    for (i = 0; create_new_db && i < n_conf_tablespaces; ++i)
        ...
        err = srv_undo_tablespace_create(
            name, SRV_UNDO_TABLESPACE_SIZE_IN_PAGES); /* create the undo file */
        ...
The source carries the following comment for this step:
    /* Create the undo spaces only if we are creating a new instance.
    We don't allow creating of new undo tablespaces in an existing
    instance (yet). This restriction exists because we check in several
    places for SYSTEM tablespaces to be less than the min of user
    defined tablespace ids. Once we implement saving the location of
    the undo tablespaces and their space ids this restriction
    will/should be lifted. */
Put simply, undo tablespaces can only be created when the instance is initialized, because their space ids are fixed at that point.
- srv_undo_tablespace_open is called for each of the 4 undo tablespaces. It mainly calls fil_space_create and fil_node_create to register the newly created undo tablespace with InnoDB's file layer.
    for (i = 0; i < n_undo_tablespaces; ++i) {
        ....
        err = srv_undo_tablespace_open(name, undo_tablespace_ids[i]); /* open the undo file and create the file node */
        ...
    }
- The fsp header of each of the 4 undo tablespaces is initialized:
    for (i = 0; i < n_undo_tablespaces; ++i) {
        fsp_header_init(    /* initialize the fsp header; the space id is written here */
            undo_tablespace_ids[i],
            SRV_UNDO_TABLESPACE_SIZE_IN_PAGES, &mtr);   /* default undo size, 10MB */
    }
Part of fsp_header_init looks like this:
    mlog_write_ulint(header + FSP_SPACE_ID, space_id, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_NOT_USED, 0, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_SIZE, size, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_FREE_LIMIT, 0, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_SPACE_FLAGS, space->flags, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_FRAG_N_USED, 0, MLOG_4BYTES, mtr);
    flst_init(header + FSP_FREE, mtr);
    flst_init(header + FSP_FREE_FRAG, mtr);
    flst_init(header + FSP_FULL_FRAG, mtr);
    flst_init(header + FSP_SEG_INODES_FULL, mtr);
    flst_init(header + FSP_SEG_INODES_FREE, mtr);
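All of these are fsp (file space header) fields.
After this step, 4 undo tablespace files of 10MB each exist and are registered with InnoDB's file layer, but they do not contain any undo content yet.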
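To make the 10MB figure concrete, here is a minimal sketch of the page arithmetic. It assumes the default 16 KiB page size for UNIV_PAGE_SIZE_DEF; the names kPageSizeDef, kUndoTablespaceSizeInPages and undo_files_total_bytes are illustrative and do not appear in the InnoDB source.

```cpp
#include <cstddef>

// Assumption: UNIV_PAGE_SIZE_DEF is the default 16 KiB page size.
constexpr std::size_t kPageSizeDef = 16 * 1024;

// Same expression as SRV_UNDO_TABLESPACE_SIZE_IN_PAGES quoted above.
constexpr std::size_t kUndoTablespaceSizeInPages =
    ((1024 * 1024) * 10) / kPageSizeDef;

// Hypothetical helper: total bytes occupied by n freshly created undo files.
constexpr std::size_t undo_files_total_bytes(std::size_t n_undo_tablespaces) {
    return n_undo_tablespaces * kUndoTablespaceSizeInPages * kPageSizeDef;
}

// 10 MiB / 16 KiB = 640 pages per undo tablespace file.
static_assert(kUndoTablespaceSizeInPages == 640, "unexpected page count");
```

Under these assumptions each undo file starts out with 640 pages, and the 4 files used in this article take 40MB in total.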
II. Initializing the system segment header in ibdata
This step is performed by trx_sys_create_sys_pages -> trx_sysf_create. Besides creating the transaction system segment, it also initializes the segment's header (page no 5 of ibdata), as follows:
    /* Create the trx sys file block in a new allocated file segment */
    block = fseg_create(TRX_SYS_SPACE, 0, TRX_SYS + TRX_SYS_FSEG_HEADER,
                        mtr); /* create the segment */
    buf_block_dbg_add_level(block, SYNC_TRX_SYS_HEADER);

    ut_a(block->page.id.page_no() == TRX_SYS_PAGE_NO);

    page = buf_block_get_frame(block); /* get the in-memory page frame */

    mlog_write_ulint(page + FIL_PAGE_TYPE, FIL_PAGE_TYPE_TRX_SYS, /* write the page type */
                     MLOG_2BYTES, mtr);
    ...
    /* Start counting transaction ids from number 1 up */
    mach_write_to_8(sys_header + TRX_SYS_TRX_ID_STORE, 1); /* initialize TRX_SYS_TRX_ID_STORE */

    /* Reset the rollback segment slots. Old versions of InnoDB
    define TRX_SYS_N_RSEGS as 256 (TRX_SYS_OLD_N_RSEGS) and expect
    that the whole array is initialized. */
    ptr = TRX_SYS_RSEGS + sys_header;
    len = ut_max(TRX_SYS_OLD_N_RSEGS, TRX_SYS_N_RSEGS)
          * TRX_SYS_RSEG_SLOT_SIZE; /* TRX_SYS_OLD_N_RSEGS is 256 */
    memset(ptr, 0xff, len); /* fill every slot with 0xff */
    ptr += len;
    ut_a(ptr <= page + (UNIV_PAGE_SIZE - FIL_PAGE_DATA_END));

    /* Initialize all of the page. This part used to be uninitialized. */
    memset(ptr, 0, UNIV_PAGE_SIZE - FIL_PAGE_DATA_END + page - ptr); /* zero the rest of the page */

    mlog_log_string(sys_header, UNIV_PAGE_SIZE - FIL_PAGE_DATA_END
                    + page - sys_header, mtr);

    /* Create the first rollback segment in the SYSTEM tablespace */
    slot_no = trx_sysf_rseg_find_free(mtr, false, 0);
    page_no = trx_rseg_header_create(TRX_SYS_SPACE, univ_page_size,
                                     ULINT_MAX, slot_no, mtr); /* the first slot always lives in ibdata */
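After this step, block 5 of ibdata is fully initialized, and all rollback segment slots have been initialized as well (the source initializes 256 of them, but at most 128 are actually used, and slot 0 always lives in ibdata). Note that each slot is TRX_SYS_RSEG_SLOT_SIZE = 8 bytes: a 4-byte space id followed by a 4-byte page no, which together point to the location of a rollback segment header.
- The layout of the system segment header is defined as follows: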
    /** Transaction system header */
    /*------------------------------------------------------------- @{ */
    #define TRX_SYS_TRX_ID_STORE 0  /*!< the maximum trx id or trx
                                    number modulo
                                    TRX_SYS_TRX_ID_UPDATE_MARGIN
                                    written to a file page by any
                                    transaction; the assignment of
                                    transaction ids continues from
                                    this number rounded up by
                                    TRX_SYS_TRX_ID_UPDATE_MARGIN
                                    plus
                                    TRX_SYS_TRX_ID_UPDATE_MARGIN
                                    when the database is
                                    started */
                                    /* i.e. the max transaction id; on the next startup
                                    the instance continues from this value plus
                                    TRX_SYS_TRX_ID_UPDATE_MARGIN */
    #define TRX_SYS_FSEG_HEADER 8   /*!< segment header for the
                                    tablespace segment the trx
                                    system is created into */
    #define TRX_SYS_RSEGS (8 + FSEG_HEADER_SIZE)
                                    /*!< the start of the array of
                                    rollback segment specification
                                    slots */
                                    /* i.e. the slots pointing to the rollback segment headers */
    /*------------------------------------------------------------- @} */
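III. Initializing the rollback segment headers
This is done by trx_sys_create_rsegs:
- A note on the innodb_undo_logs and innodb_rollback_segments parameters: both set the number of rollback segments. This article uses 128 as the example.
According to the comments and the code, innodb_undo_logs is deprecated and innodb_rollback_segments should be used instead.
Both default to TRX_SYS_N_RSEGS, i.e. 128, so they normally do not need to be set. This article also uses 128.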
Parameter innodb_rollback_segments
    static MYSQL_SYSVAR_ULONG(rollback_segments, srv_rollback_segments,
      PLUGIN_VAR_OPCMDARG,
      "Number of rollback segments to use for storing undo logs.",
      NULL, NULL,
      TRX_SYS_N_RSEGS,      /* Default setting */
      1,                    /* Minimum value */
      TRX_SYS_N_RSEGS, 0);  /* Maximum value */
Parameter innodb_undo_logs
    static MYSQL_SYSVAR_ULONG(undo_logs, srv_undo_logs,
      PLUGIN_VAR_OPCMDARG,
      "Number of rollback segments to use for storing undo logs. (deprecated)",
      NULL, innodb_undo_logs_update,
      TRX_SYS_N_RSEGS,      /* Default setting */
      1,                    /* Minimum value */
      TRX_SYS_N_RSEGS, 0);  /* Maximum value */
TRX_SYS_N_RSEGS is 128.
The relevant comment and code:
    /* Deprecate innodb_undo_logs. But still use it if it is set to
    non-default and innodb_rollback_segments is default. */
    if (srv_undo_logs < TRX_SYS_N_RSEGS) {
        ib::warn() << deprecated_undo_logs;
        if (srv_rollback_segments == TRX_SYS_N_RSEGS) {
            srv_rollback_segments = srv_undo_logs;
        }
    }
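The fallback between the two parameters can be restated as a small pure function. This is only a sketch of the logic quoted above; effective_rollback_segments and kTrxSysNRsegs are illustrative names, and the warning output is left out.

```cpp
// TRX_SYS_N_RSEGS as stated in the text.
constexpr unsigned long kTrxSysNRsegs = 128;

// Hypothetical helper: returns the value srv_rollback_segments ends up with.
// The deprecated innodb_undo_logs only wins when it is non-default and
// innodb_rollback_segments was left at its default.
inline unsigned long effective_rollback_segments(unsigned long undo_logs,
                                                 unsigned long rollback_segments) {
    if (undo_logs < kTrxSysNRsegs && rollback_segments == kTrxSysNRsegs) {
        return undo_logs;
    }
    return rollback_segments;
}
```

For example, with innodb_undo_logs = 64 and innodb_rollback_segments left at 128 the result is 64, whereas an explicit innodb_rollback_segments = 100 is kept regardless of innodb_undo_logs.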
- Initializing the rollback segments
    n_noredo_created = trx_sys_create_noredo_rsegs(n_tmp_rsegs); /* create the 32 temporary rollback segments */
Temporary rollback segments are not covered in this article.
- Creating the 95 regular rollback segments (slots 33-127)
    ulint new_rsegs = n_rsegs - n_used; /* e.g. 128 - 33 = 95 */
    for (i = 0; i < new_rsegs; ++i) { /* initialize each rollback segment */
        ulint space_id;
        space_id = (n_spaces == 0) ? 0
            : (srv_undo_space_id_start + i % n_spaces); /* pick the undo space id by modulo, cycling through 1 2 3 4 */
        ut_ad(n_spaces == 0 || srv_is_undo_tablespace(space_id));
        if (trx_rseg_create(space_id, 0) != NULL)
        ...
Note the i % n_spaces modulo here, where n_spaces is the value of innodb_undo_tablespaces. As a result, the rollback segments are distributed round-robin across the 4 undo tablespaces.
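A short sketch of what the modulo assignment means in numbers. It only reproduces the i % n_spaces arithmetic from the loop above; rsegs_per_undo_space is a hypothetical helper, and index k of the result stands for the undo tablespace with id srv_undo_space_id_start + k.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count how many of `new_rsegs` rollback segments land
// in each of `n_spaces` undo tablespaces under round-robin assignment.
// With n_spaces == 0 everything goes to the system tablespace, so the
// result is empty.
inline std::vector<std::size_t> rsegs_per_undo_space(std::size_t new_rsegs,
                                                     std::size_t n_spaces) {
    std::vector<std::size_t> counts(n_spaces, 0);
    if (n_spaces == 0) {
        return counts;
    }
    for (std::size_t i = 0; i < new_rsegs; ++i) {
        ++counts[i % n_spaces];  // same expression as in the InnoDB loop
    }
    return counts;
}
```

For the settings in this article, rsegs_per_undo_space(95, 4) gives 24, 24, 24 and 23, so the first undo tablespace should hold 24 rollback segment headers.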
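- The rollback segment header initialization in detail
This is done by trx_rseg_create calling trx_rseg_header_create. The steps are roughly as follows:
1. Create the rollback segment
    block = fseg_create(space, 0, TRX_RSEG + TRX_RSEG_FSEG_HEADER, mtr); /* create a rollback segment; returns the block holding the segment header */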
2. Initialize TRX_RSEG_MAX_SIZE and TRX_RSEG_HISTORY_SIZE
    /* Initialize max size field */
    mlog_write_ulint(rsegf + TRX_RSEG_MAX_SIZE, max_size,
                     MLOG_4BYTES, mtr);

    /* Initialize the history list */
    mlog_write_ulint(rsegf + TRX_RSEG_HISTORY_SIZE, 0, MLOG_4BYTES, mtr);
    flst_init(rsegf + TRX_RSEG_HISTORY, mtr);
3. Initialize the page no of each undo segment header
    for (i = 0; i < TRX_RSEG_N_SLOTS; i++) { /* TRX_RSEG_N_SLOTS is 1024; each slot is a 4-byte page no pointing to an undo segment header */
        trx_rsegf_set_nth_undo(rsegf, i, FIL_NULL, mtr);
    }
Right after initialization every slot holds FIL_NULL, which means no undo segment has actually been allocated yet.
4. Once the rollback segment is fully initialized, write its space id and page no back into the transaction system segment header.
    sys_header = trx_sysf_get(mtr); /* pointer into block 5, skipping FIL_PAGE_DATA (38U) */
    trx_sysf_rseg_set_space(sys_header, rseg_slot_no, space, mtr); /* set the space id */
    trx_sysf_rseg_set_page_no(sys_header, rseg_slot_no, page_no, mtr); /* set the page no */
- The structure of the rollback segment header:
    /* Transaction rollback segment header */
    /*-------------------------------------------------------------*/
    #define TRX_RSEG_MAX_SIZE 0      /* Maximum allowed size for rollback
                                     segment in pages */
    #define TRX_RSEG_HISTORY_SIZE 4  /* Number of file pages occupied
                                     by the logs in the history list */
                                     /* i.e. the size of the history list */
    #define TRX_RSEG_HISTORY 8       /* The update undo logs for committed
                                     transactions */
                                     /* list base node; usually modified through the
                                     functions in include/fut0lst.ic */
    #define TRX_RSEG_FSEG_HEADER (8 + FLST_BASE_NODE_SIZE)
                                     /* Header for the file segment where
                                     this page is placed */
    #define TRX_RSEG_UNDO_SLOTS (8 + FLST_BASE_NODE_SIZE + FSEG_HEADER_SIZE)
                                     /* Undo log segment slots */
    /*-------------------------------------------------------------*/
TRX_RSEG_HISTORY is a list base node, which is defined as follows:
    /* We define the field offsets of a base node for the list */
    #define FLST_LEN 0    /* 32-bit list length field */
    #define FLST_FIRST 4  /* 6-byte address of the first element
                          of the list; undefined if empty list */
    #define FLST_LAST (4 + FIL_ADDR_SIZE) /* 6-byte address of the
                          last element of the list; undefined
                          if empty list */

    #define FIL_ADDR_PAGE 0  /* first in address is the page offset */
    #define FIL_ADDR_BYTE 4  /* then comes 2-byte byte offset within page*/
    #endif /* !UNIV_INNOCHECKSUM */
    #define FIL_ADDR_SIZE 6  /* address size is 6 bytes */
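In addition to the first and last addresses, the base node carries an extra length field (FLST_LEN).
At this point all 128 rollback segments are initialized, and each of them contains 1024 undo segment slots.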
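The offsets in the two header definitions above can be worked out numerically. The sketch below assumes a 16 KiB page, that the rollback segment header starts at FIL_PAGE_DATA = 38, that FSEG_HEADER_SIZE is 10, that each undo slot is 4 bytes, and that TRX_RSEG_N_SLOTS is the page size divided by 16 (which gives the 1024 mentioned in the text). All k-prefixed names and both helper functions are illustrative, not InnoDB identifiers.

```cpp
#include <cstddef>

constexpr std::size_t kPageSize = 16 * 1024;   // assumed default page size
constexpr std::size_t kFilPageData = 38;       // assumed start of TRX_RSEG
constexpr std::size_t kFilAddrSize = 6;        // FIL_ADDR_SIZE
constexpr std::size_t kFlstBaseNodeSize = 4 + 2 * kFilAddrSize;  // FLST_LEN + first + last
constexpr std::size_t kFsegHeaderSize = 10;    // assumed FSEG_HEADER_SIZE
constexpr std::size_t kTrxRsegUndoSlots =
    8 + kFlstBaseNodeSize + kFsegHeaderSize;   // TRX_RSEG_UNDO_SLOTS
constexpr std::size_t kTrxRsegSlotSize = 4;    // 4-byte page no per slot
constexpr std::size_t kTrxRsegNSlots = kPageSize / 16;  // 1024 for 16 KiB pages

// Byte offset, from the start of the page, of undo slot `n`.
constexpr std::size_t undo_slot_offset(std::size_t n) {
    return kFilPageData + kTrxRsegUndoSlots + n * kTrxRsegSlotSize;
}

// Upper bound on undo segment slots across `n_rsegs` rollback segments.
constexpr std::size_t max_undo_slots(std::size_t n_rsegs) {
    return n_rsegs * kTrxRsegNSlots;
}

static_assert(kFlstBaseNodeSize == 16, "base node: 4 + 6 + 6 bytes");
static_assert(kTrxRsegNSlots == 1024, "slots per rollback segment");
// The whole slot array fits comfortably inside one page.
static_assert(undo_slot_offset(kTrxRsegNSlots) < kPageSize, "slot array overflows page");
```

Under these assumptions the first undo slot sits at byte 72 of the rollback segment header page, and 128 rollback segments provide 128 * 1024 = 131072 slots in total.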
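IV. Layout after the whole initialization is complete
To keep the diagram tidy and easy to follow, it was drawn for innodb_undo_tablespaces=2, i.e. with only 2 undo tablespaces. The same reasoning applies to 4, because the rollback segment slots are assigned to the tablespaces round-robin.
After initialization, every undo segment slot holds FIL_NULL, i.e. it points nowhere. Once undo segments are actually allocated, these slots will point to the undo segment headers.
We can also look at which block types an undo tablespace actually contains. The output of a small home-made tool is shown below: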
    ./myblock undo001 -d|more
    current read blocks is : 0 --This Block is file space header blocks!
    current read blocks is : 1 --This Block is insert buffer bitmap blocks!
    current read blocks is : 2 --This Block is inode blocks!
    current read blocks is : 3 --This Block is system blocks!
    current read blocks is : 4 --This Block is system blocks!
    current read blocks is : 5 --This Block is system blocks!
    current read blocks is : 6 --This Block is system blocks!
    current read blocks is : 7 --This Block is system blocks!
    current read blocks is : 8 --This Block is system blocks!
    current read blocks is : 9 --This Block is system blocks!
    current read blocks is : 10 --This Block is system blocks!
    current read blocks is : 11 --This Block is system blocks!
    current read blocks is : 12 --This Block is system blocks!
    current read blocks is : 13 --This Block is system blocks!
    current read blocks is : 14 --This Block is system blocks!
    current read blocks is : 15 --This Block is system blocks!
    current read blocks is : 16 --This Block is system blocks!
    current read blocks is : 17 --This Block is system blocks!
    current read blocks is : 18 --This Block is system blocks!
    current read blocks is : 19 --This Block is system blocks!
    current read blocks is : 20 --This Block is system blocks!
    current read blocks is : 21 --This Block is system blocks!
    current read blocks is : 22 --This Block is system blocks!
    current read blocks is : 23 --This Block is system blocks!
    current read blocks is : 24 --This Block is system blocks!
    current read blocks is : 25 --This Block is system blocks!
    current read blocks is : 26 --This Block is system blocks!
    current read blocks is : 27 --This Block is undo blocks!
    current read blocks is : 28 --This Block is undo blocks!
    current read blocks is : 29 --This Block is undo blocks!
    current read blocks is : 30 --This Block is undo blocks!
    current read blocks is : 31 --This Block is undo blocks!
    current read blocks is : 32 --This Block is undo blocks!
    current read blocks is : 33 --This Block is undo blocks!
    current read blocks is : 34 --This Block is undo blocks!
    current read blocks is : 35 --This Block is undo blocks!
    current read blocks is : 36 --This Block is undo blocks!
    current read blocks is : 37 --This Block is undo blocks!
    current read blocks is : 38 --This Block is new allocate blocks!
    current read blocks is : 39 --This Block is new allocate blocks!
    current read blocks is : 40 --This Block is new allocate blocks!
    current read blocks is : 41 --This Block is new allocate blocks!
    current read blocks is : 42 --This Block is new allocate blocks!
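Here block 3 through block 26 are the rollback segment header blocks. This instance has 4 undo tablespaces, and the file examined is undo tablespace 1. That is 24 blocks, which is consistent with 95 regular rollback segments spread round-robin over 4 tablespaces, so the analysis holds.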
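The author's myblock tool is not shown, but the classification it prints can be approximated by reading the 2-byte page type from each page. The sketch below is my own and not the tool's code: it assumes the page type is stored big-endian at byte offset 24 (FIL_PAGE_TYPE) and uses the page type values as I know them from the 5.7 fil0fil.h header. The labels simply mimic the output above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed offset of FIL_PAGE_TYPE within the page header.
constexpr std::size_t kFilPageType = 24;

// Read the 2-byte big-endian page type from a raw page buffer.
inline std::uint16_t read_page_type(const unsigned char* page) {
    return static_cast<std::uint16_t>((page[kFilPageType] << 8) |
                                      page[kFilPageType + 1]);
}

// Map a page type value to a label similar to the tool output above.
inline std::string describe_page_type(std::uint16_t type) {
    switch (type) {
        case 0: return "new allocate";          // FIL_PAGE_TYPE_ALLOCATED
        case 2: return "undo";                  // FIL_PAGE_UNDO_LOG
        case 3: return "inode";                 // FIL_PAGE_INODE
        case 5: return "insert buffer bitmap";  // FIL_PAGE_IBUF_BITMAP
        case 6: return "system";                // FIL_PAGE_TYPE_SYS
        case 8: return "file space header";     // FIL_PAGE_TYPE_FSP_HDR
        default: return "other";
    }
}
```

Reading an undo file in 16 KiB chunks and passing each chunk to read_page_type would give one label per block; the rollback segment header pages show up as "system" because they are created through fseg_create.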
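V. Summary
- Regular undo segments are linked as follows: the system segment header in block 5 of ibdata uses rollback segment slots 33-127 to point, round-robin, to rollback segment headers in the different undo tablespaces. Each rollback segment header in turn has 1024 slots that point to the actual undo segment headers. The actual undo blocks are attached to a list under the undo segment header.
- The number of undo tablespaces can only be changed by re-initializing the instance, because the space ids are fixed, so choose the value carefully.
- innodb_undo_tablespaces is the number of undo tablespaces and innodb_rollback_segments is the number of rollback segments. The parameter innodb_undo_logs is deprecated; it does the same thing as innodb_rollback_segments, and both default to 128.
- Rollback segment slot 0 always lives in ibdata, slots 1-32 are temporary rollback segments, and only slots 33-127 are rollback segments for regular transactions.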
References:
Alibaba kernel monthly report: http://mysql.taobao.org/monthly/2015/04/01/
Author's WeChat: gp_22389860
Source: ITPUB blog, link: http://blog.itpub.net/7728585/viewspace-2565019/. Please credit the source when reposting; otherwise legal action may be taken.