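InnoDB undo: a brief analysis of the undo structure
My knowledge is limited, so please point out any mistakes.
References:
- Alibaba Kernel Monthly Report
- Mr. Jiang's book "MySQL Kernel: InnoDB Storage Engine"
These are brief notes, kept for my own reference.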
1. Overall structure
rollback segments (128)
  undo segments (1024)
    undo log (header; insert and modify logs are kept separate) <-> undo page
      undo record
      undo record
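The first undo page of an undo segment can hold the undo logs of several transactions. If the undo records on that page fill less than 3/4 of it, the segment is put on the rollback segment's cached list and can be reused by a later transaction. If the first page cannot hold all of a transaction's undo records, a new undo page has to be allocated, and such a page holds the undo records of one transaction only.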
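The reuse rule above can be written as a small predicate. This is only a sketch of the rule as stated in these notes: the function name `can_cache_for_reuse`, its parameters, and the 16 KB default page size are assumptions made for illustration, not InnoDB identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an InnoDB function): decides whether an undo
// segment could be put on the rollback segment's cached list for reuse.
// Assumption: the segment must consist of a single page, and the first free
// byte on that page must lie below 3/4 of the page size.
bool can_cache_for_reuse(std::uint32_t pages_in_segment,
                         std::uint32_t first_free_offset,
                         std::uint32_t page_size = 16384) {
    const std::uint32_t reuse_limit = 3 * page_size / 4;
    return pages_in_segment == 1 && first_free_offset < reuse_limit;
}
```

With a 16 KB page the limit is 12288 bytes, so a one-page segment whose free offset is still small would be cached, while a segment that has grown to two pages would not.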
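A transaction is first assigned a rollback segment, then an undo segment, and then the undo log header is initialized. Inserts and updates/deletes are assigned different undo segments. One undo segment usually corresponds to one undo log, and an undo log can contain many undo records (debugging shows that the undo log header is initialized only once). Every row that is operated on leaves one undo record, which is the basis on which MVCC builds historical versions.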
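The basic unit of undo generation is the undo record. Each modified row has one undo record, and the row's rollback ptr points to the offset of that undo record. Visibility is checked for every row, and if an earlier version has to be built, it is built by following this pointer. The pointer contains:
- bit 1: whether this is an insert; bits 2 to 8: the rollback segment id (the code passes undo_ptr->rseg->id); bits 9 to 40: the page no; bits 41 to 56: the offset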
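The bit layout listed above can be sketched as a pair of pack/unpack helpers. The struct and function names here are made up for illustration (InnoDB's own builder is trx_undo_build_roll_ptr); the only thing taken from the notes is the 1 + 7 + 32 + 16 bit split, counted from the most significant bit of the 56-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Fields of a 56-bit (7-byte) rollback pointer.
struct RollPtr {
    bool is_insert;         // bit 1 (most significant of the 56 bits)
    std::uint32_t rseg_id;  // bits 2..8   (7 bits)
    std::uint32_t page_no;  // bits 9..40  (32 bits)
    std::uint32_t offset;   // bits 41..56 (16 bits)
};

// Illustrative packer; the name is hypothetical.
std::uint64_t pack_roll_ptr(const RollPtr& r) {
    return (static_cast<std::uint64_t>(r.is_insert ? 1 : 0) << 55)
         | (static_cast<std::uint64_t>(r.rseg_id & 0x7F) << 48)
         | (static_cast<std::uint64_t>(r.page_no) << 16)
         | static_cast<std::uint64_t>(r.offset & 0xFFFF);
}

// Illustrative unpacker; the inverse of pack_roll_ptr.
RollPtr unpack_roll_ptr(std::uint64_t v) {
    RollPtr r;
    r.is_insert = ((v >> 55) & 1) != 0;
    r.rseg_id = static_cast<std::uint32_t>((v >> 48) & 0x7F);
    r.page_no = static_cast<std::uint32_t>((v >> 16) & 0xFFFFFFFFu);
    r.offset = static_cast<std::uint32_t>(v & 0xFFFF);
    return r;
}
```

Applied to the bytes 260000002c052e that show up in the update undo record dump later in this article, `unpack_roll_ptr(0x260000002c052eULL)` gives is_insert = false, rseg_id = 38, page_no = 44, offset = 1326.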
Each undo log has a trx_undo_t struct.
Each rollback segment has a trx_rseg_t struct.
2. Physical structure
- undo page header: present on every undo page
/** Transaction undo log page header offsets */
/* @{ */
#define TRX_UNDO_PAGE_TYPE 0   /*!< TRX_UNDO_INSERT or TRX_UNDO_UPDATE */
#define TRX_UNDO_PAGE_START 2  /*!< Byte offset where the undo log records
                                for the LATEST transaction start on this
                                page (remember that in an update undo log,
                                the first page can contain several undo
                                logs) */
#define TRX_UNDO_PAGE_FREE 4   /*!< On each page of the undo log this field
                                contains the byte offset of the first free
                                byte on the page */
#define TRX_UNDO_PAGE_NODE 6   /*!< The file list node in the chain of undo
                                log pages */
- undo segment header: only the first page of the segment carries the undo segment header
#define TRX_UNDO_STATE 0       /*!< TRX_UNDO_ACTIVE, ... */
#ifndef UNIV_INNOCHECKSUM
#define TRX_UNDO_LAST_LOG 2    /*!< Offset of the last undo log header on
                                the segment header page, 0 if none */
#define TRX_UNDO_FSEG_HEADER 4 /*!< Header for the file segment which the
                                undo log segment occupies */
#define TRX_UNDO_PAGE_LIST (4 + FSEG_HEADER_SIZE)
                               /*!< Base node for the list of pages in the
                                undo log segment; defined only on the undo
                                log segment's first page */
- Each undo log consists of:
  - undo log header
  - undo log record: the actual undo content
  - undo log record: the actual undo content
The undo log header contains:
#define TRX_UNDO_TRX_ID 0        /*!< Transaction id */
#define TRX_UNDO_TRX_NO 8        /*!< Transaction number of the transaction;
                                  defined only if the log is in a history
                                  list */
#define TRX_UNDO_DEL_MARKS 16    /*!< Defined only in an update undo log:
                                  TRUE if the transaction may have done
                                  delete markings of records, and thus
                                  purge is necessary */
#define TRX_UNDO_LOG_START 18    /*!< Offset of the first undo log record of
                                  this log on the header page; purge may
                                  remove undo log record from the log start,
                                  and therefore this is not necessarily the
                                  same as this log header end offset */
#define TRX_UNDO_XID_EXISTS 20   /*!< TRUE if undo log header includes
                                  X/Open XA transaction identification
                                  XID */
#define TRX_UNDO_DICT_TRANS 21   /*!< TRUE if the transaction is a table
                                  create, index create, or drop
                                  transaction: in recovery the transaction
                                  cannot be rolled back in the usual way:
                                  a 'rollback' rather means dropping the
                                  created or dropped table, if it still
                                  exists */
#define TRX_UNDO_TABLE_ID 22     /*!< Id of the table if the preceding field
                                  is TRUE */
#define TRX_UNDO_NEXT_LOG 30     /*!< Offset of the next undo log header on
                                  this page, 0 if none */
#define TRX_UNDO_PREV_LOG 32     /*!< Offset of the previous undo log header
                                  on this page, 0 if none */
#define TRX_UNDO_HISTORY_NODE 34 /*!< If the log is put to the history list,
                                  the file list node is here */
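To see how the three headers stack up on the first page of an undo segment, the offsets can be added up. The sketch below is an assumption-laden illustration: the four size constants (38, 12, 16, 10) come from my reading of the 5.7 source and are not quoted in the excerpts above, and the `k...` names and helper functions are made up. Under those assumptions the first undo log header lands at byte 86, which agrees with the "UNDO LOG HEADER OFFSET:86" printed in the insert example of this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed sizes (my reading of the 5.7 source, not quoted in this article).
constexpr std::size_t kFilPageData      = 38;  // FIL_PAGE_DATA
constexpr std::size_t kFlstNodeSize     = 12;  // FLST_NODE_SIZE
constexpr std::size_t kFlstBaseNodeSize = 16;  // FLST_BASE_NODE_SIZE
constexpr std::size_t kFsegHeaderSize   = 10;  // FSEG_HEADER_SIZE

// Offsets quoted in the article.
constexpr std::size_t kTrxUndoPageFree = 4;                    // TRX_UNDO_PAGE_FREE
constexpr std::size_t kTrxUndoPageNode = 6;                    // TRX_UNDO_PAGE_NODE
constexpr std::size_t kTrxUndoPageList = 4 + kFsegHeaderSize;  // TRX_UNDO_PAGE_LIST

// Undo page header starts right after the file page header.
constexpr std::size_t kUndoPageHdr = kFilPageData;
// Undo segment header follows the page header (last field: a list node).
constexpr std::size_t kUndoSegHdr = kUndoPageHdr + kTrxUndoPageNode + kFlstNodeSize;
// First undo log header follows the segment header (last field: a list base node).
constexpr std::size_t kFirstLogHdr = kUndoSegHdr + kTrxUndoPageList + kFlstBaseNodeSize;

// Reads a 2-byte big-endian field, the byte order InnoDB uses on disk.
std::uint16_t read_u16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Returns the TRX_UNDO_PAGE_FREE field of an undo page image.
std::uint16_t undo_page_free(const unsigned char* page) {
    return read_u16(page + kUndoPageHdr + kTrxUndoPageFree);
}
```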
3. Allocation steps and writing
- Step 1: assign a rollback segment
#0 get_next_redo_rseg (max_undo_logs=128, n_tablespaces=4) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:1138
#1 0x0000000001c0bce8 in trx_assign_rseg_low (max_undo_logs=128, n_tablespaces=4, rseg_type=TRX_RSEG_TYPE_REDO) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:1314
#2 0x0000000001c1097d in trx_set_rw_mode (trx=0x7fffd7804080) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0trx.cc:3352
#3 0x0000000001a64013 in lock_table (flags=0, table=0x7ffeac012ae0, mode=LOCK_IX, thr=0x7ffe7c92ef48) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/lock/lock0lock.cc:4139
#4 0x0000000001b7950e in row_search_mvcc (buf=0x7ffe7c92e350 "\377", mode=PAGE_CUR_GE, prebuilt=0x7ffe7c92e7d0, match_mode=1, direction=0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0sel.cc:5100
#5 0x00000000019d5443 in ha_innobase::index_read (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key_ptr=0x7ffe7cd57590 "\004", key_len=4, find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:9536
#6 0x0000000000f9345a in handler::index_read_map (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key=0x7ffe7cd57590 "\004", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.h:2942
#7 0x0000000000f83e44 in handler::ha_index_read_map (this=0x7ffe7c92de10, buf=0x7ffe7c92e350 "\377", key=0x7ffe7cd57590 "\004", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.cc:3248
- Step 2: every row change on the primary key (clustered index) calls trx_undo_report_row_operation, which assigns the undo segment and also writes the undo record
#0 trx_undo_report_row_operation (flags=0, op_type=2, thr=0x7ffe7c932828, index=0x7ffea4016590, clust_entry=0x7ffe7c932cc0, update=0x0, cmpl_info=0, rec=0x7fffb580d369 "", offsets=0x7fffec0f3e00, roll_ptr=0x7fffec0f3688) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/trx/trx0rec.cc:1866
#1 0x0000000001c5795b in btr_cur_del_mark_set_clust_rec (flags=0, block=0x7fffb4ccaae0, rec=0x7fffb580d369 "", index=0x7ffea4016590, offsets=0x7fffec0f3e00, thr=0x7ffe7c932828, entry=0x7ffe7c932cc0, mtr=0x7fffec0f38f0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/btr/btr0cur.cc:4894
#2 0x0000000001b9f218 in row_upd_del_mark_clust_rec (flags=0, node=0x7ffe7c932550, index=0x7ffea4016590, offsets=0x7fffec0f3e00, thr=0x7ffe7c932828, referenced=0, mtr=0x7fffec0f38f0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:2778
#3 0x0000000001b9f765 in row_upd_clust_step (node=0x7ffe7c932550, thr=0x7ffe7c932828) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:2923
#4 0x0000000001b9fc74 in row_upd (node=0x7ffe7c932550, thr=0x7ffe7c932828) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3042
#5 0x0000000001ba0155 in row_upd_step (thr=0x7ffe7c932828) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0upd.cc:3188
#6 0x0000000001b3d3a0 in row_update_for_mysql_using_upd_graph (mysql_rec=0x7ffe7c9318d0 "\375\001", prebuilt=0x7ffe7c931d50) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0mysql.cc:3040
#7 0x0000000001b3d6a1 in row_update_for_mysql (mysql_rec=0x7ffe7c9318d0 "\375\001", prebuilt=0x7ffe7c931d50) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/row/row0mysql.cc:3131
#8 0x00000000019d47c3 in ha_innobase::delete_row (this=0x7ffe7c931390, record=0x7ffe7c9318d0 "\375\001") at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:9141
The rough flow:
switch (op_type) {
case TRX_UNDO_INSERT_OP:
    undo = undo_ptr->insert_undo; // for an insert, use insert_undo (a trx_undo_t pointer)
    if (undo == NULL) { // nothing to assign if it was already assigned
        err = trx_undo_assign_undo( // assign the undo segment and initialize the undo log header
            trx, undo_ptr, TRX_UNDO_INSERT);
        undo = undo_ptr->insert_undo;
        ...
    }
    break;
default:
    ut_ad(op_type == TRX_UNDO_MODIFY_OP); // assertion
    undo = undo_ptr->update_undo;
    if (undo == NULL) {
        err = trx_undo_assign_undo(
            trx, undo_ptr, TRX_UNDO_UPDATE); // assign the undo segment and initialize the undo log header
        undo = undo_ptr->update_undo;
        ...
    }
...
case TRX_UNDO_INSERT_OP: // note: this runs for every row
    offset = trx_undo_page_report_insert( // write the insert undo log record
        undo_page, trx, index, clust_entry, &mtr);
    break;
default:
    ut_ad(op_type == TRX_UNDO_MODIFY_OP); // write the delete/update undo log record
    offset = trx_undo_page_report_modify(
        undo_page, trx, index, rec, offsets, update, cmpl_info, clust_entry, &mtr);
}
...
*roll_ptr = trx_undo_build_roll_ptr( // build the rollback ptr; every row in the primary key has one, and MVCC uses it to build earlier versions
    op_type == TRX_UNDO_INSERT_OP,
    undo_ptr->rseg->id, page_no, offset);
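The shape of this flow (assign the undo log once per kind, then append one record per row) can be mimicked with a toy model. Everything in the sketch below is hypothetical: `ToyUndoLog`, `ToyTrx`, and `report_row_operation` are stand-ins invented for illustration and do not reproduce InnoDB's data structures or page handling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class OpType { Insert, Modify };

// Toy stand-in for an undo log: tracks how often its header was set up
// and which rows left an undo record in it.
struct ToyUndoLog {
    bool assigned = false;
    int header_inits = 0;
    std::vector<int> records;  // one entry per modified row
};

// Toy stand-in for a transaction with separate insert and update undo logs.
struct ToyTrx {
    ToyUndoLog insert_undo;
    ToyUndoLog update_undo;
};

// Assigns the matching undo log on first use, then appends one record for
// the row. Returns the index of the new record within that log.
std::size_t report_row_operation(ToyTrx& trx, OpType op, int row_id) {
    ToyUndoLog& undo = (op == OpType::Insert) ? trx.insert_undo : trx.update_undo;
    if (!undo.assigned) {  // only the first call for this kind pays for setup
        undo.assigned = true;
        undo.header_inits += 1;
    }
    undo.records.push_back(row_id);
    return undo.records.size() - 1;
}
```

Calling it twice with `OpType::Insert` and once with `OpType::Modify` on the same `ToyTrx` leaves each log with exactly one header initialization, with two records in the insert log and one in the update log.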
4. Breaking down an undo log record
I made the server write each undo log record to the error log; a simple breakdown follows.
The table structure is:
mysql> show create table t1;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `id1` int(11) NOT NULL,
  `id2` int(11) DEFAULT NULL,
  PRIMARY KEY (`id1`),
  KEY `id2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
- The insert undo record, which is built in trx_undo_page_report_insert
Statement:
mysql> insert into t1 values(28,28);
Query OK, 1 row affected (0.00 sec)
Output:
trx_undo_assign_undo:assign undo space: RSEG SLOT:34,RSEG SPACE ID:2 PAGE NO:3
UNDO SLOT:0,UNDO SPACE ID:2 UNDO LOG HEADER PAGE NO:27,UNDO LOG HEADER OFFSET:86,UNDO LOG LAST PAGE:27
trx_undo_page_report_insert:undo log record TABLE_NAME:test/t1 TRX_ID:12591,UODO RECORD LEN:10
len 10; hex 011e0b0032048000001c;
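011e0b0032048000001c is the actual undo record. It breaks down as follows:
011e: position within the page where this undo record ends
0b: type, #define TRX_UNDO_INSERT_REC 11 (0x0b)
00: undo no, the sequence number of this undo record within the transaction (0 for the first record)
32: table_id, which can be checked against INNODB_SYS_TABLES
04: field length, 4 bytes
8000001c: primary key of the row I inserted, 28 (0x1c), stored with the sign bit flipped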
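The breakdown above can be replayed in code. This is a minimal sketch that handles exactly this dump and nothing more: `from_hex`, `InsertUndoRec`, and `parse_insert_rec` are names invented here, and the parser assumes that the undo no and table id each fit in a single byte and that the primary key is one 4-byte INT column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Converts a hex string such as "011e0b" into raw bytes.
std::vector<std::uint8_t> from_hex(const std::string& hex) {
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        out.push_back(static_cast<std::uint8_t>(
            std::stoul(hex.substr(i, 2), nullptr, 16)));
    }
    return out;
}

struct InsertUndoRec {
    std::uint16_t next_offset;  // where this record ends on the page
    std::uint8_t type;          // 11 means TRX_UNDO_INSERT_REC
    std::uint8_t undo_no;       // assumed to fit in one byte
    std::uint8_t table_id;      // assumed to fit in one byte
    std::int32_t pk;            // decoded 4-byte INT primary key
};

// Parses the 10-byte insert undo record shown in the dump.
InsertUndoRec parse_insert_rec(const std::vector<std::uint8_t>& b) {
    if (b.size() < 10 || b[5] != 4) {
        throw std::invalid_argument("expected a record with a 4-byte key");
    }
    InsertUndoRec r{};
    r.next_offset = static_cast<std::uint16_t>((b[0] << 8) | b[1]);
    r.type = static_cast<std::uint8_t>(b[2] & 0x0F);  // low 4 bits hold the type
    r.undo_no = b[3];
    r.table_id = b[4];
    const std::uint32_t raw = (static_cast<std::uint32_t>(b[6]) << 24)
                            | (static_cast<std::uint32_t>(b[7]) << 16)
                            | (static_cast<std::uint32_t>(b[8]) << 8)
                            | static_cast<std::uint32_t>(b[9]);
    // Signed integers are stored with the sign bit flipped; flip it back.
    r.pk = static_cast<std::int32_t>(raw ^ 0x80000000u);
    return r;
}
```

Feeding it `from_hex("011e0b0032048000001c")` yields type 11, undo no 0, table id 0x32, and primary key 28.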
- The update undo record, which is built in trx_undo_page_report_modify
Statement:
mysql> update t1 set id2=1000 where id1=14;
Query OK, 1 row affected (5 min 40.91 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Output:
trx_undo_assign_undo:assign undo space: RSEG SLOT:41,RSEG SPACE ID:1 PAGE NO:5
UNDO SLOT:1,UNDO SPACE ID:1 UNDO LOG HEADER PAGE NO:37,UNDO LOG HEADER OFFSET:1389,UNDO LOG LAST PAGE:37
trx_undo_page_report_modify:undo log record TABLE_NAME:test/t1 TRX_ID:12604,UODO RECORD LEN:47
len 47; hex 06560c0032000000003136e0260000002c052e048000000e010304800003e7000e00048000000e0304800003e70627;
06560c0032000000003136e0260000002c052e048000000e010304800003e7000e00048000000e0304800003e70627
is the undo record.
A rough breakdown:
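0656: position within the page where this undo record ends
0c: type, #define TRX_UNDO_UPD_EXIST_REC 12 (0x0c)
00: undo no, the sequence number of this undo record within the transaction (0 for the first record)
32: table_id, which can be checked against INNODB_SYS_TABLES
00: info bits of the record
0000003136: transaction ID of the previous version, written in compressed form (0x3136 = 12598)
e0260000002c052e: undo rollback pointer of the previous version, written in compressed form (value 0x260000002c052e)
04: primary key length
8000000e: primary key value, 14 (0x0e)
01: number of updated fields
03: field position
04: length of the old value
800003e7: old value, 999 (0x3e7)
000e: length of the section that follows, 14 bytes including these two; it appears to hold the old values of the columns that are ordering fields of some index
00: field position
04: length
8000000e: primary key value
03: field position
04: length
800003e7: value 999 (0x3e7)
0627: position within the page where this undo record starts; 0x0656 - 0x0627 = 47 is the record length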
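The first 19 bytes of this record can be decoded in code. The sketch rests on an assumption about the 5.7 format as I understand it: undo no, table id, transaction ID, and rollback pointer are written as variable-length "compressed" integers whose leading bits give the length, and the two 64-bit values are a compressed high half followed by 4 plain bytes. Read that way, the transaction ID takes 5 bytes (00 00003136, i.e. 12598) and the rollback pointer 8 bytes (e0260000 002c052e, i.e. 0x260000002c052e). `Reader`, `UpdateUndoHead`, and `parse_update_head` are invented names, and the 0xFF-prefixed long form of undo no and table id is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Cursor over the record bytes.
struct Reader {
    const std::vector<std::uint8_t>& buf;
    std::size_t pos;

    // Reads n (at most 4) bytes as a big-endian integer.
    std::uint32_t take(std::size_t n) {
        if (pos + n > buf.size()) throw std::out_of_range("short undo record");
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < n; ++i) v = (v << 8) | buf[pos++];
        return v;
    }

    // Variable-length 32-bit integer: the first byte's prefix gives the length.
    std::uint32_t compressed() {
        if (pos >= buf.size()) throw std::out_of_range("short undo record");
        const std::uint8_t first = buf[pos];
        if (first < 0x80) return take(1);               // 0nnnnnnn
        if (first < 0xC0) return take(2) & 0x3FFF;      // 10nnnnnn + 1 byte
        if (first < 0xE0) return take(3) & 0x1FFFFF;    // 110nnnnn + 2 bytes
        if (first < 0xF0) return take(4) & 0x0FFFFFFF;  // 1110nnnn + 3 bytes
        take(1);                                        // marker byte
        return take(4);
    }

    // 64-bit value: compressed high 32 bits, then 4 plain bytes.
    std::uint64_t compressed64() {
        const std::uint64_t high = compressed();
        const std::uint64_t low = take(4);
        return (high << 32) | low;
    }
};

struct UpdateUndoHead {
    std::uint16_t next_offset;
    std::uint8_t type;
    std::uint32_t undo_no;
    std::uint32_t table_id;
    std::uint8_t info_bits;
    std::uint64_t trx_id;
    std::uint64_t roll_ptr;
};

// Parses the leading fields of an update undo record.
UpdateUndoHead parse_update_head(const std::vector<std::uint8_t>& bytes) {
    Reader rd{bytes, 0};
    UpdateUndoHead h{};
    h.next_offset = static_cast<std::uint16_t>(rd.take(2));
    h.type = static_cast<std::uint8_t>(rd.take(1) & 0x0F);
    h.undo_no = rd.compressed();
    h.table_id = rd.compressed();
    h.info_bits = static_cast<std::uint8_t>(rd.take(1));
    h.trx_id = rd.compressed64();
    h.roll_ptr = rd.compressed64();
    return h;
}
```

A transaction ID of 12598 also sits plausibly between the two transactions in these examples (12591 for the insert and 12604 for this update).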
Author's WeChat: gp_22389860
From the "ITPUB blog", link: http://blog.itpub.net/7728585/viewspace-2636904/. If you repost this article, please credit the source; otherwise legal action may be taken.