PostgreSQL DBA(19) - REDO point淺析
What's REDO point?
REDO point is the point PostgreSQL start to recover from;that is, the location to write the XLOG record at the moment when the latest checkpoint is started.
REDO point是PostgreSQL啟動恢復的起始點,也就是說REDO point是最後一次checkpoint啟動時寫XLOG Record的位置.
一、REDO point的儲存
checkpointer程式啟動時,從pg_control檔案中獲取Redo point並儲存在記憶體中,在執行checkpoint時更新Redo point為當前即將寫XLOG Record的位置,checkpoint執行成功後把Redo point更新到pg_control檔案中.
類似的,資料庫啟動需要執行恢復時,從pg_control檔案中獲取REDO point進行恢復.
pg_control檔案位於$PGDATA/global目錄中,可透過命令pg_controldata檢視檔案中的內容.
[xdb@localhost pg111db]$ find ./ -name pg_control
./global/pg_control
[xdb@localhost pg111db]$ pg_controldata
pg_control version number: 1100
Catalog version number: 201809051
Database system identifier: 6624362124887945794
Database cluster state: in production
pg_control last modified: Thu 20 Dec 2018 03:34:05 PM CST
Latest checkpoint location: 1/48447DF0
Latest checkpoint's REDO location: 1/48447DF0
Latest checkpoint's REDO WAL file: 000000010000000100000048
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:1979
Latest checkpoint's NextOID: 25238
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 561
Latest checkpoint's oldestXID's DB: 16402
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 16402
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Thu 20 Dec 2018 03:34:05 PM CST
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: minimal
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: 90bf37566703859a557b7f20688eb944b6335b5c3d36f5530941ebf1dfa777c1
[xdb@localhost pg111db]$
其中
Latest checkpoint's REDO location: 1/48447DF0
記錄的資訊即為REDO point.
二、REDO point的變更
1.檢視當前的REDO point
[xdb@localhost pg111db]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location: 1/48448150
REDO point為1/48448150.
2.執行DML操作
插入3條記錄
testdb=# insert into cp values(7);
INSERT 0 1
testdb=# insert into cp values(8);
INSERT 0 1
testdb=# insert into cp values(9);
INSERT 0 1
檢視1/48448150後的XLOG Record
[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/48448150
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 1/48448150, prev 1/484480E0, desc: CHECKPOINT_ONLINE redo 1/48448150; tli 1; prev tli 1; fpw true; xid 0:1982; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
rmgr: Heap len (rec/tot): 54/ 474, tx: 1982, lsn: 1/484481C0, prev 1/48448150, desc: INSERT off 11, blkref #0: rel 1663/16402/17046 blk 0 FPW
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1982, lsn: 1/484483A0, prev 1/484481C0, desc: COMMIT 2018-12-20 16:10:17.471639 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1983, lsn: 1/484483C8, prev 1/484483A0, desc: INSERT off 12, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1983, lsn: 1/48448408, prev 1/484483C8, desc: COMMIT 2018-12-20 16:10:20.170594 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1984, lsn: 1/48448430, prev 1/48448408, desc: INSERT off 13, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1984, lsn: 1/48448470, prev 1/48448430, desc: COMMIT 2018-12-20 16:10:22.268365 CST
pg_waldump: FATAL: error in WAL record at 1/48448470: invalid record length at 1/48448498: wanted 24, got 0
最後一個XLOG Record記錄的位置為1/48448470,加上記錄大小34(十六進位制為0x22),位置為1/48448492,按理論上來說,如果現在執行checkpoint,該位置為REDO point.
3.執行checkpoint
testdb=# checkpoint;
CHECKPOINT
檢視pg_control檔案內容
[xdb@localhost pg111db]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location: 1/48448498
再次檢視XLOG Record記錄
[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/48448150
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 1/48448150, prev 1/484480E0, desc: CHECKPOINT_ONLINE redo 1/48448150; tli 1; prev tli 1; fpw true; xid 0:1982; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
rmgr: Heap len (rec/tot): 54/ 474, tx: 1982, lsn: 1/484481C0, prev 1/48448150, desc: INSERT off 11, blkref #0: rel 1663/16402/17046 blk 0 FPW
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1982, lsn: 1/484483A0, prev 1/484481C0, desc: COMMIT 2018-12-20 16:10:17.471639 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1983, lsn: 1/484483C8, prev 1/484483A0, desc: INSERT off 12, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1983, lsn: 1/48448408, prev 1/484483C8, desc: COMMIT 2018-12-20 16:10:20.170594 CST
rmgr: Heap len (rec/tot): 59/ 59, tx: 1984, lsn: 1/48448430, prev 1/48448408, desc: INSERT off 13, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1984, lsn: 1/48448470, prev 1/48448430, desc: COMMIT 2018-12-20 16:10:22.268365 CST
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 1/48448498, prev 1/48448470, desc: CHECKPOINT_ONLINE redo 1/48448498; tli 1; prev tli 1; fpw true; xid 0:1985; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
pg_waldump: FATAL: error in WAL record at 1/48448498: invalid record length at 1/48448508: wanted 24, got 0
[xdb@localhost pg_wal]$
可以看到1/48448498是最後一條checkpoint記錄的起始寫入位置,REDO point為1/48448498,比預想的要多出6個位元組(這6個位元組都是0x00,用於補齊?).
4.dump WAL segment file
再次使用hexdump工具檢視WAL segment file檔案.
1/48448470開始的"COMMIT"記錄
[xdb@localhost pg_wal]$ echo "obase=10;ibase=16;448470"|bc
4490352 --> 檔案位置偏移
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490352 -n 34
00448470 22 00 00 00 c0 07 00 00 30 84 44 48 01 00 00 00 |".......0.DH....|
00448480 00 01 00 00 83 5e 7a d4 ff 08 cd 2f 4a 6e 6e 20 |.....^z..../Jnn |
00448490 02 00 |..|
00448492
XLOG Record的頭部首先是XLogRecord結構體,第一個域是uint32的record的大小,即0x00000022,十進位制為34(大小無異).
後續的6個位元組
均為0x00
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490386 -n 6
00448492 00 00 00 00 00 00 |......|
00448498
XLOG Record for checkpoint
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490392 -n 106
00448498 6a 00 00 00 00 00 00 00 70 84 44 48 01 00 00 00 |j.......p.DH....|
004484a8 10 00 00 00 ac d2 8b 95 ff 50 98 84 44 48 01 00 |.........P..DH..|
004484b8 00 00 01 00 00 00 01 00 00 00 01 00 00 00 00 00 |................|
004484c8 00 00 c1 07 00 00 96 62 00 00 01 00 00 00 00 00 |.......b........|
004484d8 00 00 31 02 00 00 12 40 00 00 01 00 00 00 12 40 |..1....@.......@|
004484e8 00 00 ba 4e 1b 5c 00 00 00 00 00 00 00 00 00 00 |...N.\..........|
004484f8 00 00 00 00 00 00 00 00 00 00 |..........|
00448502
大小為0x0000006A,即106B(頭部XLogRecord 24B + XLogRecordDataHeaderShort 2B + Checkpoint 80B ).
從FF 50為XLogRecordDataHeaderShort結構體的內容,0xFF為標誌位,0x50為Data的大小(即80B).
checkpoint記錄的內容詳見Checkpoint結構體,該結構體第一個域欄位為8個位元組的LSN-->0x00000001 48448498,
即REDO point:1/48448498.
三、參考資料
PostgreSQL 原始碼解讀(109)- WAL#5(相關資料結構)
PostgreSQL DBA(15) - WAL檔案結構
PostgreSQL DBA(16) - WAL segment file內部結構
PostgreSQL DBA(17) - XLOG Record data內部結構
PostgreSQL DBA(18) - pg_waldump工具簡介
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/6906/viewspace-2374774/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- PostgreSQL DBA(20) - WAL full-page-write淺析SQL
- 淺析MySQL事務中的redo與undoMySql
- PostgreSQL MVCC快照機制淺析SQLMVC
- PostgreSQL DBA(45) - Hypothetical Indexes in PostgreSQLSQLIndex
- PostgreSQL DBA(58) - DBLinkSQL
- PostgreSQL DBA(35) - CTESQL
- PostgreSQL DBA(42) - localeSQL
- PostgreSQL DBA(191) - CollationSQL
- PostgreSQL DBA(185) - watchSQL
- PostgreSQL DBA(182) - HOTSQL
- PostgreSQL DBA(133) - Extension(postgresql_anonymizer)SQL
- PostgreSQL DBA(181) - Using PostgreSQL as a Data WarehouseSQL
- PostgreSQL DBA(91) - PG upgradeSQL
- PostgreSQL DBA(59) - Index(Bloom)SQLIndexOOM
- PostgreSQL DBA(150) - Extension(pgmetrics)SQL
- PostgreSQL DBA(52) - Index(RUM)SQLIndex
- PostgreSQL DBA(47) - Index(Btree)SQLIndex
- PostgreSQL DBA(48) - Index(GiST)SQLIndex
- PostgreSQL DBA(51) - Index(GIN)SQLIndex
- PostgreSQL DBA(43) - Index(Hash)SQLIndex
- PostgreSQL DBA(53) - Index(BRIN)SQLIndex
- PostgreSQL DBA(187) - TCP keepaliveSQLTCP
- PostgreSQL DBA(186) - SQL Group BySQL
- PostgreSQL DBA(184) - Extension(hypoPG)SQL
- PostgreSQL DBA(142) - PG 12(Monitoring PostgreSQL VACUUM processes)SQL
- PostgreSQL DBA(7) - pgbench簡介SQL
- PostgreSQL DBA(8) - sysbench簡介SQL
- PostgreSQL DBA(136) - Develop(Common Mistakes)SQLdev
- PostgreSQL DBA(125) - PG 12(TPCC)SQL
- PostgreSQL DBA(122) - Develop(EDB package)SQLdevPackage
- PostgreSQL DBA(120) - pgAdmin(HA with Patroni)SQL
- PostgreSQL DBA(86) - Table Parameter(fillfactor)SQL
- PostgreSQL DBA(87) - benchmarksql簡介SQL
- PostgreSQL DBA(80) - Object Identifier TypesSQLObjectIDE
- PostgreSQL DBA(73) - dump函式SQL函式
- PostgreSQL DBA(77) - Locks(Lock Monitoring)SQL
- PostgreSQL DBA(158) - Locks(tuple locks)SQL
- PostgreSQL DBA(168) - pgAdmin(config:checkpoint)SQL