PostgreSQL10.0preview功能增強-邏輯複製支援並行COPY初始化資料
標籤
PostgreSQL , 10.0 , 邏輯複製 , 初始資料COPY
背景
PostgreSQL 已支援邏輯複製,同時對邏輯複製增加了一個初始同步的增強功能,支援通過wal receiver協議跑COPY命令(已封裝在邏輯複製的核心程式碼中),支援多表並行。
也就是說,你可以使用PostgreSQL的邏輯複製,快速的(流式、並行)將一個例項遷移到另一個例項。
Logical replication support for initial data copy
Add functionality for a new subscription to copy the initial data in the
tables and then sync with the ongoing apply process.
For the copying, add a new internal COPY option to have the COPY source
data provided by a callback function. The initial data copy works on
the subscriber by receiving COPY data from the publisher and then
providing it locally into a COPY that writes to the destination table.
A WAL receiver can now execute full SQL commands. This is used here to
obtain information about tables and publications.
Several new options were added to CREATE and ALTER SUBSCRIPTION to
control whether and when initial table syncing happens.
Change pg_dump option --no-create-subscription-slots to
--no-subscription-connect and use the new CREATE SUBSCRIPTION
... NOCONNECT option for that.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Erik Rijkers <er@xs4all.nl>
邏輯複製包含的初始化COPY的流程如下
主庫開啟事務快照(快照支援在多個會話間共享, 這也是PostgreSQL的獨門祕籍之一), COPY資料, COPY結束後釋放快照, 從快照對應的WAL LSN開始接收增量.
/*-------------------------------------------------------------------------
2 * tablesync.c
3 * PostgreSQL logical replication
4 *
5 * Copyright (c) 2012-2016, PostgreSQL Global Development Group
6 *
7 * IDENTIFICATION
8 * src/backend/replication/logical/tablesync.c
9 *
10 * NOTES
11 * This file contains code for initial table data synchronization for
12 * logical replication.
13 *
14 * The initial data synchronization is done separately for each table,
15 * in separate apply worker that only fetches the initial snapshot data
16 * from the publisher and then synchronizes the position in stream with
17 * the main apply worker.
18 *
19 * The are several reasons for doing the synchronization this way:
20 * - It allows us to parallelize the initial data synchronization
21 * which lowers the time needed for it to happen.
22 * - The initial synchronization does not have to hold the xid and LSN
23 * for the time it takes to copy data of all tables, causing less
24 * bloat and lower disk consumption compared to doing the
25 * synchronization in single process for whole database.
26 * - It allows us to synchronize the tables added after the initial
27 * synchronization has finished.
28 *
29 * The stream position synchronization works in multiple steps.
30 * - Sync finishes copy and sets table state as SYNCWAIT and waits
31 * for state to change in a loop.
32 * - Apply periodically checks tables that are synchronizing for SYNCWAIT.
33 * When the desired state appears it will compare its position in the
34 * stream with the SYNCWAIT position and based on that changes the
35 * state to based on following rules:
36 * - if the apply is in front of the sync in the wal stream the new
37 * state is set to CATCHUP and apply loops until the sync process
38 * catches up to the same LSN as apply
39 * - if the sync is in front of the apply in the wal stream the new
40 * state is set to SYNCDONE
41 * - if both apply and sync are at the same position in the wal stream
42 * the state of the table is set to READY
43 * - If the state was set to CATCHUP sync will read the stream and
44 * apply changes until it catches up to the specified stream
45 * position and then sets state to READY and signals apply that it
46 * can stop waiting and exits, if the state was set to something
47 * else than CATCHUP the sync process will simply end.
48 * - If the state was set to SYNCDONE by apply, the apply will
49 * continue tracking the table until it reaches the SYNCDONE stream
50 * position at which point it sets state to READY and stops tracking.
51 *
52 * The catalog pg_subscription_rel is used to keep information about
53 * subscribed tables and their state and some transient state during
54 * data synchronization is kept in shared memory.
55 *
56 * Example flows look like this:
57 * - Apply is in front:
58 * sync:8
59 * -> set SYNCWAIT
60 * apply:10
61 * -> set CATCHUP
62 * -> enter wait-loop
63 * sync:10
64 * -> set READY
65 * -> exit
66 * apply:10
67 * -> exit wait-loop
68 * -> continue rep
69 * - Sync in front:
70 * sync:10
71 * -> set SYNCWAIT
72 * apply:8
73 * -> set SYNCDONE
74 * -> continue per-table filtering
75 * sync:10
76 * -> exit
77 * apply:10
78 * -> set READY
79 * -> stop per-table filtering
80 * -> continue rep
81 *-------------------------------------------------------------------------
82 */
83
這個patch的討論,詳見郵件組,本文末尾URL。
PostgreSQL社群的作風非常嚴謹,一個patch可能在郵件組中討論幾個月甚至幾年,根據大家的意見反覆的修正,patch合併到master已經非常成熟,所以PostgreSQL的穩定性也是遠近聞名的。
參考
相關文章
- PostgreSQL10.0preview功能增強-邏輯訂閱端控制引數解說SQLView
- PostgreSQL10.0preview功能增強-國際化功能增強,支援ICU(InternationalComponentsforUnicode)SQLViewUnicode
- PostgreSQL邏輯複製資料同步到kafkaSQLKafka
- PostgreSQL邏輯複製解密SQL解密
- PostgreSQL 邏輯複製解密SQL解密
- PostgreSQL10.0preview功能增強-後臺執行(pg_background)SQLView
- PostgreSQL10.0preview功能增強-增加ProcArrayGroupUpdate等待事件SQLView事件
- 邏輯複製主從搭建
- PostgreSQL10.0preview效能增強-支援64bitatomicSQLView
- 使用 Bulk Copy 將大量資料複製到資料庫資料庫
- PostgreSQL10.0preview功能增強-OLAP增強向量聚集索引(列儲存擴充套件)SQLView索引套件
- PostgreSQL10.0preview功能增強-CLOGoldestXID跟蹤SQLViewGo
- LightDB/Postgres邏輯複製的搭建
- 使用dbms_schema_copy 進行不同使用者間資料複製
- 基於Gor實現流量複製(加middleware功能增強)Go
- PostgreSQL10.0preview功能增強-JSON內容全文檢索SQLViewJSON
- MySQL · 功能分析 · 5.6 並行複製實現分析MySql並行
- PostgreSQL10.0preview功能增強-動態檢視pg_stat_activity新增資料庫管理程式資訊SQLView資料庫
- PostgreSQL10.0preview效能增強-hashindexmetapagecache、高併發增強SQLViewIndex
- PostgreSQL10.0preview功能增強-自由定義統計資訊維度SQLView
- mysql 並行複製原理MySql並行
- MySQL 5.7並行複製MySql並行
- MySQL 5.7 並行複製MySql並行
- 【AIX 命令學習】複製邏輯卷 cplvAI
- PostgreSQL10.0preview功能增強-觸發器函式內建中間表SQLView觸發器函式
- ASM FTP 功能複製 ASM資料庫ASMFTP資料庫
- MySQL增強(Loss-less)半同步複製MySql
- 【MySQL】半同步與增強半同步複製MySql
- HGDB企業版V6邏輯複製搭建
- PostgreSQL10.0preview功能增強-兩段式索引(約束欄位+附加欄位)SQLView索引
- PostgreSQL10.0preview效能增強-分割槽表效能增強(plan階段加速)SQLView
- vba之小功能記錄--資料複製
- [Mysql]Mysql5.7並行複製MySql並行
- mysql 5.7開啟並行複製MySql並行
- PostgreSQL10.0preview功能增強-序列隔離級別預加鎖閾值可控SQLView
- 【SqlServer】【Oracle】sql複製表定義及複製資料行SQLServerOracle
- 如何將SQL寫成複雜邏輯 和構造資料SQL
- 冷備_並行copy datafile並行