Syncing PostgreSQL data to Kafka with logical replication
There are quite a few CDC (Change Data Capture) tools available today. Some incremental sync tools are trigger-based: triggers record every change into a system table, the client buffers those changes and periodically pushes them to the receiver, which establishes a channel with the sender, periodically pulls new changes, and replays them in the target database. The performance impact that triggers impose on a production system, however, may be unacceptable. Another large class of incremental sync tools is built on Kafka, for example Red Hat's open-source Debezium, which captures changes by extracting the database log. Below is an example of syncing PostgreSQL data to Kafka.
1. Prepare the Java environment
[root@ysla bin]# yum install java-1.8.0-openjdk.x86_64
[root@ysla bin]# yum install java-1.8.0-openjdk-devel.x86_64
[root@ysla bin]# java -version
openjdk version "1.8.0_322"
OpenJDK Runtime Environment (build 1.8.0_322-b06)
OpenJDK 64-Bit Server VM (build 25.322-b06, mixed mode)
2. Start ZooKeeper
[root@ysla ~]# tar -xf apache-zookeeper-3.8.0-bin.tar.gz
[root@ysla ~]# cd apache-zookeeper-3.8.0-bin/
[root@ysla apache-zookeeper-3.8.0-bin]# ls
bin  conf  docs  lib  LICENSE.txt  NOTICE.txt  README.md  README_packaging.md
[root@ysla apache-zookeeper-3.8.0-bin]# cd conf/
[root@ysla conf]# ls
configuration.xsl  logback.xml  zoo_sample.cfg
[root@ysla conf]# cp zoo_sample.cfg zoo.cfg
[root@ysla bin]# pwd
/root/apache-zookeeper-3.8.0-bin/bin
[root@ysla bin]# ./zkServer.sh start
/bin/java
ZooKeeper JMX enabled by default
Using config: /root/apache-zookeeper-3.8.0-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@ysla bin]# jps
65408 Jps
65355 QuorumPeerMain
You can test the connection with zkCli.sh:
[root@ysla bin]# ./zkCli.sh -server 127.0.0.1:2181
3. Start Kafka
[root@ysla ~]# tar -xf kafka_2.13-3.1.0.tgz
[root@ysla ~]# cd kafka_2.13-3.1.0/
[root@ysla kafka_2.13-3.1.0]# ls
bin  config  libs  LICENSE  licenses  NOTICE  site-docs
Since ZooKeeper is also installed on this host, server.properties needs no changes; it connects over localhost.
[root@ysla bin]# ./kafka-server-start.sh ../config/server.properties &
[root@ysla bin]# jps
74677 Kafka
75227 Jps
65355 QuorumPeerMain
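Before producing, you can create the topic that the later steps use. Many broker configurations auto-create topics on first write, so this step may be optional; the partition and replication counts below are minimal single-node assumptions.

```shell
# Create the topic used later in this walkthrough (run from the Kafka
# install directory). Counts are minimal assumptions for one broker.
bin/kafka-topics.sh --create --topic testdb_topic \
  --bootstrap-server localhost:9092 \
  --partitions 1 --replication-factor 1

# Confirm the topic exists
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
```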
4. Install kafkacat
Install the dependency:
[root@ysla kafkacat]# yum install -y librdkafka-devel
Build and install kafkacat:
[root@ysla ~]# git clone
Cloning into 'kafkacat'...
remote: Enumerating objects: 1550, done.
remote: Counting objects: 100% (280/280), done.
remote: Compressing objects: 100% (177/177), done.
remote: Total 1550 (delta 157), reused 195 (delta 93), pack-reused 1270
Receiving objects: 100% (1550/1550), 627.51 KiB | 221.00 KiB/s, done.
Resolving deltas: 100% (935/935), done.
[root@ysla ~]# cd kafkacat
[root@ysla kafkacat]# ./configure
[root@ysla kafkacat]# make -j 24
[root@ysla kafkacat]# make install -j 24
[root@ysla kafkacat]# kcat --help
Usage: kcat <options> [file1 file2 .. | topic1 topic2 ..]]
kcat - Apache Kafka producer and consumer tool
Copyright (c) 2014-2021, Magnus Edenhill
Version 1.7.1-2-g338ae3 (librdkafka 0.11.4 builtin.features=gzip,snappy,ssl,sasl,regex,lz4,sasl_gssapi,sasl_plain,sasl_scram,plugins)

General options:
  -C | -P | -L | -Q  Mode: Consume, Produce, Metadata List, Query mode
  -G <group-id>      Mode: High-level KafkaConsumer (Kafka >=0.9 balanced consumer groups)
                     Expects a list of topics to subscribe to
  -t <topic>         Topic to consume from, produce to, or list
  -p <partition>     Partition
  -b <brokers,..>    Bootstrap broker(s) (host[:port])
  -D <delim>         Message delimiter string: a-z | \r | \n | \t | \xNN ..
                     Default: \n
  -K <delim>         Key delimiter (same format as -D)
  -c <cnt>           Limit message count
  -m <seconds>       Metadata (et.al.) request timeout. This limits how long kcat
                     will block while waiting for initial metadata to be retrieved
                     from the Kafka cluster. It also sets the timeout for the
                     producer's transaction commits, init, aborts, etc.
                     Default: 5 seconds.
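With kcat installed and the broker from the previous step running, a quick metadata query verifies that the two can talk (the broker address is an assumption; substitute your own):

```shell
# List brokers, topics, and partitions known to the local broker;
# a successful reply confirms kcat can reach Kafka.
kcat -L -b localhost:9092
```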
5. Install wal2json in the database
The download link I used was:
[root@ysla local]# tar -xf wal2json-wal2json_2_3.tar.gz
[root@ysla wal2json-wal2json_2_3]# chown -R postgres: wal2json-wal2json_2_3/
[root@ysla local]# su - postgres
Last login: Wed Mar 23 15:52:23 CST 2022 on pts/1
[postgres@ysla ~]$ cd /usr/local/wal2json-wal2json_2_3/
[postgres@ysla wal2json-wal2json_2_3]$ ls
expected  LICENSE  Makefile  README.md  sql  wal2json.c  wal2json.vcxproj
[postgres@ysla local]$ export PATH=/opt/pg12/bin:$PATH
[postgres@ysla local]$ make -j 24
make: *** No targets specified and no makefile found.  Stop.
[postgres@ysla local]$ ls
bin  etc  games  include  lib  lib64  libexec  psql  sbin  share  src  wal2json-wal2json_2_3  wal2json-wal2json_2_3.tar.gz
[postgres@ysla local]$ cd wal2json-wal2json_2_3/
[postgres@ysla wal2json-wal2json_2_3]$ make -j 24
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/pg12/include/postgresql/server -I/opt/pg12/include/postgresql/internal -D_GNU_SOURCE -c -o wal2json.o wal2json.c
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC wal2json.o -L/opt/pg12/lib -Wl,--as-needed -Wl,-rpath,'/opt/pg12/lib',--enable-new-dtags -shared -o wal2json.so
[postgres@ysla wal2json-wal2json_2_3]$ make install -j 24
/bin/mkdir -p '/opt/pg12/lib/postgresql'
/bin/install -c -m 755 wal2json.so '/opt/pg12/lib/postgresql/'
[postgres@ysla wal2json-wal2json_2_3]$ ls
expected  LICENSE  Makefile  README.md  sql  wal2json.c  wal2json.o  wal2json.so  wal2json.vcxproj
[postgres@ysla wal2json-wal2json_2_3]$ ll /opt/pg12/lib/postgresql/wal2json.so
-rwxr-xr-x 1 postgres dba 49224 Mar 23 16:01 /opt/pg12/lib/postgresql/wal2json.so
Add shared_preload_libraries = 'wal2json.so' to postgresql.conf, set wal_level = logical, and then restart the database.
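The two settings above look like this in postgresql.conf (a config fragment, not a complete file). Note that logical replication also relies on max_replication_slots and max_wal_senders, whose PostgreSQL 12 defaults of 10 are normally sufficient.

```ini
# postgresql.conf
shared_preload_libraries = 'wal2json.so'   # load the decoder plugin
wal_level = logical                        # required for logical decoding
```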
6. Generate test data and consume it
Create a replication slot:
[postgres@ysla ~]$ /opt/pg12/bin/pg_recvlogical -d postgres --slot slot1 --create-slot -P wal2json
[postgres@ysla ~]$ psql -c "select * from pg_replication_slots;"
 slot_name |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
 slot1     | wal2json | logical   |  13593 | postgres | f         | f      |            |      |          855 | 0/2A03FE20  | 0/2A03FE58
(1 row)
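Besides streaming with pg_recvlogical, you can inspect what the slot would emit directly from SQL without consuming it, which is handy for checking that wal2json decodes changes as expected:

```shell
# Peek at pending changes on slot1 without advancing the slot's position.
psql -c "SELECT data FROM pg_logical_slot_peek_changes('slot1', NULL, NULL);"
```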
For convenience I use the local postgres user here. If needed, create a dedicated user with CREATE ROLE uysl WITH REPLICATION PASSWORD 'QAZqaz123' LOGIN; and modify pg_hba.conf so that the user can access the database locally or remotely.
Create a test table:
CREATE TABLE test_table (
    id   char(10) NOT NULL,
    code char(10),
    PRIMARY KEY (id)
);
Open a terminal and produce data to the local Kafka:
[root@ysla ~]# /opt/pg12/bin/pg_recvlogical -h localhost -p 6000 -U postgres -d postgres -S slot1 --start -f - | /root/kafkacat/kcat -b 172.20.10.3:9092 -t testdb_topic
% Auto-selecting Producer mode (use -P or -C to override)
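With the producer pipeline running, a few ordinary DML statements are enough to generate change events (the values here are illustrative):

```shell
# Each committed transaction becomes one wal2json message on the slot.
psql -c "INSERT INTO test_table (id, code) VALUES ('1', 'a');"
psql -c "UPDATE test_table SET code = 'b' WHERE id = '1';"
psql -c "DELETE FROM test_table WHERE id = '1';"
```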
Consume testdb_topic:
[root@ysla kafka_2.13-3.1.0]# pwd
/root/kafka_2.13-3.1.0
[root@ysla kafka_2.13-3.1.0]# bin/kafka-console-consumer.sh --topic testdb_topic --bootstrap-server 172.20.10.3:9092 --from-beginning
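The messages arriving on testdb_topic are wal2json's JSON change documents. The sketch below parses a sample insert message in the shape wal2json (format version 1) produces for the test table above and extracts the operation and table name. The embedded JSON is an illustrative sample rather than captured output, and python3 availability is assumed.

```shell
python3 - <<'EOF'
import json

# Illustrative wal2json (format v1) message for an INSERT into test_table;
# char(10) values are blank-padded by PostgreSQL.
msg = '''{"change":[{"kind":"insert","schema":"public","table":"test_table",
"columnnames":["id","code"],
"columntypes":["character(10)","character(10)"],
"columnvalues":["1         ","a         "]}]}'''

for change in json.loads(msg)["change"]:
    print(change["kind"], change["schema"] + "." + change["table"])
EOF
```

A consumer feeding a downstream database would switch on the "kind" field (insert, update, delete) and replay each change accordingly.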
From the "ITPUB blog", link: http://blog.itpub.net/69990629/viewspace-2885184/. Please credit the source when republishing; unauthorized reuse may incur legal liability.