Syncing PostgreSQL data to Kafka with logical replication
There are quite a few CDC (Change Data Capture) tools available. Some incremental data sync tools are trigger-based: triggers record database changes into a system table, the client buffers those changes and periodically pushes them to the receiver, which establishes a channel with the sender, periodically pulls new changes, and replays them on the target database. But triggers add overhead on the production database, which may be unacceptable. Among the other incremental sync tools, a large share is built on Kafka, for example Red Hat's open-source Debezium, which reads changes from the database log to implement incremental sync. Below is an example of syncing PostgreSQL data to Kafka.
1. Prepare the Java environment
[root@ysla bin]# yum install java-1.8.0-openjdk.x86_64
[root@ysla bin]# yum install java-1.8.0-openjdk-devel.x86_64
[root@ysla bin]# java -version
openjdk version "1.8.0_322"
OpenJDK Runtime Environment (build 1.8.0_322-b06)
OpenJDK 64-Bit Server VM (build 25.322-b06, mixed mode)
2. Start ZooKeeper
[root@ysla ~]# tar -xf apache-zookeeper-3.8.0-bin.tar.gz
[root@ysla ~]# cd apache-zookeeper-3.8.0-bin/
[root@ysla apache-zookeeper-3.8.0-bin]# ls
bin  conf  docs  lib  LICENSE.txt  NOTICE.txt  README.md  README_packaging.md
[root@ysla apache-zookeeper-3.8.0-bin]# cd conf/
[root@ysla conf]# ls
configuration.xsl  logback.xml  zoo_sample.cfg
[root@ysla conf]# cp zoo_sample.cfg zoo.cfg
[root@ysla bin]# pwd
/root/apache-zookeeper-3.8.0-bin/bin
[root@ysla bin]# ./zkServer.sh start
/bin/java
ZooKeeper JMX enabled by default
Using config: /root/apache-zookeeper-3.8.0-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@ysla bin]# jps
65408 Jps
65355 QuorumPeerMain
You can test the connection with zkCli.sh:
[root@ysla bin]# ./zkCli.sh -server 127.0.0.1:2181
3. Start Kafka
[root@ysla ~]# tar -xf kafka_2.13-3.1.0.tgz
[root@ysla ~]# cd kafka_2.13-3.1.0/
[root@ysla kafka_2.13-3.1.0]# ls
bin  config  libs  LICENSE  licenses  NOTICE  site-docs
Since ZooKeeper is also installed locally, server.properties needs no changes; everything goes through localhost.
[root@ysla bin]# ./kafka-server-start.sh ../config/server.properties &
[root@ysla bin]# jps
74677 Kafka
75227 Jps
65355 QuorumPeerMain
4. Install kafkacat
Install the dependencies:
[root@ysla kafkacat]# yum install -y librdkafka-devel
Build and install kafkacat:
[root@ysla ~]# git clone
Cloning into 'kafkacat'...
remote: Enumerating objects: 1550, done.
remote: Counting objects: 100% (280/280), done.
remote: Compressing objects: 100% (177/177), done.
remote: Total 1550 (delta 157), reused 195 (delta 93), pack-reused 1270
Receiving objects: 100% (1550/1550), 627.51 KiB | 221.00 KiB/s, done.
Resolving deltas: 100% (935/935), done.
[root@ysla ~]# cd kafkacat
[root@ysla kafkacat]# ./configure
[root@ysla kafkacat]# make -j 24
[root@ysla kafkacat]# make install -j 24
[root@ysla kafkacat]# kcat --help
Usage: kcat <options> [file1 file2 .. | topic1 topic2 ..]]
kcat - Apache Kafka producer and consumer tool
Copyright (c) 2014-2021, Magnus Edenhill
Version 1.7.1-2-g338ae3 (librdkafka 0.11.4 builtin.features=gzip,snappy,ssl,sasl,regex,lz4,sasl_gssapi,sasl_plain,sasl_scram,plugins)
General options:
  -C | -P | -L | -Q  Mode: Consume, Produce, Metadata List, Query mode
  -G <group-id>      Mode: High-level KafkaConsumer (Kafka >=0.9 balanced consumer groups)
                     Expects a list of topics to subscribe to
  -t <topic>         Topic to consume from, produce to, or list
  -p <partition>     Partition
  -b <brokers,..>    Bootstrap broker(s) (host[:port])
  -D <delim>         Message delimiter string: a-z | \r | \n | \t | \xNN ..
                     Default: \n
  -K <delim>         Key delimiter (same format as -D)
  -c <cnt>           Limit message count
  -m <seconds>       Metadata (et.al.) request timeout. This limits how long kcat will block
                     while waiting for initial metadata to be retrieved from the Kafka cluster.
                     It also sets the timeout for the producer's transaction commits, init, aborts, etc.
                     Default: 5 seconds.
5. Install wal2json in the database
I downloaded the wal2json_2_3 release tarball and built it from source:
[root@ysla local]# tar -xf wal2json-wal2json_2_3.tar.gz
[root@ysla wal2json-wal2json_2_3]# chown -R postgres: wal2json-wal2json_2_3/
[root@ysla local]# su - postgres
Last login: Wed Mar 23 15:52:23 CST 2022 on pts/1
[postgres@ysla ~]$ cd /usr/local/wal2json-wal2json_2_3/
[postgres@ysla wal2json-wal2json_2_3]$ ls
expected  LICENSE  Makefile  README.md  sql  wal2json.c  wal2json.vcxproj
[postgres@ysla local]$ export PATH=/opt/pg12/bin/pg_config:$PATH
[postgres@ysla local]$ make -j 24
make: *** No targets specified and no makefile found.  Stop.
[postgres@ysla local]$ ls
bin  etc  games  include  lib  lib64  libexec  psql  sbin  share  src  wal2json-wal2json_2_3  wal2json-wal2json_2_3.tar.gz
[postgres@ysla local]$ cd wal2json-wal2json_2_3/
[postgres@ysla wal2json-wal2json_2_3]$ make -j 24
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/pg12/include/postgresql/server -I/opt/pg12/include/postgresql/internal -D_GNU_SOURCE -c -o wal2json.o wal2json.c
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC wal2json.o -L/opt/pg12/lib -Wl,--as-needed -Wl,-rpath,'/opt/pg12/lib',--enable-new-dtags -shared -o wal2json.so
[postgres@ysla wal2json-wal2json_2_3]$ make install -j 24
/bin/mkdir -p '/opt/pg12/lib/postgresql'
/bin/install -c -m 755 wal2json.so '/opt/pg12/lib/postgresql/'
[postgres@ysla wal2json-wal2json_2_3]$ ls
expected  LICENSE  Makefile  README.md  sql  wal2json.c  wal2json.o  wal2json.so  wal2json.vcxproj
[postgres@ysla wal2json-wal2json_2_3]$ ll /opt/pg12/lib/postgresql/wal2json.so
-rwxr-xr-x 1 postgres dba 49224 Mar 23 16:01 /opt/pg12/lib/postgresql/wal2json.so
(Note: the first make failed because it was run in /usr/local instead of the source directory. Also, the PATH export above points at the pg_config binary itself; it should add the bin directory, i.e. export PATH=/opt/pg12/bin:$PATH, so the Makefile can find pg_config.)
Add shared_preload_libraries = 'wal2json.so' to postgresql.conf, set wal_level = logical, and then restart the database.
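The two settings above are plain key = value lines appended to postgresql.conf. A minimal sketch, using a hypothetical data directory /tmp/demo_pgdata for illustration (substitute your real PGDATA):

```shell
# Hypothetical data directory for illustration; use your real PGDATA.
PGDATA=/tmp/demo_pgdata
mkdir -p "$PGDATA"
touch "$PGDATA/postgresql.conf"

# Append the two settings logical decoding with wal2json needs.
cat >> "$PGDATA/postgresql.conf" <<'EOF'
wal_level = logical
shared_preload_libraries = 'wal2json.so'
EOF

# Verify they are present before restarting the server.
grep -E "wal_level|shared_preload_libraries" "$PGDATA/postgresql.conf"
```

Remember that both settings only take effect after the server restart mentioned above.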
6. Generate test data and consume it
Create the replication slot:
[postgres@ysla ~]$ /opt/pg12/bin/pg_recvlogical -d postgres --slot slot1 --create-slot -P wal2json
[postgres@ysla ~]$ psql -c "select * from pg_replication_slots;"
 slot_name |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
 slot1     | wal2json | logical   |  13593 | postgres | f         | f      |            |      |          855 | 0/2A03FE20  | 0/2A03FE58
(1 row)
For convenience I use the local postgres user here. If you need a dedicated user, create one with CREATE ROLE uysl WITH REPLICATION PASSWORD 'QAZqaz123' LOGIN; and edit pg_hba.conf so that this user can reach the database locally or remotely.
Create a test table:
CREATE TABLE test_table (
    id char(10) NOT NULL,
    code char(10),
    PRIMARY KEY (id)
);
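Once data is written to test_table, wal2json emits each transaction as a JSON document; with the default format (version 1), an INSERT comes out roughly as a "change" array whose entries carry the kind, schema, table, and column values. A sketch of how a consumer might pull fields out of such a message (the payload below is illustrative, not captured output):

```shell
# A sample wal2json (format version 1) message, roughly the shape an
# INSERT into test_table produces; the exact payload is illustrative.
msg='{"change":[{"kind":"insert","schema":"public","table":"test_table","columnnames":["id","code"],"columnvalues":["1","hello"]}]}'

# A consumer can pick out the operation and target table with basic tools:
kind=$(echo "$msg" | sed -n 's/.*"kind":"\([^"]*\)".*/\1/p')
table=$(echo "$msg" | sed -n 's/.*"table":"\([^"]*\)".*/\1/p')
echo "$kind $table"
```

In practice a downstream replayer would parse the full JSON with a proper parser rather than sed; this only shows the message shape.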
Open a terminal and produce the change stream into the local Kafka:
[root@ysla ~]# /opt/pg12/bin/pg_recvlogical -h localhost -p 6000 -U postgres -d postgres -S slot1 --start -f - | /root/kafkacat/kcat -b 172.20.10.3:9092 -t testdb_topic
% Auto-selecting Producer mode (use -P or -C to override)
Consume testdb_topic:
[root@ysla kafka_2.13-3.1.0]# pwd
/root/kafka_2.13-3.1.0
[root@ysla kafka_2.13-3.1.0]# bin/kafka-console-consumer.sh --topic testdb_topic --bootstrap-server 172.20.10.3:9092 --from-beginning
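The console consumer prints every message on the topic, but a downstream replayer usually only cares about certain tables. A minimal filtering sketch over saved consumer output (the two payload lines are illustrative samples, not captured output):

```shell
# Two illustrative wal2json messages, as a consumer of testdb_topic
# might dump them to a file.
printf '%s\n' \
  '{"change":[{"kind":"insert","schema":"public","table":"test_table","columnvalues":["1","a"]}]}' \
  '{"change":[{"kind":"delete","schema":"public","table":"other_table"}]}' > /tmp/changes.log

# Keep only the changes that touch test_table.
grep '"table":"test_table"' /tmp/changes.log
```

The same filter could be placed directly in the consumer pipeline, e.g. piping kafka-console-consumer.sh through grep.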
From the ITPUB blog. Link: http://blog.itpub.net/69990629/viewspace-2885184/. Please credit the source when reposting; otherwise legal liability may be pursued.