Flink DataStream/API
Important features that did not change
Although the official guidance is to move off JDK 8 and use JDK 11+, JDK 8 is still supported.
Personal guess: the JDK 8 user base is simply too large; dropping it would ripple through everything, so keeping support avoids too big a jump and protects the project's own momentum.
Changes to dependency modules
Version changes
- flink.version : 1.12.6 => 1.15.4
- flink.connector.version : 1.12.6 => 1.15.4
- flink.connector.cdc.version : 1.3.0 => 2.3.0
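The corresponding Maven property bump, as a minimal sketch (property names follow the list above; adjust to how your project actually names them):
<properties>
    <flink.version>1.15.4</flink.version>
    <flink.connector.version>1.15.4</flink.connector.version>
    <flink.connector.cdc.version>2.3.0</flink.connector.cdc.version>
    <scala.version>2.12</scala.version>
</properties>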
Flink CDC : since flink cdc 2.0.0, the groupId and package path changed from com.alibaba.ververica
to com.ververica
- apache flink cdc 1.3.0
<dependency>
<groupId>com.alibaba.ververica</groupId>
<artifactId>flink-connector-mysql-cdc</artifactId>
<version>1.3.0</version>
</dependency>
- apache flink cdc 2.3.0
<dependency>
<groupId>com.ververica</groupId>
<artifactId>flink-connector-mysql-cdc</artifactId>
<version>2.3.0</version>
</dependency>
- For details, see:
- Flink CDC official site; and: Flink CDC / Flink / JDK / MySQL version compatibility table - cnblogs / 千千寰宇
The modules no longer carry a Scala suffix
For details, see:
https://github.com/apache/flink/blob/release-1.15.4/docs/content.zh/release-notes/flink-1.15.md [recommended]
https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/
- org.apache.flink:flink-clients:${flink.version}
- org.apache.flink:flink-streaming-java:${flink.version}
- org.apache.flink:flink-table-api-java-bridge:${flink.version}
  (was: org.apache.flink:flink-table-api-java-bridge_${scala.version}:${flink.version})
- org.apache.flink:flink-connector-kafka:${flink.version}
- org.apache.flink:flink-runtime-web:${flink.version}
- org.apache.flink:flink-statebackend-rocksdb:${flink.version}
- org.apache.flink:flink-table-planner:${flink.version}
  (was: org.apache.flink:flink-table-planner-blink_${scala.version}:${flink.version})
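In pom.xml this simply means dropping the _${scala.version} suffix from the artifactId. A minimal before/after sketch using the Kafka connector as an example (coordinates follow the list above):
<!-- flink 1.12.6 -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka_${scala.version}</artifactId>
    <version>1.12.6</version>
</dependency>
<!-- flink 1.15.4 : no Scala suffix -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka</artifactId>
    <version>${flink.connector.version}</version>
</dependency>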
table-*-blink modules promoted: flink-table-planner-blink => flink-table-planner, flink-table-runtime-blink => flink-table-runtime
- Starting with Flink 1.15, the distribution ships two planners:
  - flink-table-planner_2.12-${flink.version}.jar : located in /opt, contains the query planner
  - flink-table-planner-loader-${flink.version}.jar [recommended] : loaded by default from /lib, contains the query planner hidden behind an isolated classpath
- Note: these two planners cannot coexist on the classpath. If both end up loaded in /lib, table jobs fail with the error "Could not instantiate the executor. Make sure a planner module is on the classpath":
Exception in thread "main" org.apache.flink.table.api.TableException: Could not instantiate the executor. Make sure a planner module is on the classpath
at org.apache.flink.table.api.bridge.internal.AbstractStreamTableEnvironmentImpl.lookupExecutor(AbstractStreamTableEnvironmentImpl.java:108)
at org.apache.flink.table.api.bridge.java.internal.StreamTableEnvironmentImpl.create(StreamTableEnvironmentImpl.java:100)
at org.apache.flink.table.api.bridge.java.StreamTableEnvironment.create(StreamTableEnvironment.java:122)
at org.apache.flink.table.api.bridge.java.StreamTableEnvironment.create(StreamTableEnvironment.java:94)
at table.FlinkTableTest.main(FlinkTableTest.java:15)
Caused by: org.apache.flink.table.api.ValidationException: Multiple factories for identifier 'default' that implement 'org.apache.flink.table.delegation.ExecutorFactory' found in the classpath.
Ambiguous factory classes are:
org.apache.flink.table.planner.delegation.DefaultExecutorFactory
org.apache.flink.table.planner.loader.DelegateExecutorFactory
at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:553)
at org.apache.flink.table.api.bridge.internal.AbstractStreamTableEnvironmentImpl.lookupExecutor(AbstractStreamTableEnvironmentImpl.java:105)
... 4 more
Process finished with exit code 1
- Since Flink 1.14, the former flink-table-*-blink-* modules have been promoted, so:
  - flink-table-planner-blink => flink-table-planner
  - flink-table-runtime-blink => flink-table-runtime
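On the application side this means depending on exactly one planner. A minimal Maven sketch, assuming the project uses the Table API through flink-table-api-java-bridge as listed above (flink-table-planner-loader plus flink-table-runtime is the commonly recommended combination for 1.15; verify against your deployment):
<!-- [recommended] the loader, which hides the planner behind an isolated classpath -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner-loader</artifactId>
    <version>${flink.version}</version>
</dependency>
<!-- needed at runtime alongside the loader -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-runtime</artifactId>
    <version>${flink.version}</version>
</dependency>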
Scala 2.11 is no longer supported; Scala 2.12 is
scala.version = 2.12
flink.version = 1.15.4
- org.apache.flink:flink-connector-hive_${scala.version}:${flink.version}
- org.apache.flink:flink-table-api-java-bridge_${scala.version}:${flink.version}
  compared with flink 1.12.6: org.apache.flink:flink-table-api-java-bridge_${scala.version=2.11}:${flink.version=1.12.6}
flink-shaded-guava module: version change and package-conflict issues
- If the following error is reported, it is a package conflict caused by mismatched versions:
NoClassDefFoundError: org/apache/flink/shaded/guava30/com/google/common/collect/Lists
Cause: flink 1.16, 1.15 and 1.12.6 use different, mutually incompatible versions of flink-shaded-guava, so the flink-shaded-guava version pulled in by the CDC connector has to be adjusted.
- flink-shaded-guava version per Flink version:
  - flink 1.12.6 : flink-shaded-guava 18.0-12.0
  - flink 1.15.4 : flink-shaded-guava 30.1.1-jre-15.0
  - flink 1.16.0 : flink-shaded-guava 30.1.1-jre-16.0
- If the project does not explicitly depend on org.apache.flink:flink-shaded-guava, there is usually nothing to do: flink-core / flink-runtime / flink-clients and friends pull in the correct version by default.
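If the NoClassDefFoundError above does appear, one common workaround is to exclude flink-shaded-guava from the dependency that drags in the mismatched copy (often the CDC connector) and let the Flink modules supply the correct one. A hedged Maven sketch; run mvn dependency:tree first to confirm which dependency actually carries the offending version:
<dependency>
    <groupId>com.ververica</groupId>
    <artifactId>flink-connector-mysql-cdc</artifactId>
    <version>${flink.connector.cdc.version}</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-shaded-guava</artifactId>
        </exclusion>
    </exclusions>
</dependency>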
MySQL JDBC driver version requirement: ≥ 8.0.16 (flink 1.12.6 / flink cdc 1.3.0) => ≥ 8.0.27 (flink 1.15.4 / flink cdc 2.3.0)
- Version basis: the Apache Flink CDC official site
  - https://github.com/apache/flink-cdc/tree/release-1.3.0 | ≥ 8.0.16
  - https://github.com/apache/flink-cdc/tree/release-2.3.0 | ≥ 8.0.27
Regarding the error:
Caused by: java.lang.NoSuchMethodError: com.mysql.cj.CharsetMapping.getJavaEncodingForMysqlCharset(Ljava/lang/String;)Ljava/lang/String;
with MySQL 8.0, this is caused by the debezium connector shipped since flink cdc 2.1.
- Fix: upgrade the MySQL driver dependency to a version newer than 8.0.21 (8.0.32 shown below):
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.32</version>
</dependency>
Source-code adjustments in the application
Flink
KafkaRecordDeserializer : no longer exists / no longer supported (flink 1.13.0 and later); replace it with KafkaDeserializationSchema. The way KafkaSourceBuilder consumes this object also changes slightly.
org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializer | flink-connector-kafka_2.11 : 1.12.6
- flink 1.12.6 : KafkaRecordDeserializer exists
https://github.com/apache/flink/blob/release-1.12.6/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/deserializer/KafkaRecordDeserializer.java
- flink 1.12.7 : KafkaRecordDeserializer still exists / still supported
https://github.com/apache/flink/blob/release-1.12.7/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/deserializer/KafkaRecordDeserializer.java
- flink 1.13.0 : KafkaRecordDeserializer no longer exists / no longer supported
https://github.com/apache/flink/tree/release-1.13.0/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/deserializer
- flink 1.14.0
https://github.com/apache/flink/tree/release-1.14.0/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/deserializer
- flink 1.15.4
https://github.com/apache/flink/tree/release-1.15.4/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/deserializer/KafkaRecordDeserializationSchema.java
- flink-connector-kafka : 3.0.0 | for awareness only; this externalized connector repo does not change the points above
https://github.com/apache/flink-connector-kafka/blob/v3.0.0/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/deserializer/KafkaRecordDeserializationSchema.java
- Why and how to migrate
  As of Apache Flink 1.13.0, KafkaRecordDeserializer has been deprecated and removed.
  If you are coming from an older Flink version and your code still references KafkaRecordDeserializer, replace it with KafkaDeserializationSchema [recommended] or KafkaDeserializer.
  Compared with KafkaRecordDeserializer, KafkaDeserializationSchema has two extra methods that must be implemented:
  - boolean isEndOfStream(T var1) : returning false is sufficient
  - T deserialize(ConsumerRecord<byte[], byte[]> var1) : the old-style void deserialize(ConsumerRecord<byte[], byte[]> message, Collector<T> out) has a default implementation that internally delegates to this method
// flink 1.15.4
//org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema
package org.apache.flink.streaming.connectors.kafka;
import java.io.Serializable;
import org.apache.flink.annotation.PublicEvolving;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;
@PublicEvolving
public interface KafkaDeserializationSchema<T> extends Serializable, ResultTypeQueryable<T> {
default void open(DeserializationSchema.InitializationContext context) throws Exception {
}
boolean isEndOfStream(T var1);
T deserialize(ConsumerRecord<byte[], byte[]> var1) throws Exception; // method 1
default void deserialize(ConsumerRecord<byte[], byte[]> message, Collector<T> out) throws Exception { // method 2
T deserialized = this.deserialize(message); // reuses / calls method 1
if (deserialized != null) {
out.collect(deserialized);
}
}
}
So implementing the newly required T deserialize(ConsumerRecord<byte[], byte[]> var1) method is straightforward:
import com.xxx.StringUtils;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.TupleTypeInfo;
//import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializer;
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;
//public class MyKafkaRecordDeserializer implements KafkaRecordDeserializer<Tuple2<String, String>> {
public class MyKafkaRecordDeserializer implements KafkaDeserializationSchema<Tuple2<String, String>> {
/* @Override
public void open(DeserializationSchema.InitializationContext context) throws Exception {
KafkaDeserializationSchema.super.open(context);
}*/
@Override
public boolean isEndOfStream(Tuple2<String, String> stringStringTuple2) {
return false;
}
@Override
public Tuple2<String, String> deserialize(ConsumerRecord<byte[], byte[]> consumerRecord) throws Exception {// adapts new method 1 | mandatory
if(consumerRecord.key() == null){
return new Tuple2<>("null", StringUtils.bytesToHexString(consumerRecord.value()) );
}
return new Tuple2<>( new String(consumerRecord.key() ) , StringUtils.bytesToHexString(consumerRecord.value() ) );
}
// @Override
// public void deserialize(ConsumerRecord<byte[], byte[]> consumerRecord, Collector<Tuple2<String, String>> collector) throws Exception {// adapts old method 2 | optional
// collector.collect(new Tuple2<>(consumerRecord.key() == null ? "null" : new String(consumerRecord.key()), StringUtils.bytesToHexString(consumerRecord.value())));
// }
@Override
public TypeInformation<Tuple2<String, String>> getProducedType() {
return new TupleTypeInfo<>(BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);
}
}
The way this class is used when building the source also changes slightly:
// org.apache.flink.connector.kafka.source.KafkaSourceBuilder | flink-connector-kafka:1.15.4
KafkaSourceBuilder<Tuple2<String, String>> kafkaConsumerSourceBuilder = KafkaSource.<Tuple2<String, String>>builder()
.setTopics(canTopic)
.setProperties(kafkaConsumerProperties)
.setClientIdPrefix(Constants.JOB_NAME + "#" + System.currentTimeMillis() + "")
.setDeserializer( KafkaRecordDeserializationSchema.of(new MyKafkaRecordDeserializer()) ); // flink 1.15.4
//.setDeserializer(new MyKafkaRecordDeserializer());// flink 1.12.6
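Putting the pieces together, a minimal runnable sketch of the flink 1.15.4 style (the topic, bootstrap servers and group id are placeholder values; MyKafkaRecordDeserializer is the class shown above):
import java.util.Properties;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
public class KafkaSourceMigrationDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        Properties kafkaConsumerProperties = new Properties();
        kafkaConsumerProperties.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        kafkaConsumerProperties.setProperty("group.id", "demo-group");              // placeholder
        KafkaSource<Tuple2<String, String>> kafkaSource = KafkaSource.<Tuple2<String, String>>builder()
                .setTopics("demo-topic") // placeholder
                .setProperties(kafkaConsumerProperties)
                .setStartingOffsets(OffsetsInitializer.latest())
                // flink 1.15.4: wrap the KafkaDeserializationSchema via KafkaRecordDeserializationSchema.of(...)
                .setDeserializer(KafkaRecordDeserializationSchema.of(new MyKafkaRecordDeserializer()))
                .build();
        // the legacy env.addSource(new FlinkKafkaConsumer<>(...)) pattern becomes env.fromSource(...)
        env.fromSource(kafkaSource, WatermarkStrategy.noWatermarks(), "Kafka Source")
                .print();
        env.execute("Print Kafka Records");
    }
}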
- Recommended reading
- Flink 1.14: using the new KafkaSource and KafkaSink (custom deserializer, topic selector, serializer, partitioner) - CSDN
- https://nightlies.apache.org/flink/flink-docs-release-1.15/zh/docs/connectors/datastream/kafka/
Flink CDC : since flink cdc 2.0.0, the groupId and package path changed from com.alibaba.ververica to com.ververica
MySQLSource : package path changed (2.0.0 and later), class-name capitalization changed (flink cdc 2.0.0 and later), no longer recommended (flink cdc 2.1.0 and later)
com.alibaba.ververica.cdc.connectors.mysql.MySQLSource | flink cdc 1.3.0
https://github.com/apache/flink-cdc/blob/release-1.3.0/flink-connector-mysql-cdc/src/main/java/com/alibaba/ververica/cdc/connectors/mysql/MySQLSource.java
Package path and class-name capitalization changed:
https://github.com/apache/flink-cdc/blob/release-2.0.0/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/MySqlSource.java
com.ververica.cdc.connectors.mysql.MySqlSource is deprecated as of flink cdc 2.1.0, while com.ververica.cdc.connectors.mysql.source.MySqlSource is the recommended replacement
https://github.com/apache/flink-cdc/blob/release-2.1.0/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/MySqlSource.java
"Flink CDC: this MySqlSource is deprecated, is there another way?" - aliyun [recommended] : there are two MySqlSource classes, one deprecated and one usable, in different packages; the one under com.ververica.cdc.connectors.mysql.source is the usable one.
com.ververica.cdc.connectors.mysql.source.MySqlSource | flink cdc 2.3.0
https://github.com/apache/flink-cdc/blob/release-2.3.0/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/source/MySqlSource.java
serverId : if you switch to the new MySqlSource class, the way this parameter is set changes slightly
com.alibaba.ververica.cdc.connectors.mysql.MySQLSource#serverId() | flink cdc 1.3.0
https://github.com/apache/flink-cdc/blob/release-1.3.0/flink-connector-mysql-cdc/src/main/java/com/alibaba/ververica/cdc/connectors/mysql/MySQLSource.java
com.ververica.cdc.connectors.mysql.source.MySqlSource | flink cdc 2.1.0 , 2.3.0 [recommended]
https://github.com/apache/flink-cdc/blob/release-2.1.0/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/source/MySqlSource.java
The class itself has no serverId method; the serverId method lives on MySqlSourceBuilder, obtained via MySqlSource.<String>builder():
https://github.com/apache/flink-cdc/blob/release-2.1.0/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/source/MySqlSourceBuilder.java
/**
* A numeric ID or a numeric ID range of this database client, The numeric ID syntax is like
* '5400', the numeric ID range syntax is like '5400-5408', The numeric ID range syntax is
* required when 'scan.incremental.snapshot.enabled' enabled. Every ID must be unique across all
* currently-running database processes in the MySQL cluster. This connector joins the MySQL
* cluster as another server (with this unique ID) so it can read the binlog. By default, a
* random number is generated between 5400 and 6400, though we recommend setting an explicit
* value."
*/
public MySqlSourceBuilder<T> serverId(String serverId) {
this.configFactory.serverId(serverId);
return this;
}
com.ververica.cdc.connectors.mysql.MySqlSource#serverId(int serverId)
| flink cdc 2.1.0 [deprecated], flink cdc 2.3.0 [removed / unusable]
https://github.com/apache/flink-cdc/blob/release-2.1.0/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/MySqlSource.java
/**
* A numeric ID of this database client, which must be unique across all currently-running
* database processes in the MySQL cluster. This connector joins the MySQL database cluster
* as another server (with this unique ID) so it can read the binlog. By default, a random
* number is generated between 5400 and 6400, though we recommend setting an explicit value.
*/
public Builder<T> serverId(int serverId) {
this.serverId = serverId;
return this;
}
- Migration demo: flink cdc 1.3.0
SourceFunction<String> mySqlSource =
MySqlSource.<String>builder()
// database host
.hostname(jobParameterTool.get("cdc.mysql.hostname"))
// port
.port(Integer.parseInt(jobParameterTool.get("cdc.mysql.port")))
// username
.username(jobParameterTool.get("cdc.mysql.username"))
// password
.password(jobParameterTool.get("cdc.mysql.password"))
// databases to capture
.databaseList(jobParameterTool.get("cdc.mysql.databaseList"))
// tables to capture, format: database.table
.tableList(jobParameterTool.get("cdc.mysql.tableList"))
// deserializer
.deserializer(new MySQLCdcMessageDeserializationSchema())
// server time zone
.serverTimeZone("UTC")
.serverId( randomServerId(5000, Constants.JOB_NAME + "#xxxConfig") )
.startupOptions(StartupOptions.latest())
.build();
public static Integer randomServerId(int interval, String jobCdcConfigDescription){
//startServerId ∈[ interval + 0, interval + interval)
//int serverId = RANDOM.nextInt(interval) + interval; // RANDOM.nextInt(n) : generates a random integer in [0, n)
//serverId = [ 7000 + 0, Integer.MAX_VALUE - interval)
int serverId = RANDOM.nextInt(Integer.MAX_VALUE - interval - 7000) + 7000;
log.info("Success to generate random server id result! serverId : {}, interval : {}, jobCdcConfigDescription : {}"
, serverId , interval , jobCdcConfigDescription );
return serverId;
}
- Migration demo: flink cdc 2.3.0
MySqlSource<String> mySqlSource =
MySqlSource.<String>builder()
// database host
.hostname(jobParameterTool.get("cdc.mysql.hostname"))
// port
.port(Integer.parseInt(jobParameterTool.get("cdc.mysql.port")))
// username
.username(jobParameterTool.get("cdc.mysql.username"))
// password
.password(jobParameterTool.get("cdc.mysql.password"))
// databases to capture
.databaseList(jobParameterTool.get("cdc.mysql.databaseList"))
// tables to capture, format: database.table
.tableList(jobParameterTool.get("cdc.mysql.tableList"))
// deserializer
.deserializer(new MySQLCdcMessageDeserializationSchema())
// server time zone
.serverTimeZone("UTC")
.serverId( randomServerIdRange(5000, Constants.JOB_NAME + "#xxxConfig") )
.startupOptions(StartupOptions.latest())
.build();
// new hard requirement: interval >= the parallelism of this source operator
public static String randomServerIdRange(int interval, String jobCdcConfigDescription){
// generate a random start value
//startServerId = [interval + 0, interval + interval )
//int startServerId = RANDOM.nextInt(interval) + interval; // RANDOM.nextInt(n) : generates a random integer in [0, n)
//startServerId = [ 7000 + 0, Integer.MAX_VALUE - interval)
int startServerId = RANDOM.nextInt(Integer.MAX_VALUE - interval - 7000) + 7000;
//endServerId ∈ [startServerId, startServerId + interval];
int endServerId = startServerId + interval;
log.info("Success to generate random server id result! startServerId : {},endServerId : {}, interval : {}, jobCdcConfigDescription : {}"
, startServerId, endServerId , interval , jobCdcConfigDescription );
return String.format("%d-%d", startServerId, endServerId);
}
MySQLSourceBuilder#build method : the return type changes: SourceFunction / DebeziumSourceFunction<T> => MySqlSource<T>
(org.apache.flink.streaming.api.functions.source.SourceFunction => com.ververica.cdc.connectors.mysql.source.MySqlSource)
//com.alibaba.ververica.cdc.connectors.mysql.MySQLSource.Builder#build | flink cdc 1.3.0
// returns: com.alibaba.ververica.cdc.debezium.DebeziumSourceFunction
// public class DebeziumSourceFunction<T> extends RichSourceFunction<T> implements CheckpointedFunction, CheckpointListener, ResultTypeQueryable<T>
//public abstract class org.apache.flink.streaming.api.functions.source.RichSourceFunction<OUT> extends AbstractRichFunction implements SourceFunction<OUT>
public DebeziumSourceFunction<T> build() {
Properties props = new Properties();
props.setProperty("connector.class", MySqlConnector.class.getCanonicalName());
props.setProperty("database.server.name", "mysql_binlog_source");
props.setProperty("database.hostname", (String)Preconditions.checkNotNull(this.hostname));
props.setProperty("database.user", (String)Preconditions.checkNotNull(this.username));
props.setProperty("database.password", (String)Preconditions.checkNotNull(this.password));
props.setProperty("database.port", String.valueOf(this.port));
props.setProperty("database.history.skip.unparseable.ddl", String.valueOf(true));
if (this.serverId != null) {
props.setProperty("database.server.id", String.valueOf(this.serverId));
}
...
}
//com.ververica.cdc.connectors.mysql.source.MySqlSourceBuilder#build | flink cdc 2.3.0
// returns: com.ververica.cdc.connectors.mysql.source.MySqlSource
public MySqlSource<T> build() {
return new MySqlSource(this.configFactory, (DebeziumDeserializationSchema)Preconditions.checkNotNull(this.deserializer));
}
- Usage-change demo: Flink cdc 1.3.0
"How can mysqlSource listen for MySQL table schema changes (e.g. newly added columns)? Which setting?" - aliyun
Properties properties = new Properties();
properties.setProperty("database.hostname", "localhost");
properties.setProperty("database.port", "3306");
properties.setProperty("database.user", "your_username");
properties.setProperty("database.password", "your_password");
properties.setProperty("database.server.id", "1"); // 設定唯一的 server id
properties.setProperty("database.server.name", "mysql_source");
DebeziumSourceFunction<String> sourceFunction = MySQLSource.<String>builder()
.hostname("localhost")
.port(3306)
.username("your_username")
.password("your_password")
.databaseList("your_database")
.tableList("your_table")
.includeSchemaChanges(true) // enable listening for table schema changes
.deserializer(new StringDebeziumDeserializationSchema())
.build();
DataStreamSource<String> stream = env.addSource(sourceFunction);// addSource can be used here
stream.print();
env.execute("MySQL CDC Job");
- Usage-change demo: Flink cdc 2.3.0
https://flink-tpc-ds.github.io/flink-cdc-connectors/release-2.3/content/connectors/mysql-cdc(ZH).html
env.addSource(SourceFunction, String sourceName) can no longer be used; only env.fromSource(Source<OUT, ?, ?> source, WatermarkStrategy<OUT> timestampsAndWatermarks, String sourceName) is available
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
import com.ververica.cdc.connectors.mysql.source.MySqlSource;
public class MySqlSourceExample {
public static void main(String[] args) throws Exception {
MySqlSource<String> mySqlSource = MySqlSource.<String>builder()
.hostname("yourHostname")
.port(yourPort)
.databaseList("yourDatabaseName") // 設定捕獲的資料庫, 如果需要同步整個資料庫,請將 tableList 設定為 ".*".
.tableList("yourDatabaseName.yourTableName") // 設定捕獲的表
.username("yourUsername")
.password("yourPassword")
.deserializer(new JsonDebeziumDeserializationSchema()) // converts SourceRecord to a JSON string
.build();
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// enable checkpointing with a 3 s interval
env.enableCheckpointing(3000);
env
.fromSource(mySqlSource, WatermarkStrategy.noWatermarks(), "MySQL Source")
// set the parallelism of the source node to 4
.setParallelism(4)
.print().setParallelism(1); // set the parallelism of the sink node to 1
env.execute("Print MySQL Snapshot + Binlog");
}
}
StartupOptions : package path changed (flink cdc 2.0.0 and later)
import com.alibaba.ververica.cdc.connectors.mysql.table.StartupOptions | flink cdc 1.3.0
https://github.com/apache/flink-cdc/blob/release-1.3.0/flink-connector-mysql-cdc/src/main/java/com/alibaba/ververica/cdc/connectors/mysql/table/StartupOptions.java
com.ververica.cdc.connectors.mysql.table.StartupOptions | flink cdc 2.3.0
https://github.com/apache/flink-cdc/blob/release-2.3.0/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/table/StartupOptions.java
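A tiny sketch of the import change; the factory methods on StartupOptions themselves are unchanged (initial()/latest() shown here, which both versions expose):
// flink cdc 1.3.0: import com.alibaba.ververica.cdc.connectors.mysql.table.StartupOptions;
import com.ververica.cdc.connectors.mysql.table.StartupOptions; // flink cdc 2.3.0
public class StartupOptionsExample {
    public static void main(String[] args) {
        StartupOptions fromSnapshot = StartupOptions.initial(); // snapshot first, then binlog
        StartupOptions fromLatest = StartupOptions.latest();    // binlog only, from the latest offset
        System.out.println(fromSnapshot + " / " + fromLatest);
    }
}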
DebeziumDeserializationSchema : package path changed (flink cdc 2.0.0 and later)
com.alibaba.ververica.cdc.debezium.DebeziumDeserializationSchema | flink cdc 1.3.0
com.ververica:flink-connector-debezium:1.3.0
https://github.com/apache/flink-cdc/blob/release-1.3.0/flink-connector-debezium/src/main/java/com/alibaba/ververica/cdc/debezium/DebeziumDeserializationSchema.java
com.ververica.cdc.debezium.DebeziumDeserializationSchema | flink cdc 2.3.0
com.ververica:flink-connector-debezium:2.3.0
https://github.com/apache/flink-cdc/blob/release-2.3.0/flink-connector-debezium/src/main/java/com/ververica/cdc/debezium/DebeziumDeserializationSchema.java
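The demos above reference a custom MySQLCdcMessageDeserializationSchema without showing it. As a hedged illustration only (the class name and behavior here are hypothetical, not the original author's implementation), a minimal DebeziumDeserializationSchema that emits every change record as a string looks like this; when upgrading, only the import of the interface changes:
import com.ververica.cdc.debezium.DebeziumDeserializationSchema; // flink cdc 2.3.0 (was com.alibaba.ververica.cdc.debezium.DebeziumDeserializationSchema in 1.3.0)
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.util.Collector;
import org.apache.kafka.connect.source.SourceRecord;
public class SimpleCdcDeserializationSchema implements DebeziumDeserializationSchema<String> {
    @Override
    public void deserialize(SourceRecord record, Collector<String> out) {
        // emit the raw Debezium change record as its string form
        out.collect(record.toString());
    }
    @Override
    public TypeInformation<String> getProducedType() {
        return BasicTypeInfo.STRING_TYPE_INFO;
    }
}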
X References
- Summary of dependency issues when upgrading Flink + Flink CDC versions - CSDN
- apache flink cdc
- https://github.com/apache/flink-cdc
- https://github.com/apache/flink-cdc/blob/master/docs/content/docs/faq/faq.md
- https://github.com/apache/flink-cdc/tree/master/flink-cdc-connect/flink-cdc-source-connectors/flink-sql-connector-mysql-cdc
- com.alibaba.ververica:flink-connector-mysql-cdc:1.3.0
  https://github.com/apache/flink-cdc/blob/release-1.3.0/flink-connector-mysql-cdc/pom.xml [recommended] Flink 1.12.6
- com.ververica:flink-connector-mysql-cdc:2.0
  MySQL (Database: 5.7, 8.0.x / JDBC Driver: 8.0.16) | Flink 1.12+ | JDK 8+
  https://github.com/apache/flink-cdc/tree/release-2.0
  https://github.com/apache/flink-cdc/blob/release-2.0/flink-connector-mysql-cdc/pom.xml
- com.ververica:flink-connector-mysql-cdc:2.3.0
  https://github.com/apache/flink-cdc/blob/release-2.3.0/flink-connector-mysql-cdc/pom.xml [recommended] Flink 1.15.4
- org.apache.flink:flink-connector-mysql-cdc:${flink.cdc.version}
- https://ververica.github.io/flink-cdc-connectors/master/content/about.html [obsolete]
- apache flink
- https://github.com/apache/flink
- https://flink.apache.org
- apache flink-connector-kafka
- https://github.com/apache/flink-connector-kafka