pulsar sink clickhouse

Posted by cafebabe on 2021-02-03
  • Goal: sink the data of a Pulsar topic into ClickHouse

Create the sink table (in the pulsar_clickhouse_jdbc_sink database)

create table pulsar_clickhouse_jdbc_sink
(
    userId UInt64,
    score UInt8,
    EventDate Date
)
engine = MergeTree()
    PARTITION BY toYYYYMM(EventDate)
    ORDER BY (EventDate, intHash32(userId))
    SAMPLE BY intHash32(userId);

Create the ClickHouse sink config file

configs:
        userName: "default"
        password: ""
        jdbcUrl: "jdbc:clickhouse://localhost:8123/pulsar_clickhouse_jdbc_sink"
        tableName: "pulsar_clickhouse_jdbc_sink"
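
The jdbcUrl above has the layout jdbc:clickhouse://host:port/database, and in this setup the database and the table happen to share one name. A minimal sketch, using only the JDK, that splits the URL to make this explicit. The class JdbcUrlParts is a hypothetical helper written for this post; it is not part of Pulsar or of the ClickHouse JDBC driver.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Pulsar or the ClickHouse JDBC driver.
public class JdbcUrlParts {
    public final String host;
    public final int port;
    public final String database;

    private JdbcUrlParts(String host, int port, String database) {
        this.host = host;
        this.port = port;
        this.database = database;
    }

    // Splits a URL of the form jdbc:clickhouse://host:port/database.
    public static JdbcUrlParts parse(String jdbcUrl) {
        String prefix = "jdbc:";
        if (!jdbcUrl.startsWith(prefix)) {
            throw new IllegalArgumentException("not a JDBC URL: " + jdbcUrl);
        }
        // Drop the "jdbc:" prefix so java.net.URI can parse the rest.
        URI uri = URI.create(jdbcUrl.substring(prefix.length()));
        String path = uri.getPath() == null ? "" : uri.getPath();
        // The path segment after the port is the database name.
        String database = path.startsWith("/") ? path.substring(1) : path;
        return new JdbcUrlParts(uri.getHost(), uri.getPort(), database);
    }

    public static void main(String[] args) {
        JdbcUrlParts p = parse("jdbc:clickhouse://localhost:8123/pulsar_clickhouse_jdbc_sink");
        System.out.println("host=" + p.host + " port=" + p.port + " database=" + p.database);
    }
}
```

Port 8123 is the ClickHouse HTTP port, so the sink talks to ClickHouse over HTTP here, not over the native protocol.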

Create the sink connector on Pulsar

Start the ClickHouse sink locally

./pulsar-admin sinks localrun  \
--archive /root/apache-pulsar-2.7.0/connectors/pulsar-io-jdbc-clickhouse-2.7.0.nar   \
--tenant public  \
--namespace default \
--name sink-test-clickhouse  \
--inputs test_clickhouse  \
--sink-config-file  /root/apache-pulsar-2.7.0/bin/clickhouse.yaml 

If the sink type name cannot be found, download the matching connector from archive.apache.org/dist/pulsar/pul... and install it into the connectors directory under the Pulsar root directory.

Enable automatic schema registration

./pulsar-admin namespaces  set-is-allow-auto-update-schema --enable public/default

Publish messages to the Pulsar topic

Define the POJO

package com.example.demo.utils;

import lombok.Data;

import java.io.Serializable;
import java.util.Date;

@Data
public class UserEvent implements Serializable {
    Long userId;
    Integer score;
    Date EventDate;
}
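
The POJO uses Long and Integer, while the table columns are UInt64 and UInt8 and are not declared Nullable. A minimal sketch of a pre-send range check under that assumption. The class SinkRangeCheck and its methods are hypothetical and are not part of the original code or of Pulsar.

```java
// Hypothetical pre-send check for illustration; not part of the original code.
public class SinkRangeCheck {
    // ClickHouse UInt8 holds 0..255; null is rejected because the column is not Nullable.
    public static boolean fitsUInt8(Integer score) {
        return score != null && score >= 0 && score <= 255;
    }

    // UInt64 is unsigned, so a negative Java long does not fit; null is rejected as above.
    public static boolean fitsUInt64(Long userId) {
        return userId != null && userId >= 0;
    }

    public static void main(String[] args) {
        System.out.println(fitsUInt8(20));   // a score like the one sent in this post
        System.out.println(fitsUInt8(300));  // too large for UInt8
        System.out.println(fitsUInt64(1L));
    }
}
```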

Send data to the topic and check whether it is sunk to ClickHouse

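// Imports needed by the snippet
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.impl.schema.JSONSchema;
import java.util.Calendar;

        // Body of a method that declares "throws PulsarClientException"
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();
        // JSON schema derived from the POJO; print its definition
        var ch = JSONSchema.of(UserEvent.class);
        System.out.println("Schema definition:");
        System.out.println(ch.getSchemaInfo().getSchemaDefinition());
        Producer<UserEvent> producer = client.newProducer(ch)
                .enableBatching(true)
                .topic("test_clickhouse")
                .create();

        UserEvent ue = new UserEvent();
        ue.setUserId(1L);
        ue.setScore(20);
        ue.setEventDate(Calendar.getInstance().getTime());
        producer.send(ue);

        // Release the producer and the client
        producer.close();
        client.close();

Query ClickHouse to check the result:

select * from pulsar_clickhouse_jdbc_sink.pulsar_clickhouse_jdbc_sink;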
SELECT *
FROM pulsar_clickhouse_jdbc_sink.pulsar_clickhouse_jdbc_sink

Query id: 635f4ae8-00c0-401c-914d-ab47ccc5a718

┌─userId─┬─score─┬──EventDate─┐
│      1 │    30 │ 1970-01-01 │
└────────┴───────┴────────────┘
┌─userId─┬─score─┬──EventDate─┐
│      1 │    20 │ 1970-01-01 │
└────────┴───────┴────────────┘

2 rows in set. Elapsed: 0.256 sec.

The date is still wrong (EventDate comes out as 1970-01-01). Leaving it like this for now; the Pulsar sink to ClickHouse smoke test is OK.
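
For context on the date problem: a ClickHouse Date is stored as a day count since 1970-01-01, so 1970-01-01 is its zero value. The cause was not tracked down in this post. As one possible direction, here is a minimal sketch that turns a java.util.Date into a calendar date and a day count with the JDK only. The class EventDateSketch is hypothetical, and whether sending either form through the JDBC sink fixes the column is untested.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.Date;

// Hypothetical helper for illustration; untested against the Pulsar JDBC sink.
public class EventDateSketch {
    // Calendar date of the instant in the given time zone (prints as yyyy-MM-dd).
    public static LocalDate toLocalDate(Date date, ZoneId zone) {
        return date.toInstant().atZone(zone).toLocalDate();
    }

    // Days since 1970-01-01, the unit in which a ClickHouse Date is stored.
    public static long toEpochDay(Date date, ZoneId zone) {
        return toLocalDate(date, zone).toEpochDay();
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(toLocalDate(now, ZoneOffset.UTC));
        System.out.println(toEpochDay(now, ZoneOffset.UTC));
    }
}
```

The zone is passed in explicitly because the same instant can fall on different calendar days depending on the time zone.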

This work is licensed under a CC license; reposts must credit the author and link to this article.
