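Table of contents (mind map)
ClickHouse PaaS: a cloud-native multi-tenant platform (Altinity.Cloud)
PaaS architecture overview
Designing a high-performance, low-cost distributed middleware service is genuinely hard when it has to combine cloud-native orchestration, multi-cloud deployment, automated operations, elastic scaling, and self-healing with enterprise capabilities such as tenant isolation, permission management, and operation auditing.
Delivered to users as SaaS
Sentry Snuba: architecture overview of the event big-data analytics engine
Snuba is a service that provides a rich data model, a fast ingestion consumer, and a query optimizer on top of ClickHouse. It serves as the search and aggregation engine for Sentry event data.
Data is stored entirely in ClickHouse tables and materialized views. It is ingested through input streams (currently only Kafka topics) and can be queried either with point-in-time queries or with streaming queries (subscriptions).
Docs: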
Kubernetes ClickHouse Operator
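What is a Kubernetes Operator?
A Kubernetes Operator is a way to package, deploy, and manage a Kubernetes application. A Kubernetes application is one that is deployed on Kubernetes and managed through the Kubernetes API (application programming interface) and the kubectl tool.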
Altinity Operator for ClickHouse
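Altinity: a leading open-source provider of the ClickHouse Operator.
- Altinity: https://altinity.com/
- GitHub: https://github.com/Altinity/clickhouse-operator
- YouTube: https://www.youtube.com/@Altinity
Of course, companies and cloud vendors almost never open-source a multi-tenant, isolated ClickHouse middleware PaaS cloud platform like this.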
RadonDB ClickHouse
- https://github.com/radondb/radondb-clickhouse-operator
- https://github.com/radondb/radondb-clickhouse-kubernetes
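RadonDB ClickHouse was customized by a cloud vendor (QingCloud) from altinity-clickhouse-operator, with some optimizations for quickly deploying production clusters.
Helm + Operator: bringing a ClickHouse cluster to the cloud quickly
Cloud-native lab environment
- VKE K8S Cluster: a Vultr managed cluster (v1.23.14).
- KubeSphere v3.3.1: visual cluster management, a full-stack Kubernetes container cloud PaaS solution.
- Longhorn 1.14: cloud-native distributed block storage for Kubernetes.
Deploy clickhouse-operator
Here we use the Operator customized by RadonDB.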
values.operator.yaml
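Customize the following two parameters:
# the operator watches ClickHouse deployments in every namespace of the cluster
watchAllNamespaces: true
# enable metrics monitoring for the operator
enablePrometheusMonitor: true
- Deploy the operator with helm:
cd vip-k8s-paas/10-cloud-native-clickhouse
# deploy into kube-system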
helm install clickhouse-operator ./clickhouse-operator -f values.operator.yaml -n kube-system
kubectl -n kube-system get po | grep clickhouse-operator
# clickhouse-operator-6457c6dcdd-szgpd 1/1 Running 0 3m33s
kubectl -n kube-system get svc | grep clickhouse-operator
# clickhouse-operator-metrics ClusterIP 10.110.129.244 <none> 8888/TCP 4m18s
kubectl api-resources | grep clickhouse
# clickhouseinstallations chi clickhouse.radondb.com/v1 true ClickHouseInstallation
# clickhouseinstallationtemplates chit clickhouse.radondb.com/v1 true ClickHouseInstallationTemplate
# clickhouseoperatorconfigurations chopconf clickhouse.radondb.com/v1 true ClickHouseOperatorConfiguration
Deploy clickhouse-cluster
Here we use the clickhouse-cluster helm chart customized by RadonDB.
It quickly deploys a cluster with 2 shards, 2 replicas per shard, and 3 zk nodes.
values.cluster.yaml
Customizations:
clickhouse:
  clusterName: snuba-ck-nodes
  shardscount: 2
  replicascount: 2
  ...
zookeeper:
  install: true
  replicas: 3
- Deploy clickhouse-cluster with helm:
kubectl create ns cloud-clickhouse
helm install clickhouse ./clickhouse-cluster -f values.cluster.yaml -n cloud-clickhouse
kubectl get po -n cloud-clickhouse
# chi-clickhouse-snuba-ck-nodes-0-0-0 3/3 Running 5 (6m13s ago) 16m
# chi-clickhouse-snuba-ck-nodes-0-1-0 3/3 Running 1 (5m33s ago) 6m23s
# chi-clickhouse-snuba-ck-nodes-1-0-0 3/3 Running 1 (4m58s ago) 5m44s
# chi-clickhouse-snuba-ck-nodes-1-1-0 3/3 Running 1 (4m28s ago) 5m10s
# zk-clickhouse-0 1/1 Running 0 17m
# zk-clickhouse-1 1/1 Running 0 17m
# zk-clickhouse-2 1/1 Running 0 17m
Scale the ClickHouse sharded cluster quickly with the Operator
- Use the following command to change shardsCount to 3:
kubectl edit chi/clickhouse -n cloud-clickhouse
- Check the pods:
kubectl get po -n cloud-clickhouse
# NAME READY STATUS RESTARTS AGE
# chi-clickhouse-snuba-ck-nodes-0-0-0 3/3 Running 5 (24m ago) 34m
# chi-clickhouse-snuba-ck-nodes-0-1-0 3/3 Running 1 (23m ago) 24m
# chi-clickhouse-snuba-ck-nodes-1-0-0 3/3 Running 1 (22m ago) 23m
# chi-clickhouse-snuba-ck-nodes-1-1-0 3/3 Running 1 (22m ago) 23m
# chi-clickhouse-snuba-ck-nodes-2-0-0 3/3 Running 1 (108s ago) 2m33s
# chi-clickhouse-snuba-ck-nodes-2-1-0 3/3 Running 1 (72s ago) 119s
# zk-clickhouse-0 1/1 Running 0 35m
# zk-clickhouse-1 1/1 Running 0 35m
# zk-clickhouse-2 1/1 Running 0 35m
Two new pods appear: chi-clickhouse-snuba-ck-nodes-2-0-0 and chi-clickhouse-snuba-ck-nodes-2-1-0. The Operator created the new shard and its replicas automatically.
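The scale-out step can be pictured as a comparison between the pods the new spec calls for and the pods that already exist. The Python sketch below is only an illustration: the naming pattern `chi-<installation>-<cluster>-<shard>-<replica>-0` is inferred from the pod listing above, the function names are made up for this example, and the real Operator does far more than compute names.

```python
from __future__ import annotations


def expected_pods(chi: str, cluster: str, shards: int, replicas: int) -> list[str]:
    """Pod names for every (shard, replica) pair, using the pattern seen in the listing."""
    return [
        f"chi-{chi}-{cluster}-{shard}-{replica}-0"
        for shard in range(shards)
        for replica in range(replicas)
    ]


def pods_to_create(chi: str, cluster: str, old_shards: int, new_shards: int,
                   replicas: int) -> list[str]:
    """Pods present in the desired layout but missing from the current one."""
    existing = set(expected_pods(chi, cluster, old_shards, replicas))
    desired = expected_pods(chi, cluster, new_shards, replicas)
    return [name for name in desired if name not in existing]


if __name__ == "__main__":
    # Going from 2 to 3 shards with 2 replicas per shard.
    for name in pods_to_create("clickhouse", "snuba-ck-nodes", 2, 3, 2):
        print(name)
```

For the change from 2 to 3 shards with 2 replicas each, the sketch reports the same two pod names that showed up in the listing.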
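A quick trial
ReplicatedMergeTree + Distributed + ZooKeeper: building a multi-shard, multi-replica cluster
Connect to ClickHouse
We enter the pods and connect with the native command-line client, clickhouse-client.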
kubectl exec -it chi-clickhouse-snuba-ck-nodes-0-0-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-0-1-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-1-0-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-1-1-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-2-0-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-2-1-0 -n cloud-clickhouse -- bash
We open a terminal into each of these 6 pods and then test:
clickhouse-client --multiline -u username -h ip --password password
# clickhouse-client -m
Create a distributed database
- Inspect system.clusters:
select * from system.clusters;
- Create a database named test:
create database test on cluster 'snuba-ck-nodes';
# To drop it: drop database test on cluster 'snuba-ck-nodes';
- Check on every node that the test database now exists:
show databases;
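Create the local table (ReplicatedMergeTree)
- The CREATE TABLE statement:
Create the local table t_local in the test database on every node of the cluster. It uses the ReplicatedMergeTree table engine, which takes two parameters:
- zoo_path: the path of the table in ZooKeeper. Replicas of the same shard must use the same path, for example '/clickhouse/tables/{shard}/test/t_local'.
- replica_name: the name of the replica in ZooKeeper. It must differ between replicas of the same table.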
CREATE TABLE test.t_local on cluster 'snuba-ck-nodes'
(
EventDate DateTime,
CounterID UInt32,
UserID UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test/t_local', '{replica}')
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate, intHash32(UserID))
SAMPLE BY intHash32(UserID);
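- Macro (macros) placeholders:
The parameters of the CREATE TABLE statement contain macro placeholders (such as {replica}). They are replaced with the values from the macros section of the configuration file.
List the ConfigMaps of the ClickHouse shard and replica nodes in the cluster: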
kubectl get configmap -n cloud-clickhouse | grep clickhouse
NAME DATA AGE
chi-clickhouse-common-configd 6 20h
chi-clickhouse-common-usersd 6 20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-0-0 2 20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-0-1 2 20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-1-0 2 20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-1-1 2 20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-2-0 2 19h
chi-clickhouse-deploy-confd-snuba-ck-nodes-2-1 2 19h
Inspect the configuration values of one node:
kubectl describe configmap chi-clickhouse-deploy-confd-snuba-ck-nodes-0-0 -n cloud-clickhouse
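To see why every replica of a shard ends up on the same ZooKeeper path while still registering under its own name, the substitution can be imitated in a few lines of Python. This is a sketch of the effect only: ClickHouse performs the substitution itself on the server, and the macro values used here are hypothetical stand-ins, not values read from the ConfigMaps above.

```python
import re


def expand_macros(template: str, macros: dict) -> str:
    """Replace {name} placeholders with values from `macros`; unknown names are left as-is."""
    def substitute(match):
        return macros.get(match.group(1), match.group(0))
    return re.sub(r"\{(\w+)\}", substitute, template)


ZOO_PATH = "/clickhouse/tables/{shard}/test/t_local"
REPLICA_NAME = "{replica}"

# Hypothetical macro values for two replicas of shard 0 and one replica of shard 1.
replica_a = {"shard": "0", "replica": "node-0-0"}
replica_b = {"shard": "0", "replica": "node-0-1"}
other_shard = {"shard": "1", "replica": "node-1-0"}

if __name__ == "__main__":
    for macros in (replica_a, replica_b, other_shard):
        print(expand_macros(ZOO_PATH, macros), expand_macros(REPLICA_NAME, macros))
```

Replicas of the same shard share one path and differ only in replica name, which is what lets them find each other in ZooKeeper; a different shard gets a different path.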
Create the corresponding distributed table (Distributed)
CREATE TABLE test.t_dist on cluster 'snuba-ck-nodes'
(
EventDate DateTime,
CounterID UInt32,
UserID UInt32
)
ENGINE = Distributed('snuba-ck-nodes', test, t_local, rand());
# drop table test.t_dist on cluster 'snuba-ck-nodes';
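The Distributed engine takes four parameters here:
- cluster: the cluster name in the server configuration (snuba-ck-nodes)
- database: the remote database name (test)
- table: the remote table name (t_local)
- sharding_key: (optional) the sharding key (CounterID or rand())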
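How the sharding key steers a row can be sketched as follows. The ClickHouse documentation describes taking the remainder of the sharding expression divided by the total shard weight and picking the shard whose weight interval contains that remainder. The Python below is a simplified model of that rule: `pick_shard` is an illustrative name, and equal weights of 1 are assumed for the three shards.

```python
from __future__ import annotations


def pick_shard(key: int, weights: list[int]) -> int:
    """Return the index of the shard whose weight interval contains key % total_weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("shard weights must sum to a positive number")
    remainder = key % total
    upper_bound = 0
    for index, weight in enumerate(weights):
        upper_bound += weight
        if remainder < upper_bound:
            return index
    raise ValueError("unreachable when all weights are non-negative")


if __name__ == "__main__":
    weights = [1, 1, 1]  # three shards, equal weight (assumption)
    for counter_id in (1, 2, 3):
        print(counter_id, "->", pick_shard(counter_id, weights))
```

With rand() as the key the remainder is random, so rows spread roughly evenly across shards. With CounterID the placement is deterministic, so all rows of one counter land on the same shard.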
List the related tables, for example:
use test;
show tables;
# t_dist
# t_local
Insert a few rows through the distributed table:
# insert
INSERT INTO test.t_dist VALUES ('2022-12-16 00:00:00', 1, 1),('2023-01-01 00:00:00',2, 2),('2023-02-01 00:00:00',3, 3);
Query the data from any node:
select * from test.t_dist;
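Hands-on: providing ClickHouse PaaS for the Snuba engine
Breaking down and analyzing the Sentry Helm Charts
Before migrating to a Kubernetes Operator, let's first break down and analyze the clickhouse and zookeeper charts bundled with sentry-charts.
Unofficial Sentry Helm Charts:
Its Chart.yaml is as follows: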
apiVersion: v2
appVersion: 22.11.0
dependencies:
- condition: sourcemaps.enabled
  name: memcached
  repository: https://charts.bitnami.com/bitnami
  version: 6.1.5
- condition: redis.enabled
  name: redis
  repository: https://charts.bitnami.com/bitnami
  version: 16.12.1
- condition: kafka.enabled
  name: kafka
  repository: https://charts.bitnami.com/bitnami
  version: 16.3.2
- condition: clickhouse.enabled
  name: clickhouse
  repository: https://sentry-kubernetes.github.io/charts
  version: 3.2.0
- condition: zookeeper.enabled
  name: zookeeper
  repository: https://charts.bitnami.com/bitnami
  version: 9.0.0
- alias: rabbitmq
  condition: rabbitmq.enabled
  name: rabbitmq
  repository: https://charts.bitnami.com/bitnami
  version: 8.32.2
- condition: postgresql.enabled
  name: postgresql
  repository: https://charts.bitnami.com/bitnami
  version: 10.16.2
- condition: nginx.enabled
  name: nginx
  repository: https://charts.bitnami.com/bitnami
  version: 12.0.4
description: A Helm chart for Kubernetes
maintainers:
- name: sentry-kubernetes
name: sentry
type: application
version: 17.9.0
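This sentry-charts chart couples all the middleware helm charts together as dependencies and deploys them as one unit, which does not suit scaling out the Sentry microservices and the middleware clusters. A more advanced approach gives each middleware its own Kubernetes Operator (for example clickhouse-operator) and an independent K8S cluster, forming a middleware PaaS platform that serves other systems.
Here we split the middleware charts into separate namespaces, or separate clusters, for operations. The design is:
- ZooKeeper namespace: cloud-zookeeper-paas
- ClickHouse namespace: cloud-clickhouse-paas
Deploy the ZooKeeper Helm Chart independently
The zookeeper chart used here is bitnami/zookeeper. Its repositories are: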
- https://github.com/bitnami/charts/tree/master/bitnami/zookeeper
- https://github.com/bitnami/containers/tree/main/bitnami/zookeeper
- ZooKeeper Operator will be covered in a dedicated follow-up article.
- Create the namespace:
kubectl create ns cloud-zookeeper-paas
- Lightly customize values.yaml:
# expose the metrics service that Prometheus monitoring needs
metrics:
  containerPort: 9141
  enabled: true
....
....
service:
  annotations: {}
  clusterIP: ""
  disableBaseClientPort: false
  externalTrafficPolicy: Cluster
  extraPorts: []
  headless:
    annotations: {}
    publishNotReadyAddresses: true
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePorts:
    client: ""
    tls: ""
  ports:
    client: 2181
    election: 3888
    follower: 2888
    tls: 3181
  sessionAffinity: None
  type: ClusterIP
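Note: when using a cloud provider that supports external load balancers, set the Service type to "LoadBalancer" to provision a load balancer for the Service. Traffic from the external load balancer is directed to the backend Pods, although exactly how that works depends on the cloud provider.
- Deploy with helm: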
helm install zookeeper ./zookeeper -f values.yaml -n cloud-zookeeper-paas
Inside the cluster, the service is reachable at zookeeper.cloud-zookeeper-paas.svc.cluster.local:2181.
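That address follows the usual Kubernetes DNS pattern for Services, and the pods of a StatefulSet behind a headless Service get per-pod names in a similar form. A small Python sketch, assuming the default cluster domain cluster.local and the zookeeper / zookeeper-headless Service names used by this deployment (the function names are illustrative):

```python
def service_fqdn(service: str, namespace: str, domain: str = "cluster.local") -> str:
    """DNS name of a Service: <service>.<namespace>.svc.<cluster-domain>."""
    return f"{service}.{namespace}.svc.{domain}"


def pod_fqdn(statefulset: str, ordinal: int, headless_service: str,
             namespace: str, domain: str = "cluster.local") -> str:
    """DNS name of one StatefulSet pod behind a headless Service."""
    return f"{statefulset}-{ordinal}.{headless_service}.{namespace}.svc.{domain}"


if __name__ == "__main__":
    namespace = "cloud-zookeeper-paas"
    print(service_fqdn("zookeeper", namespace) + ":2181")
    for ordinal in range(3):
        print(pod_fqdn("zookeeper", ordinal, "zookeeper-headless", namespace))
```

The Service name load-balances across the ensemble, while the per-pod names address one specific ZooKeeper member.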
- Connect to ZooKeeper with zkCli:
export POD_NAME=$(kubectl get pods --namespace cloud-zookeeper-paas -l "app.kubernetes.io/name=zookeeper,app.kubernetes.io/instance=zookeeper,app.kubernetes.io/component=zookeeper" -o jsonpath="{.items[0].metadata.name}")
kubectl -n cloud-zookeeper-paas exec -it $POD_NAME -- zkCli.sh
# test
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] ls /zookeeper
[config, quota]
[zk: localhost:2181(CONNECTED) 2] quit
# external access
# kubectl port-forward --namespace cloud-zookeeper-paas svc/zookeeper 2181:2181 & zkCli.sh 127.0.0.1:2181
- View zoo.cfg:
kubectl -n cloud-zookeeper-paas exec -it $POD_NAME -- cat /opt/bitnami/zookeeper/conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/bitnami/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=0
## Metrics Providers
#
# https://prometheus.io Metrics Exporter
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
#metricsProvider.httpHost=0.0.0.0
metricsProvider.httpPort=9141
metricsProvider.exportJvmInfo=true
preAllocSize=65536
snapCount=100000
maxCnxns=0
reconfigEnabled=false
quorumListenOnAllIPs=false
4lw.commands.whitelist=srvr, mntr, ruok
maxSessionTimeout=40000
admin.serverPort=8080
admin.enableServer=true
server.1=zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.2=zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.3=zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
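The server.N entries at the end of zoo.cfg carry what a client configuration needs: the host, the quorum and election ports, and the client port after the semicolon. The Python sketch below parses that simple form and reshapes it into hostTemplate/index/port entries. The function names are hypothetical, the index value "clickhouse" is just a parameter, and richer server line variants (roles, explicit client addresses) are not handled.

```python
import re

# Matches the simple form: server.<id>=<host>:<quorum port>:<election port>;<client port>
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:;]+):(\d+):(\d+);(\d+)$")


def parse_servers(cfg_text: str) -> list:
    """Extract the ensemble members from zoo.cfg text, skipping every other line."""
    servers = []
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            server_id, host, quorum, election, client = match.groups()
            servers.append({
                "id": int(server_id),
                "host": host,
                "quorum_port": int(quorum),
                "election_port": int(election),
                "client_port": int(client),
            })
    return servers


def to_zookeeper_servers(servers: list, index: str = "clickhouse") -> list:
    """Reshape parsed members into hostTemplate/index/port entries (ports as strings)."""
    return [
        {"hostTemplate": s["host"], "index": index, "port": str(s["client_port"])}
        for s in servers
    ]


SAMPLE = """
clientPort=2181
server.1=zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.2=zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.3=zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
"""

if __name__ == "__main__":
    for entry in to_zookeeper_servers(parse_servers(SAMPLE)):
        print(entry)
```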
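Deploy the ClickHouse Helm Chart independently
The clickhouse chart used here is a version maintained by sentry-kubernetes/charts itself:
- Sentry Snuba currently does not support ClickHouse 21.x and later well, so the image version used here is yandex/clickhouse-server:20.8.19.4.
- https://github.com/sentry-kubernetes/charts/tree/develop/clickhouse
- ClickHouse Operator + ClickHouse Keeper will be covered in a dedicated follow-up article.
This bundled clickhouse chart has some problems: the Service template needs a small change so that "type: LoadBalancer" or "type: NodePort" can be configured.
Note: when using a cloud provider that supports external load balancers, set the Service type to "LoadBalancer" to provision a load balancer for the Service. Traffic from the external load balancer is directed to the backend Pods, although exactly how that works depends on the cloud provider.
- Create the namespace: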
kubectl create ns cloud-clickhouse-paas
- Lightly customize values.yaml:
Note the addresses of the 3 ZooKeeper instances in zoo.cfg above:
server.1=zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.2=zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.3=zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
# update zookeeper_servers
clickhouse:
configmap:
zookeeper_servers:
config:
- hostTemplate: 'zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local'
index: clickhouse
port: "2181"
- hostTemplate: 'zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local'
index: clickhouse
port: "2181"
- hostTemplate: 'zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local'
index: clickhouse
port: "2181"
enabled: true
operation_timeout_ms: "10000"
session_timeout_ms: "30000"
# expose the metrics service that Prometheus monitoring needs
metrics:
enabled: true
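A Headless Service is not required here, of course. Because this is internal access between namespaces of the same cluster, you can also simply point at the ClusterIP-type Service:
# update zookeeper_servers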
clickhouse:
configmap:
zookeeper_servers:
config:
- hostTemplate: 'zookeeper.cloud-zookeeper-paas.svc.cluster.local'
index: clickhouse
port: "2181"
enabled: true
operation_timeout_ms: "10000"
session_timeout_ms: "30000"
# expose the metrics service that Prometheus monitoring needs
metrics:
enabled: true
- Deploy with helm:
helm install clickhouse ./clickhouse -f values.yaml -n cloud-clickhouse-paas
- Connect to ClickHouse:
kubectl -n cloud-clickhouse-paas exec -it clickhouse-0 -- clickhouse-client --multiline --host="clickhouse-1.clickhouse-headless.cloud-clickhouse-paas"
- Verify the cluster:
show databases;
select * from system.clusters;
select * from system.zookeeper where path = '/clickhouse';
ConfigMaps of the current ClickHouse cluster
kubectl get configmap -n cloud-clickhouse-paas | grep clickhouse
clickhouse-config 1 28h
clickhouse-metrica 1 28h
clickhouse-users 1 28h
clickhouse-config (config.xml)
<yandex>
<path>/var/lib/clickhouse/</path>
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
<user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
<format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>
<include_from>/etc/clickhouse-server/metrica.d/metrica.xml</include_from>
<users_config>users.xml</users_config>
<display_name>clickhouse</display_name>
<listen_host>0.0.0.0</listen_host>
<http_port>8123</http_port>
<tcp_port>9000</tcp_port>
<interserver_http_port>9009</interserver_http_port>
<max_connections>4096</max_connections>
<keep_alive_timeout>3</keep_alive_timeout>
<max_concurrent_queries>100</max_concurrent_queries>
<uncompressed_cache_size>8589934592</uncompressed_cache_size>
<mark_cache_size>5368709120</mark_cache_size>
<timezone>UTC</timezone>
<umask>022</umask>
<mlock_executable>false</mlock_executable>
<remote_servers incl="clickhouse_remote_servers" optional="true" />
<zookeeper incl="zookeeper-servers" optional="true" />
<macros incl="macros" optional="true" />
<builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
<max_session_timeout>3600</max_session_timeout>
<default_session_timeout>60</default_session_timeout>
<disable_internal_dns_cache>1</disable_internal_dns_cache>
<query_log>
<database>system</database>
<table>query_log</table>
<partition_by>toYYYYMM(event_date)</partition_by>
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</query_log>
<query_thread_log>
<database>system</database>
<table>query_thread_log</table>
<partition_by>toYYYYMM(event_date)</partition_by>
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</query_thread_log>
<distributed_ddl>
<path>/clickhouse/task_queue/ddl</path>
</distributed_ddl>
<logger>
<level>trace</level>
<log>/var/log/clickhouse-server/clickhouse-server.log</log>
<errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
<size>1000M</size>
<count>10</count>
</logger>
</yandex>
clickhouse-metrica (metrica.xml)
<yandex>
<zookeeper-servers>
<node index="clickhouse">
<host>zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local</host>
<port>2181</port>
</node>
<node index="clickhouse">
<host>zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local</host>
<port>2181</port>
</node>
<node index="clickhouse">
<host>zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local</host>
<port>2181</port>
</node>
<session_timeout_ms>30000</session_timeout_ms>
<operation_timeout_ms>10000</operation_timeout_ms>
<root></root>
<identity></identity>
</zookeeper-servers>
<clickhouse_remote_servers>
<clickhouse>
<shard>
<replica>
<internal_replication>true</internal_replication>
<host>clickhouse-0.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
<port>9000</port>
<user>default</user>
<compression>true</compression>
</replica>
</shard>
<shard>
<replica>
<internal_replication>true</internal_replication>
<host>clickhouse-1.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
<port>9000</port>
<user>default</user>
<compression>true</compression>
</replica>
</shard>
<shard>
<replica>
<internal_replication>true</internal_replication>
<host>clickhouse-2.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
<port>9000</port>
<user>default</user>
<compression>true</compression>
</replica>
</shard>
</clickhouse>
</clickhouse_remote_servers>
<macros>
<replica from_env="HOSTNAME"></replica>
<shard from_env="SHARD"></shard>
</macros>
</yandex>
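The remote_servers part of metrica.xml is what defines the cluster name and its shard layout, so it is worth reading programmatically when checking a deployment. The sketch below uses Python's standard xml.etree module on a trimmed copy of the XML above (only the host and port elements are kept, and `cluster_layout` is an illustrative name):

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the clickhouse_remote_servers section shown above.
METRICA_XML = """
<yandex>
  <clickhouse_remote_servers>
    <clickhouse>
      <shard><replica>
        <host>clickhouse-0.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
        <port>9000</port>
      </replica></shard>
      <shard><replica>
        <host>clickhouse-1.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
        <port>9000</port>
      </replica></shard>
      <shard><replica>
        <host>clickhouse-2.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
        <port>9000</port>
      </replica></shard>
    </clickhouse>
  </clickhouse_remote_servers>
</yandex>
""".strip()


def cluster_layout(xml_text: str) -> dict:
    """Map each cluster name to a list of shards, each shard being a list of replica hosts."""
    root = ET.fromstring(xml_text)
    remote = root.find("clickhouse_remote_servers")
    if remote is None:
        return {}
    return {
        cluster.tag: [
            [replica.findtext("host") for replica in shard.findall("replica")]
            for shard in cluster.findall("shard")
        ]
        for cluster in remote
    }


if __name__ == "__main__":
    for name, shards in cluster_layout(METRICA_XML).items():
        print(name, "shards:", len(shards), "replicas per shard:", [len(s) for s in shards])
```

Read this way, the chart defines one cluster named clickhouse with three shards and a single replica per shard.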
clickhouse-users (users.xml)
<yandex>
</yandex>
Customizing the Sentry Helm Charts
Connecting to ClickHouse PaaS: single cluster, multiple nodes
We make a small change to values.yaml.
Disable clickhouse and zookeeper in sentry-charts:
clickhouse:
  enabled: false
zookeeper:
  enabled: false
Update externalClickhouse:
externalClickhouse:
  database: default
  host: "clickhouse.cloud-clickhouse-paas.svc.cluster.local"
  httpPort: 8123
  password: ""
  singleNode: false
  clusterName: "clickhouse"
  tcpPort: 9000
  username: default
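Notes:
- This is just a simple in-cluster connection to one multi-node sharded cluster. Snuba's design lets you connect multiple multi-node, multi-shard, multi-replica ClickHouse clusters and spread different schemas across them to reach very high throughput. Because this is internal access between namespaces of the same cluster, pointing at the ClusterIP-type Service is enough.
- singleNode must be set to false here because we run multiple nodes, and clusterName must be provided as well. Source code analysis: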
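These settings determine:
- which migrations run (local tables only, or local and distributed tables)
- differences in queries, for example whether the _local or the _dist table is selected
They also determine which ClickHouse table engines are used, and so on.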
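The effect of the singleNode flag can be illustrated with a toy model. The Python below is not Snuba's code: the `ClusterSettings` class and both helper functions are hypothetical, and they only mirror the two bullet points above (which tables migrations create, and which table a query targets).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterSettings:
    """Hypothetical stand-in for the singleNode / clusterName values."""
    single_node: bool
    cluster_name: Optional[str] = None


def migration_tables(base: str, settings: ClusterSettings) -> List[str]:
    """Single node: local table only. Multi node: local plus distributed table."""
    if settings.single_node:
        return [f"{base}_local"]
    if not settings.cluster_name:
        raise ValueError("a cluster name is required when single_node is false")
    return [f"{base}_local", f"{base}_dist"]


def query_table(base: str, settings: ClusterSettings) -> str:
    """Queries hit the local table on a single node and the distributed table otherwise."""
    return f"{base}_local" if settings.single_node else f"{base}_dist"


if __name__ == "__main__":
    multi = ClusterSettings(single_node=False, cluster_name="clickhouse")
    print(migration_tables("errors", multi), query_table("errors", multi))
```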
Of course, ClickHouse is a technical field in its own right, so we will not go deeper here.
Deploy
helm install sentry ./sentry -f values.yaml -n sentry
Verify the _local and _dist tables and system.zookeeper
kubectl -n cloud-clickhouse-paas exec -it clickhouse-0 -- clickhouse-client --multiline --host="clickhouse-1.clickhouse-headless.cloud-clickhouse-paas"
show databases;
show tables;
select * from system.zookeeper where path = '/clickhouse';
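Advanced topics and ultra-large-scale throughput
Connecting to a multi-cluster / multi-node / multi-shard / multi-replica ClickHouse middleware PaaS
Deploy several independent sets of VKE LoadBalancer + VKE K8S Cluster + ZooKeeper-Operator + ClickHouse-Operator, and spread the schemas across different clusters and multi-node shards.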
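Spreading schemas across clusters comes down to a lookup from a schema group to the cluster that owns it. The Python sketch below shows that idea only: the host names, cluster names, and schema group names are invented for illustration, and the structure is not taken from Snuba's actual settings.

```python
# Hypothetical cluster definitions: each cluster owns a disjoint set of schema groups.
CLUSTERS = [
    {
        "host": "clickhouse-a.example.internal",
        "cluster_name": "cluster_a",
        "schema_groups": {"events", "transactions"},
    },
    {
        "host": "clickhouse-b.example.internal",
        "cluster_name": "cluster_b",
        "schema_groups": {"metrics"},
    },
]


def cluster_for(schema_group: str, clusters: list = CLUSTERS) -> dict:
    """Return the single cluster that owns `schema_group`, or raise LookupError."""
    owners = [c for c in clusters if schema_group in c["schema_groups"]]
    if len(owners) != 1:
        raise LookupError(
            f"{schema_group!r} must belong to exactly one cluster, found {len(owners)}"
        )
    return owners[0]


if __name__ == "__main__":
    for group in ("events", "metrics"):
        owner = cluster_for(group)
        print(group, "->", owner["cluster_name"], owner["host"])
```

Requiring exactly one owner per group keeps writes and reads for a schema on the same cluster, so each cluster can be sized and scaled for its own workload.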
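Analyzing Snuba's system design
Read the test-case source code to understand the system design and advanced configuration
Snuba already takes read/write load balancing across the shards and replicas of a ClickHouse cluster, connection pooling, and related concerns into account, and optimizes for them, at both the system-design and the code level.
Advanced topics, such as multiple independent cloud-native clusters orchestrated by ClickHouse Operator and Snuba's system design, will be covered separately in the VIP column's live sessions.
More
- WeChat official account: 駭客下午茶 (live-session announcements)
- Cloud-native middleware PaaS in practice: https://k8s-paas.hacker-linner.com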