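How to Increase the Replication Factor of a Kafka Topic and Take Your Service to the Next Level

Posted by qcloud on 2019-01-08

This article was published by the Cloud+ Community (雲+社群).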

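1. The difficulty

When you create a topic, you can set the number of replicas with the --replication-factor option. Once the topic exists, however, the replica count can no longer be changed with the kafka-topics.sh command.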

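2. The solution

There is, in fact, an "alternative" approach: use the kafka-reassign-partitions.sh command to reassign all partitions of the topic, and list more replicas for each partition in the reassignment plan.

This article shows how to use kafka-reassign-partitions.sh to increase a topic's replication factor.

Note: replace the topic name and the ZooKeeper IP and port used in the commands below with the values for your own cluster.

(Assume the Kafka cluster has 4 brokers with ids 1001, 1002, 1003 and 1004.)

2.1 Look up how the topic's partitions are currently distributed across brokers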

[root@tbds bin]# ./kafka-topics.sh --zookeeper 172.16.32.13:2181 --topic ranger_audits --describe
Topic:ranger_audits     PartitionCount:10       ReplicationFactor:1     Configs:
        Topic: ranger_audits    Partition: 0    Leader: 1001    Replicas: 1001  Isr: 1001
        Topic: ranger_audits    Partition: 1    Leader: 1002    Replicas: 1002  Isr: 1002
        Topic: ranger_audits    Partition: 2    Leader: 1001    Replicas: 1001  Isr: 1001
        Topic: ranger_audits    Partition: 3    Leader: 1002    Replicas: 1002  Isr: 1002
        Topic: ranger_audits    Partition: 4    Leader: 1001    Replicas: 1001  Isr: 1001
        Topic: ranger_audits    Partition: 5    Leader: 1002    Replicas: 1002  Isr: 1002
        Topic: ranger_audits    Partition: 6    Leader: 1001    Replicas: 1001  Isr: 1001
        Topic: ranger_audits    Partition: 7    Leader: 1002    Replicas: 1002  Isr: 1002
        Topic: ranger_audits    Partition: 8    Leader: 1001    Replicas: 1001  Isr: 1001
        Topic: ranger_audits    Partition: 9    Leader: 1002    Replicas: 1002  Isr: 1002

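The output shows that the topic ranger_audits has 10 partitions, each with a single replica, and that these replicas sit on only two brokers, 1001 and 1002.

Next we raise every partition of ranger_audits to 2 replicas and spread them across all 4 brokers.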
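For topics with many partitions, reading the distribution by eye gets tedious. The following is a minimal sketch, assuming the --describe output has the format shown above (other Kafka versions may print it differently). The helper names parse_replicas and replicas_per_broker are hypothetical and not part of Kafka.

```python
import re
from collections import Counter

# Matches the per-partition lines of the --describe output shown above.
# The summary line ("PartitionCount:...") does not match and is skipped.
PARTITION_LINE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]+)"
)

def parse_replicas(describe_output):
    """Return {partition id: [replica broker ids]} from --describe text."""
    assignment = {}
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match:
            partition, _leader, replicas = match.groups()
            assignment[int(partition)] = [int(b) for b in replicas.split(",")]
    return assignment

def replicas_per_broker(assignment):
    """Count how many partition replicas each broker currently holds."""
    counts = Counter()
    for replicas in assignment.values():
        counts.update(replicas)
    return dict(counts)

# Shortened copy of the output above, used only for illustration.
DESCRIBE_SAMPLE = """\
Topic:ranger_audits     PartitionCount:10       ReplicationFactor:1     Configs:
        Topic: ranger_audits    Partition: 0    Leader: 1001    Replicas: 1001  Isr: 1001
        Topic: ranger_audits    Partition: 1    Leader: 1002    Replicas: 1002  Isr: 1002
"""

current_assignment = parse_replicas(DESCRIBE_SAMPLE)
print(replicas_per_broker(current_assignment))
```

Brokers that never appear in the result (1003 and 1004 here) hold no replica of the topic yet.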

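2.2 Create the configuration file that adds replicas

(Note: try to leave the existing leader replica of each partition unchanged. To do that, keep the first broker in each partition's replica list the same.)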

[root@tbds bin]# vim ../config/increase-replication-factor.json
{"version":1,
"partitions":[
{"topic":"ranger_audits","partition":0,"replicas":[1001,1003]},
{"topic":"ranger_audits","partition":1,"replicas":[1002,1004]},
{"topic":"ranger_audits","partition":2,"replicas":[1001,1003]},
{"topic":"ranger_audits","partition":3,"replicas":[1002,1004]},
{"topic":"ranger_audits","partition":4,"replicas":[1001,1003]},
{"topic":"ranger_audits","partition":5,"replicas":[1002,1004]},
{"topic":"ranger_audits","partition":6,"replicas":[1001,1003]},
{"topic":"ranger_audits","partition":7,"replicas":[1002,1004]},
{"topic":"ranger_audits","partition":8,"replicas":[1001,1003]},
{"topic":"ranger_audits","partition":9,"replicas":[1002,1004]}
]}

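This configuration file adds one replica to every partition of the topic. The broker holding each partition's original leader replica stays the same, and the new replicas are placed on brokers 1003 and 1004.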
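Writing this file by hand is error-prone when a topic has many partitions. The sketch below generates the same JSON structure. It assumes you already know the current assignment as a mapping from partition id to replica list. The function name add_replica_plan and the round-robin choice of the new broker are illustrative assumptions, not something Kafka prescribes.

```python
import json

def add_replica_plan(topic, current, new_brokers):
    """Build a reassignment plan that appends one replica to each partition.

    current:     {partition id: [broker ids]}, existing replicas in order
    new_brokers: brokers that may host the additional replica
    The existing replicas keep their positions, so the first (preferred
    leader) broker of every partition stays unchanged.
    """
    partitions = []
    for index, partition in enumerate(sorted(current)):
        replicas = list(current[partition])
        candidates = [b for b in new_brokers if b not in replicas]
        if not candidates:
            raise ValueError(f"no spare broker for partition {partition}")
        # Rotate through the candidates so new replicas are spread evenly.
        replicas.append(candidates[index % len(candidates)])
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}

# Current layout from section 2.1: even partitions on 1001, odd ones on 1002.
current_layout = {p: [1001 if p % 2 == 0 else 1002] for p in range(10)}
plan = add_replica_plan("ranger_audits", current_layout, [1003, 1004])
print(json.dumps(plan, indent=1))
```

For the layout in this article, the generated plan matches the hand-written file: partitions led by 1001 gain a replica on 1003, and those led by 1002 gain one on 1004.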

2.3 Execute the reassignment

[root@tbds bin]# ./kafka-reassign-partitions.sh --zookeeper 172.16.32.13:2181 --reassignment-json-file ../config/increase-replication-factor.json --execute
Current partition replica assignment
{"version":1,"partitions":[{"topic":"ranger_audits","partition":3,"replicas":[1002]},{"topic":"ranger_audits","partition":9,"replicas":[1002]},{"topic":"ranger_audits","partition":8,"replicas":[1001]},{"topic":"ranger_audits","partition":1,"replicas":[1002]},{"topic":"ranger_audits","partition":4,"replicas":[1001]},{"topic":"ranger_audits","partition":2,"replicas":[1001]},{"topic":"ranger_audits","partition":5,"replicas":[1002]},{"topic":"ranger_audits","partition":0,"replicas":[1001]},{"topic":"ranger_audits","partition":6,"replicas":[1001]},{"topic":"ranger_audits","partition":7,"replicas":[1002]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions 
{"version":1,"partitions":[{"topic":"ranger_audits","partition":0,"replicas":[1001,1003]},{"topic":"ranger_audits","partition":8,"replicas":[1001,1003]},{"topic":"ranger_audits","partition":5,"replicas":[1002,1004]},{"topic":"ranger_audits","partition":2,"replicas":[1001,1003]},{"topic":"ranger_audits","partition":9,"replicas":[1002,1004]},{"topic":"ranger_audits","partition":1,"replicas":[1002,1004]},{"topic":"ranger_audits","partition":3,"replicas":[1002,1004]},{"topic":"ranger_audits","partition":4,"replicas":[1001,1003]},{"topic":"ranger_audits","partition":7,"replicas":[1002,1004]},{"topic":"ranger_audits","partition":6,"replicas":[1001,1003]}]}
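The tool prints the previous assignment and advises saving it for rollback. One way to capture it is sketched below, assuming the --execute output has the layout shown above, with the JSON on the line after "Current partition replica assignment". The function name extract_rollback_plan and the sample text are hypothetical, and other Kafka versions may format this output differently.

```python
import json

def extract_rollback_plan(execute_output):
    """Return the pre-change assignment printed by --execute as a dict."""
    lines = execute_output.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("Current partition replica assignment"):
            # The JSON document is the first following line that starts with "{".
            for candidate in lines[i + 1:]:
                if candidate.strip().startswith("{"):
                    return json.loads(candidate)
    raise ValueError("no current assignment found in the output")

# Shortened sample in the same layout as the output above.
EXECUTE_SAMPLE = """\
Current partition replica assignment
{"version":1,"partitions":[{"topic":"ranger_audits","partition":0,"replicas":[1001]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions
{"version":1,"partitions":[{"topic":"ranger_audits","partition":0,"replicas":[1001,1003]}]}
"""

rollback = extract_rollback_plan(EXECUTE_SAMPLE)
print(json.dumps(rollback))
```

Writing the result to a file gives you something to pass back via --reassignment-json-file if the change has to be undone.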

2.4 Check the progress

[root@tbds bin]# ./kafka-reassign-partitions.sh --zookeeper 172.16.32.13:2181 --reassignment-json-file ../config/increase-replication-factor.json --verify
Status of partition reassignment:
Reassignment of partition [ranger_audits,0] completed successfully
Reassignment of partition [ranger_audits,8] completed successfully
Reassignment of partition [ranger_audits,5] completed successfully
Reassignment of partition [ranger_audits,2] completed successfully
Reassignment of partition [ranger_audits,9] completed successfully
Reassignment of partition [ranger_audits,1] completed successfully
Reassignment of partition [ranger_audits,3] completed successfully
Reassignment of partition [ranger_audits,4] completed successfully
Reassignment of partition [ranger_audits,7] completed successfully
Reassignment of partition [ranger_audits,6] completed successfully

The output above shows that the reassignment completed successfully for every partition.
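On a large topic the reassignment can take a while, so --verify may have to be run several times. The sketch below scans its output for partitions that are not finished yet. It assumes the line format shown above. The "is still in progress" wording in the sample is an assumption about what this Kafka version prints for unfinished partitions, and pending_partitions is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   Reassignment of partition [ranger_audits,0] completed successfully
STATUS_LINE = re.compile(
    r"Reassignment of partition \[([^,\]]+),(\d+)\] (.+)"
)

def pending_partitions(verify_output):
    """Return [(topic, partition, status)] for every partition not yet done."""
    pending = []
    for line in verify_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            topic, partition, status = match.groups()
            if status.strip() != "completed successfully":
                pending.append((topic, int(partition), status.strip()))
    return pending

# Illustrative sample; the second status text is an assumed wording.
VERIFY_SAMPLE = """\
Status of partition reassignment:
Reassignment of partition [ranger_audits,0] completed successfully
Reassignment of partition [ranger_audits,8] is still in progress
"""

print(pending_partitions(VERIFY_SAMPLE))
```

An empty result means every listed partition reported success, as in the run above.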

2.5 Describe the topic again

[root@tbds bin]# ./kafka-topics.sh --zookeeper 172.16.32.13:2181 --topic ranger_audits --describe
Topic:ranger_audits     PartitionCount:10       ReplicationFactor:2     Configs:
        Topic: ranger_audits    Partition: 0    Leader: 1001    Replicas: 1001,1003     Isr: 1001,1003
        Topic: ranger_audits    Partition: 1    Leader: 1002    Replicas: 1002,1004     Isr: 1002,1004
        Topic: ranger_audits    Partition: 2    Leader: 1001    Replicas: 1001,1003     Isr: 1001,1003
        Topic: ranger_audits    Partition: 3    Leader: 1002    Replicas: 1002,1004     Isr: 1002,1004
        Topic: ranger_audits    Partition: 4    Leader: 1001    Replicas: 1001,1003     Isr: 1001,1003
        Topic: ranger_audits    Partition: 5    Leader: 1002    Replicas: 1002,1004     Isr: 1002,1004
        Topic: ranger_audits    Partition: 6    Leader: 1001    Replicas: 1001,1003     Isr: 1001,1003
        Topic: ranger_audits    Partition: 7    Leader: 1002    Replicas: 1002,1004     Isr: 1002,1004
        Topic: ranger_audits    Partition: 8    Leader: 1001    Replicas: 1001,1003     Isr: 1001,1003
        Topic: ranger_audits    Partition: 9    Leader: 1002    Replicas: 1002,1004     Isr: 1002,1004

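As the output shows, the replication factor has been increased to 2: every partition now lists two replicas, and both of them are in the ISR.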
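This final check can also be scripted. The sketch below flags partitions whose replica count differs from the expected value or whose ISR does not contain every replica. It assumes the --describe format shown above. The function name replication_problems and the sample text are hypothetical.

```python
import re

# Splits a --describe line into "Name: value" pairs.
FIELD = re.compile(r"(\w+):\s*(\S+)")

def replication_problems(describe_output, expected_rf):
    """Return [(partition, reason)] for partitions that need attention."""
    problems = []
    for line in describe_output.splitlines():
        fields = dict(FIELD.findall(line))
        if "Partition" not in fields or "Replicas" not in fields:
            continue  # summary line or unrelated text
        partition = int(fields["Partition"])
        replicas = [int(b) for b in fields["Replicas"].split(",") if b]
        isr = {int(b) for b in fields.get("Isr", "").split(",") if b}
        if len(replicas) != expected_rf:
            problems.append((partition, "unexpected replica count"))
        elif set(replicas) != isr:
            problems.append((partition, "replicas missing from ISR"))
    return problems

# Illustrative sample: partition 1 has a replica that has not caught up.
AFTER_SAMPLE = """\
Topic:ranger_audits     PartitionCount:2        ReplicationFactor:2     Configs:
        Topic: ranger_audits    Partition: 0    Leader: 1001    Replicas: 1001,1003     Isr: 1001,1003
        Topic: ranger_audits    Partition: 1    Leader: 1002    Replicas: 1002,1004     Isr: 1002
"""

print(replication_problems(AFTER_SAMPLE, expected_rf=2))
```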

3. Further thoughts

Besides increasing a topic's replication factor, the method described above also covers the following scenarios:

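1. Migrating all partition data of a topic. Suppose the cluster has N brokers and is later expanded with M new ones. The disks of the new brokers are empty, while those of the original brokers are nearly full. With the method above, some of the topics stored on the original N brokers can be moved wholesale to the M new brokers, which evens out the data across the Kafka cluster.

Concretely: write a configuration file as in section 2.2 that assigns all partitions of the topic to the M new brokers, then run the command with --execute. This moves all of the topic's partition data to the newly added brokers.

*2. A broker fails.* Some partitions of some topics then have fewer replicas than before; kafka-reassign-partitions.sh can be used to add the missing replicas back.

*3. Some brokers' disks are nearly full while others are almost empty.* kafka-reassign-partitions.sh can move the partition data of selected topics to the brokers with low disk usage, which balances the data.

*4. The Kafka cluster is expanded.* Topic data on the original brokers needs to be moved to the new brokers so that the added capacity is put to use and the load is balanced.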
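The scenarios above all come down to generating a different reassignment file. The sketch below covers two of them: moving a whole topic onto a set of target brokers, and swapping a failed broker out of the replica lists. The function names migrate_plan and replace_broker_plan, the round-robin placement, and the broker id 1005 in the example are illustrative assumptions. Kafka does not require any particular placement strategy.

```python
import json

def migrate_plan(topic, current, target_brokers):
    """Place every partition on target_brokers, keeping its replica count."""
    n = len(target_brokers)
    partitions = []
    for partition in sorted(current):
        rf = len(current[partition])
        if rf > n:
            raise ValueError("not enough target brokers for the replica count")
        # Consecutive brokers, rotated by partition id, so that leaders and
        # followers are spread evenly and no broker repeats within a partition.
        replicas = [target_brokers[(partition + k) % n] for k in range(rf)]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}

def replace_broker_plan(topic, current, failed, candidates):
    """Swap a failed broker for a healthy one in every affected partition."""
    partitions = []
    for partition in sorted(current):
        replicas = current[partition]
        if failed not in replicas:
            continue  # partition unaffected, leave it out of the plan
        free = [b for b in candidates if b not in replicas]
        if not free:
            raise ValueError(f"no substitute broker for partition {partition}")
        substitute = free[partition % len(free)]
        new_replicas = [substitute if b == failed else b for b in replicas]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": new_replicas}
        )
    return {"version": 1, "partitions": partitions}

layout = {0: [1001, 1002], 1: [1002, 1001], 2: [1001, 1002]}
moved = migrate_plan("ranger_audits", layout, [1003, 1004, 1005])
repaired = replace_broker_plan(
    "ranger_audits", {0: [1001, 1003], 1: [1002, 1004]}, 1003, [1002, 1004]
)
print(json.dumps(moved))
print(json.dumps(repaired))
```

Either result can be saved as a JSON file and passed to kafka-reassign-partitions.sh with --reassignment-json-file and --execute, as in section 2.3.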

The author has authorized the Tencent Cloud+ Community to publish this article on all of its channels.
