快速安裝 kafka 叢集

程式設計玩家發表於2022-04-02

原文網址 : https://www.cnblogs.com/Erik_Xu/p/16089974.html

Kafka

前言

最近因為工作原因，需要安裝一個 kafka 叢集，目前網路上有很多相關的教程，按著步驟來也能完成安裝，只是這些教程都顯得略微繁瑣。因此，我寫了這篇文章幫助大家快速完成 kafka 叢集安裝。

安裝步驟

準備多臺伺服器，數量建議為奇數（如：3，5，7 等），作業系統為 CentOS 7+。

這裡使用 3 臺伺服器作為例子，IP 分別為 192.168.1.1、192.168.1.2、192.168.1.3，修改下述指令碼檔案的 IP 地址，並拷貝到 3 臺伺服器上分別執行即可完成安裝。

#!/bin/bash

# Modify the link if you want to download other version
KAFKA_DOWNLOAD_URL="https://dlcdn.apache.org/kafka/3.1.0/kafka_2.13-3.1.0.tgz"

# Please use your own server ip
SERVERS=("192.168.1.1" "192.168.1.2" "192.168.1.3")


ID=0

MECHINE_IP=$(hostname -i)
echo "Mechine IP: "${MECHINE_IP}

LENGTH=${#SERVERS[@]}

for (( i=0; i<${LENGTH}; i++ ));
do
    if [ "${SERVERS[$i]}" = "${MECHINE_IP}" ]; then
        ID=$((i+1))
    fi
done

echo "ID: "${ID}

if [ "${ID}" -eq "0" ]; then
  echo "Mechine IP is not matched to server list"
  exit 1
fi

ZOOKEEPER_CONNECT=$(printf ",%s:2181" "${SERVERS[@]}")
ZOOKEEPER_CONNECT=${ZOOKEEPER_CONNECT:1}
echo "Zookeeper Connect: "${ZOOKEEPER_CONNECT}


echo "---------- Update yum ----------"
yum update -y
yum install -y wget


echo "---------- Install java ----------"
yum -y install java-1.8.0-openjdk
java -version


echo "---------- Create kafka user & group ----------"
groupadd -r kafka
useradd -g kafka -r kafka -s /bin/false


echo "---------- Download kafka ----------"
cd /opt
wget ${KAFKA_DOWNLOAD_URL} -O kafka.tgz
mkdir -p kafka
tar -xzf kafka.tgz -C kafka --strip-components=1
chown -R kafka:kafka /opt/kafka


echo "---------- Install and start zookeeper ----------"
mkdir -p /data/zookeeper
chown -R kafka:kafka /data/zookeeper
echo "${ID}" > /data/zookeeper/myid


# zookeeper config
# https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_configuration
cat <<EOF > /opt/kafka/config/zookeeper-cluster.properties
# the directory where the snapshot is stored.
dataDir=/data/zookeeper

# the port at which the clients will connect
clientPort=2181

# setting number of connections to unlimited
maxClientCnxns=0

# keeps a heartbeat of zookeeper in milliseconds
tickTime=2000

# time for initial synchronization
initLimit=10

# how many ticks can pass before timeout
syncLimit=5

# define servers ip and internal ports to zookeeper
EOF

for (( i=0; i<${LENGTH}; i++ ));
do
    INDEX=$((i+1))
    echo "server.${INDEX}=${SERVERS[$i]}:2888:3888" >> /opt/kafka/config/zookeeper-cluster.properties
done


# zookeeper.service
cat <<EOF > /usr/lib/systemd/system/zookeeper.service
[Unit]
Description=Apache Zookeeper server (Kafka)
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper-cluster.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start zookeeper && systemctl enable zookeeper


echo "---------- Install and start kafka ----------"
mkdir -p /data/kafka
chown -R kafka:kafka /data/kafka


# kafka config
# https://kafka.apache.org/documentation/#configuration
cat <<EOF > /opt/kafka/config/server-cluster.properties
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=${ID}

# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured.  Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
advertised.listeners=PLAINTEXT://${MECHINE_IP}:9092

# A comma separated list of directories under which to store log files
log.dirs=/data/kafka

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=${ZOOKEEPER_CONNECT}/kafka

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=60000
EOF


# kafka.service
cat <<EOF > /usr/lib/systemd/system/kafka.service
[Unit]
Description=Apache Kafka server (broker)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target remote-fs.target
After=network.target remote-fs.target kafka-zookeeper.service
 
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server-cluster.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
 
[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kafka && systemctl enable kafka

setup.sh

基本操作

# 啟動 zookeeper
systemctl start zookeeper

# 停止 zookeeper
systemctl stop zookeeper

# 重啟 zookeeper
systemctl restart zookeeper

# 檢視 zookeeper 日誌
systemctl status zookeeper -l

# 啟動 kafka
systemctl start kafka

# 停止 kafka
systemctl stop kafka

# 重啟 kafka
systemctl restart kafka

# 檢視 kafka 日誌
systemctl status kafka -l

簡單測試

# 進入 kafka bin 目錄
cd /opt/kafka/bin/

# 建立一個 topic
kafka-topics.sh --create --topic test --partitions 3 --replication-factor 1 --bootstrap-server localhost:9092

# 檢視 topic 描述
kafka-topics.sh --topic test --describe --bootstrap-server localhost:9092

# 啟動生產者然後輸入訊息
kafka-console-producer.sh --topic test --bootstrap-server localhost:9092

# 啟動消費者消費訊息
kafka-console-consumer.sh --topic test --from-beginning --bootstrap-server localhost:9092

# 刪除 topic
kafka-topics.sh --topic test --delete --bootstrap-server localhost:9092

指令碼說明

1. 以下程式碼主要指定下載 kafka 的版本以及伺服器 IP 列表，可根據實際情況進行調整。

# Modify the link if you want to download other version
KAFKA_DOWNLOAD_URL="https://dlcdn.apache.org/kafka/3.1.0/kafka_2.13-3.1.0.tgz"

# Please use your own server ip
SERVERS=("192.168.1.1" "192.168.1.2" "192.168.1.3")

2. 以下程式碼主要用於生成 zookeeper id 和 kafka broker id 以及拼接 kafka 配置中的 zookeeper 連線串，通過本機 IP 與填寫的 IP 列表進行匹配，如果本機 IP 等於第一個伺服器 IP，則 ID為 1，等於第二個伺服器 IP，則 ID為 2，等於第二個伺服器 IP，則 ID為 3，以此類推；本機 IP 不在填寫的 IP 列表中，則會退出安裝。

ID=0

MECHINE_IP=$(hostname -i)
echo "Mechine IP: "${MECHINE_IP}

LENGTH=${#SERVERS[@]}

for (( i=0; i<${LENGTH}; i++ ));
do
    if [ "${SERVERS[$i]}" = "${MECHINE_IP}" ]; then
        ID=$((i+1))
    fi
done

echo "ID: "${ID}

if [ "${ID}" -eq "0" ]; then
  echo "Mechine IP is not matched to server list"
  exit 1
fi

ZOOKEEPER_CONNECT=$(printf ",%s:2181" "${SERVERS[@]}")
ZOOKEEPER_CONNECT=${ZOOKEEPER_CONNECT:1}

3. 更新 yum 源，並安裝 wget 下載工具

yum update -y
yum install -y wget

4. 安裝 java 8

yum -y install java-1.8.0-openjdk
java -version

5. 建立 kafka 使用者及組

groupadd -r kafka
useradd -g kafka -r kafka -s /bin/false

6. 下載並解壓 kafka 可執行程式

cd /opt
wget ${KAFKA_DOWNLOAD_URL} -O kafka.tgz
mkdir -p kafka
tar -xzf kafka.tgz -C kafka --strip-components=1
chown -R kafka:kafka /opt/kafka

7. 建立 zookeeper 目錄，建立 zookeeper id

mkdir -p /data/zookeeper
chown -R kafka:kafka /data/zookeeper
echo "${ID}" > /data/zookeeper/myid

8. 生成 zookeeper 配置檔案，詳細說明可參考：https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_configuration

cat <<EOF > /opt/kafka/config/zookeeper-cluster.properties
# the directory where the snapshot is stored.
dataDir=/data/zookeeper

# the port at which the clients will connect
clientPort=2181

# setting number of connections to unlimited
maxClientCnxns=0

# keeps a heartbeat of zookeeper in milliseconds
tickTime=2000

# time for initial synchronization
initLimit=10

# how many ticks can pass before timeout
syncLimit=5

# define servers ip and internal ports to zookeeper
EOF

for (( i=0; i<${LENGTH}; i++ ));
do
    INDEX=$((i+1))
    echo "server.${INDEX}=${SERVERS[$i]}:2888:3888" >> /opt/kafka/config/zookeeper-cluster.properties
done

9. 建立 zookeeper systemd 管理檔案，啟動並設定開機啟動 zookeeper

cat <<EOF > /usr/lib/systemd/system/zookeeper.service
[Unit]
Description=Apache Zookeeper server (Kafka)
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper-cluster.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start zookeeper && systemctl enable zookeeper

10. 建立 kafka 目錄

mkdir -p /data/kafka
chown -R kafka:kafka /data/kafka

11. 生成 kafka 配置檔案，詳細說明可參考：https://kafka.apache.org/documentation/#configuration

cat <<EOF > /opt/kafka/config/server-cluster.properties
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=${ID}

# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
advertised.listeners=PLAINTEXT://${MECHINE_IP}:9092

# A comma separated list of directories under which to store log files
log.dirs=/data/kafka

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=${ZOOKEEPER_CONNECT}/kafka

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=60000
EOF

12. 建立 kafka systemd 管理檔案，啟動並設定開機啟動 kafka

cat <<EOF > /usr/lib/systemd/system/kafka.service
[Unit]
Description=Apache Kafka server (broker)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target remote-fs.target
After=network.target remote-fs.target kafka-zookeeper.service
 
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server-cluster.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
 
[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kafka && systemctl enable kafka

總結

按照上述的操作，你將快速完成 kafka 叢集安裝，如有問題可以在文章留言。

安裝Kafka叢集
2018-04-08
Kafka
安裝Zookeeper和Kafka叢集
2023-04-17
Kafka
Zookeeper3.4.14（單叢集）、Kafka_2.12-2.2.2（叢集）安裝
2020-10-13
Kafka
用 Docker 快速搭建 Kafka 叢集
2020-06-23
DockerKafka
Zookeeper叢集 + Kafka叢集
2024-07-18
Kafka
安裝Consul叢集
2021-09-09
zookeeper叢集及kafka叢集搭建
2021-06-28
Kafka
Kafka叢集配置
2019-02-22
Kafka
kafka叢集搭建
2019-01-19
Kafka
完整安裝always on叢集
2019-03-25
初識kafka叢集
2018-04-30
Kafka
SpringBoot 和 Kafka 叢集
2020-01-03
Spring BootKafka
redis-3.0.6 安裝叢集
2019-02-25
Redis
redis 5.0 叢集的安裝
2019-03-01
Redis
kafka-2.11叢集搭建
2019-04-05
Kafka
Apache Kafka – 叢集架構
2024-03-27
ApacheKafka架構
06 . ELK Stack + kafka叢集
2020-07-24
Kafka
Ubuntu上kubeadm安裝Kubernetes叢集
2019-02-21
Ubuntu
Redis安裝+叢集+效能監控
2019-03-04
Redis
CDH安裝大資料叢集
2018-08-24
大資料
CentOS7 安裝PG叢集
2020-10-03
CentOS
Cloudera Manager安裝 & 搭建CDH叢集
2020-04-05
Cloud
安裝Greenplum 5.2 叢集實戰
2020-04-19
ARM架構安裝Kubernetes叢集
2020-12-09
架構
linux搭建kafka叢集，多master節點叢集說明
2022-04-06
LinuxKafkaAST
在Ubuntu 18.04.1上安裝Hadoop叢集
2018-08-17
UbuntuHadoop
Zookeeper-3.4.10 叢集的安裝配置
2018-07-13
Cassandra安裝及分散式叢集搭建
2018-08-31
分散式
centos安裝k8s叢集
2022-03-01
CentOSK8S
Ubuntu 安裝k8s叢集
2022-07-14
UbuntuK8S
快速搭建Jenkins叢集
2022-11-09
Jenkins
Mac 使用 docker 搭建 kafka 叢集 + Zookeeper + kafka-manager
2018-11-04
MacDockerKafka
KubeSphere 部署 Kafka 叢集實戰指南
2024-08-09
Kafka
Kubernetes安裝之三：etcd叢集的配置
2019-02-14
kubernetes叢集的安裝異常彙總
2018-10-11
Redis安裝之叢集-哨兵模式（sentinel）模式
2024-03-12
Redis模式
Linux原始碼安裝RabbitMQ高可用叢集
2022-03-28
Linux原始碼MQ
分散式 PostgreSQL 叢集(Citus)官方安裝指南
2022-03-20
分散式SQL

快速安裝 kafka 叢集

前言

安裝步驟

基本操作

簡單測試

指令碼說明

總結

相關文章