Kubernetes study notes -- dumping data from Kafka into CouchDB

Posted by tiany7 on 2021-04-19

First, a gripe about the technical spirit of these domestic forums. It's not that I worship everything foreign, but aren't some of these parrots tiresome? Is it fun to swallow other people's work and spit it back out?

There are also people who pretend to understand when they don't. Those are the worst: they clearly have no idea what they're writing, yet they still post to the featured section. It's absurd. Do they realize how much of a burden the points they farm put on newcomers?

After a few days with it, Kubernetes doesn't feel all that hard. It just lacks a systematic tutorial: the material from some training courses is long out of date, and knowledge sharing leaves much to be desired, which creates a fairly high barrier to entry.

So I'll write it up here. If it helps those who come after me, that would be immensely gratifying.

Let's start with CouchDB.

The CouchDB YAML file is written as follows. I have already built and pushed to Docker Hub an image whose port is exposed and whose initial username and password are both root; just use it directly:

tiany7/couchdb_real

 

The full set of files looks like this. I mapped the container's port 5984 to the host's 30005, so external access goes through port 30005:

---
# This is a declarative approach to describe a Kubernetes based
# deployment of an application.  Since this is YAML, the indentation
# is very important
apiVersion: apps/v1
kind: Deployment         # We are testing the Deployment resource
metadata:
  name: couch-deployment  # A name must be given to the deployment in the metadata section
spec:                     # This is the specification where we can even put the number of replicas
  replicas: 1             # we run a single replica of CouchDB
  selector:
    matchLabels:
      app: couch-server-app          # must match the pod template labels below; this is how the Deployment/Service finds its pods
  minReadySeconds: 5  # if anything crashes before 5 secs, the deployment is not
                          # considered as ready and available. Default value is 0
  template:               # Specified info needed to run the pod and what runs in the pod
    metadata:
      labels:
        app: couch-server-app        # some label to give to this pod (see the matching label above)
    spec:                 # actual specification
      containers:
      - name: couchdb       # Used by DNS
        image: tiany7/couchdb_real  # the image name on Docker Hub (or a local image)
        imagePullPolicy: IfNotPresent  # pull the image only if it is not already present on the node
        ports:            # CouchDB listens on port 5984 inside the container
        - containerPort: 5984

---

apiVersion: v1
kind: Service
metadata:
  name: couch-server-app
spec:
  type: NodePort   # by using NodePort, the service is published to outside world.
  ports:
    - protocol: TCP     # this is default (so not needed either)
      port: 5984  # the port used by the server inside the pod/container
      nodePort: 30005 # this is what will be externally published
  selector:
    app: couch-server-app

---

apiVersion: v1
kind: Pod
metadata:
  name: couch-service
spec:
  containers:
  - name: couch-service
    image: tiany7/couchdb_real # this is my own image, described above; feel free to use it directly
    imagePullPolicy: IfNotPresent
    ports:
      - containerPort: 5984
        hostPort: 30005

...

Then, if curl against the port returns "no route to host", check whether you remembered to put the selector in the Service. Don't forget it!

Before that, remember to ufw allow <port> for every port you need (here, 30005).

Then apply it:

sudo kubectl apply -f couch-deployment.yml

You should now see the database running.

Open your-public-ip:30005/_utils/# in the browser.

 

There it is.
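Besides the browser, you can sanity-check the endpoint from a short script. A minimal stdlib-only sketch; `couch_url` and `basic_auth_header` are hypothetical helper names, and the host/credentials are the ones used in this post:

```python
import base64
from urllib.request import Request, urlopen


def couch_url(host, port=30005):
    # root endpoint of the CouchDB NodePort service
    return "http://%s:%d/" % (host, port)


def basic_auth_header(user="root", password="root"):
    # CouchDB expects HTTP Basic auth; the image above uses root/root
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    return {"Authorization": "Basic " + token}


# Example (requires the cluster to be reachable):
# req = Request(couch_url("129.114.27.157"), headers=basic_auth_header())
# print(urlopen(req).read())  # CouchDB welcome document
```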

Next, let's look at the ZooKeeper configuration.

This is the ZooKeeper YAML file. Yesterday I followed several versions from blog posts (cnblogs) and they were all wrong. This one covers both the Service and the Deployment. My two machines' public IPs are 128.114.xx.xx; swap in your own.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: zookeeper-deploy
spec:
  selector:
    matchLabels:
      app: zookeeper-1
  replicas: 1
  template:
    metadata:
      labels:
        app: zookeeper-1
    spec:
      hostname: zookeeper-1
      nodeSelector: # this shows how you force k8s to schedule the pod on a specified node
        kubernetes.io/hostname: vm2vv-2 # your own node name; check it with kubectl get nodes
      containers:
      - name: zoo1
        image: digitalwonderland/zookeeper
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "1"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1

---

apiVersion: v1
kind: Service
metadata:
  name: zoo1
  labels:
    app: zookeeper-1
spec:
  selector:
    app: zookeeper-1   # used to match the pod(s) that run the actual zookeeper server
  ports:
    - protocol: TCP     # this is default (so not needed either)
      name: client
      port: 2181  # the port used by the server inside the pod/container
    - protocol: TCP     # this is default (so not needed either)
      name: follower
      port: 2888  # the port used by the server inside the pod/container
    - protocol: TCP     # this is default (so not needed either)
      port: 3888  # the port used by the server inside the pod/container
      name: leader

Run the command:

sudo kubectl apply -f zookeeper_setup.yml
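Since Kafka (below) depends on ZooKeeper being up, it can help to wait until the ZooKeeper port actually accepts connections before applying the Kafka manifest. A small stdlib-only sketch; `wait_for_port` is a name I made up:

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0):
    # poll until a TCP endpoint (e.g. ZooKeeper's client port 2181)
    # accepts connections, or give up after `timeout` seconds
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(1)
    return False


# Example: wait_for_port("128.114.xx.xx", 2181) before applying kafka_setup.yml
```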

 

Next is the Kafka configuration. Remember: Kafka needs ZooKeeper as infrastructure, so wait until ZooKeeper is running before starting Kafka.

The Kafka YAML configuration:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: kafka-broker0
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka
      id: "0"
  template:
    metadata:
      labels:
        app: kafka
        id: "0"
    spec:
      containers:
      - name: kafka
        image: wurstmeister/kafka
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ADVERTISED_PORT
          value: "30001"
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: 129.114.25.68
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zookeeper-1:2181
        - name: KAFKA_BROKER_ID
          value: "0"
        - name: KAFKA_CREATE_TOPICS
          value: "utilization:1:1" # your topic's name; remember to change it
---

apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    name: kafka
spec:
  ports:
  - port: 9092
    name: kafka-port
    protocol: TCP
    nodePort: 30001
  selector:
    app: kafka
    id: "0"
  type: NodePort

Note that everything in my configuration can be reused as-is except the public IP and the ports.

Then run:

sudo kubectl apply -f kafka_setup.yml

(You can name the file whatever you like, with either a .yaml or .yml suffix.)

Once this is applied, we can send and receive messages locally.

 

You can see that after applying there is one kafka/zookeeper pair running; the other pod shows "no broker available" because I haven't configured ZooKeeper for it yet. I should have picked 1 for the replica count, but no matter.

Then we create the producer and consumer, written in Python. These are just a simple producer and consumer; a quick look is enough.

The code that dumps data to the third machine is already written; the machine whose IP ends in 157 does the dump.

producer:

import os   # need this for popen
import time # for sleep
from kafka import KafkaProducer  # producer of events

# We can make this more sophisticated/elegant but for now it is just
# hardcoded to the setup I have on my local VMs

# acquire the producer
# (you will need to change this to your bootstrap server's IP addr)
producer = KafkaProducer (bootstrap_servers="129.114.25.68:30001", acks=1)  # wait for leader to write to log

for i in range (100):
    
    # get the output of the top command
    process = os.popen ("top -n 1 -b")
    print("here from Yuanhan")
    # read the contents that we wish to send as topic content
    contents = process.read ()

    # send the contents under topic utilizations. Note that it expects
    # the contents in bytes so we convert it to bytes.
    #
    # Note that here I am not serializing the contents into JSON or anything
    # as such but just taking the output as received and sending it as bytes
    # You will need to modify it to send a JSON structure, say something
    # like <timestamp, contents of top>
    #
    producer.send ("utilization", value=bytes (contents, 'ascii'))
    producer.flush ()   # try to empty the sending buffer

    # sleep a second
    time.sleep (1)

# we are done
producer.close ()
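The comment in the producer suggests sending a JSON structure like <timestamp, contents of top> instead of raw bytes. One possible shape, as a sketch (`make_sample` is a hypothetical helper name):

```python
import json
import time


def make_sample(contents):
    # wrap the top output in a <timestamp, contents> record and
    # serialize it to bytes, which is what producer.send() expects
    record = {"timestamp": time.time(), "contents": contents}
    return json.dumps(record).encode("utf-8")


# in the loop above, instead of sending raw bytes:
# producer.send("utilization", value=make_sample(contents))
```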

And then the consumer:

import os   # need this for popen
import time # for sleep
from kafka import KafkaConsumer  # consumer of events
import couchdb
# We can make this more sophisticated/elegant but for now it is just
# hardcoded to the setup I have on my local VMs

# acquire the consumer
# (you will need to change this to your bootstrap server's IP addr)
consumer = KafkaConsumer (bootstrap_servers="129.114.25.68:30001")

# subscribe to topic
consumer.subscribe (topics=["utilization"])
user = "root"
password = "root"
couchserver = couchdb.Server("http://%s:%s@129.114.27.157:30005" % (user, password))
#db = couchserver['mydb']
# create the database on the first run, reuse it on later runs
# (couchserver.create raises PreconditionFailed if 'newDB' already exists)
db = couchserver['newDB'] if 'newDB' in couchserver else couchserver.create('newDB')
# we keep reading and printing
it = 0
for msg in consumer:
    # what we get is a record. From this record, we are interested in printing
    # the contents of the value field. We are sure that we get only the
    # utilizations topic because that is the only topic we subscribed to.
    # Otherwise we will need to demultiplex the incoming data according to the
    # topic coming in.
    #
    # convert the value field into string (ASCII)
    #
    # Note that I am not showing code to obtain the incoming data as JSON
    # nor am I showing any code to connect to a backend database sink to
    # dump the incoming data. You will have to do that for the assignment.
    doc = {'foo': str(msg.value, 'ascii')}
    db.save(doc)
    if it > 10:
        break
    it += 1
    print(str(msg.value, 'ascii'))

# we are done; the loop above exits after storing a dozen messages
consumer.close ()
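If you switch the producer to sending JSON, as its comment suggests, the consumer should decode before saving. A small sketch (`to_doc` is a hypothetical helper) that falls back to wrapping plain text, since the producer above sends raw ASCII bytes:

```python
import json


def to_doc(raw):
    # turn a Kafka message payload (bytes) into a dict suitable for db.save()
    try:
        return json.loads(raw)
    except ValueError:
        # not JSON: wrap the raw text, like the {'foo': ...} document above
        return {"foo": raw.decode("ascii", errors="replace")}


# usage inside the consumer loop:
# db.save(to_doc(msg.value))
```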
    

Good. Next, from the local Kafka installation, run:

sudo bin/kafka-topics.sh --describe --topic utilization --bootstrap-server 129.114.25.68:30001

 

There it is, right?

Then start consumer.py on the machine hosting the Kafka pod.

Ta-da!


We can see the data in the database!

This assignment took two weeks to figure out, because material on it is extremely scarce and I could only fumble around on my own from the little I remembered. At the time I kept thinking: if only there were working code and references for this. Unfortunately there were only fragmentary resources, but in the end I worked it out. Honestly, I feel there is still a long way to go on the road of spreading knowledge.

Learning something yourself is worth far less than making that knowledge serve more people. Onward. I also have an Ansible-playbook automated version; message me and I'll send it to you.

Now I can finally get back to Codeforces. I want to reach purple before I retire!

 
