First, a rant about the technical culture on these domestic forums. Call it what you like, but the human copy-machines are exhausting: is it fun to swallow someone else's work and spit it back out?
Then there are the people who pretend to understand, and those are the worst: they clearly have no idea what they are writing, yet they still post it to the featured section. It's absurd. Do they realize how much of a burden the points they farm put on newcomers?
After a few days with it, Kubernetes doesn't actually feel that hard. The real problem is the lack of a systematic tutorial: some training-course materials are long out of date, and knowledge sharing is far from ideal, which creates an artificially high barrier to entry.
So let me write it all up. If it helps those who come after me, that would be deeply gratifying.
Let's start with CouchDB.
Here is the YAML for CouchDB. I have already built and pushed an image to Docker Hub with the port exposed and both the initial username and password set to root, so you can use it directly:
tiany7/couchdb_real
The full file looks like this. I mapped the container's port 5984 to the host's port 30005, so external access goes through 30005:
---
# This is a declarative approach to describe a Kubernetes based
# deployment of an application. Since this is YAML, the indentation
# is very important.
apiVersion: apps/v1
kind: Deployment                   # We are testing the Deployment resource
metadata:
  name: couch-deployment           # A name must be given to the deployment in the metadata section
spec:                              # This is the specification where we can even put the number of replicas
  replicas: 1                      # We run a single replica of CouchDB here
  selector:
    matchLabels:
      app: couch-server-app        # This must match the pod template's label below; it is how the
                                   # Service/Deployment find their pods, like a primary-key lookup
  minReadySeconds: 5               # If anything crashes before 5 secs, the deployment is not
                                   # considered ready and available. Default value is 0
  template:                        # Specifies what is needed to run the pod and what runs in the pod
    metadata:
      labels:
        app: couch-server-app      # Label given to this pod (see the matching label above)
    spec:                          # Actual specification
      containers:
      - name: couchdb              # Used by DNS
        image: tiany7/couchdb_real # Image name on Docker Hub (or a local image if you have one)
        imagePullPolicy: IfNotPresent  # Pull the image only if it is not already on the node
        ports:
        - containerPort: 5984      # CouchDB listens on port 5984 inside the container
---
apiVersion: v1
kind: Service
metadata:
  name: couch-server-app
spec:
  type: NodePort                   # NodePort publishes the service to the outside world
  ports:
  - protocol: TCP                  # TCP is the default (so not strictly needed)
    port: 5984                     # the port used by the server inside the pod/container
    nodePort: 30005                # this is what will be externally published
  selector:
    app: couch-server-app
---
apiVersion: v1
kind: Pod
metadata:
  name: couch-service
spec:
  containers:
  - name: couch-service
    image: tiany7/couchdb_real     # my own image, described above; feel free to use it directly
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 5984
      hostPort: 30005
...
If curl against the port fails with an error like "no route to host", check whether you actually wrote the selector in the Service section. Don't forget!
Before that, remember to run ufw allow <port> for every port you need.
Then apply it:
sudo kubectl apply -f couch-deployment.yml
You should now see the database running.
Open http://<your-public-ip>:30005/_utils/# in a browser.
And there it is!
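If you prefer checking from code instead of the browser, here is a minimal sanity check using the couchdb Python package (the same one the consumer below uses). The IP is a placeholder, and root/root are the credentials baked into my image:

import couchdb

# connect through the NodePort published above; root/root comes from the image
couch = couchdb.Server("http://root:root@<your-public-ip>:30005/")
print(couch.version())   # prints the CouchDB server version if everything is up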
Now let's look at the ZooKeeper configuration.
Here is the ZooKeeper YAML file. Yesterday I followed several versions from cnblogs and they were all wrong. This one covers both the Service and the Deployment. My two machines' public IPs are 128.114.xx.xx; substitute your own.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zookeeper-deploy
spec:
  selector:
    matchLabels:
      app: zookeeper-1
  replicas: 1
  template:
    metadata:
      labels:
        app: zookeeper-1
    spec:
      hostname: zookeeper-1
      nodeSelector:                       # force k8s to schedule the pod on a specific node
        kubernetes.io/hostname: vm2vv-2   # your own node name; check it with kubectl get nodes
      containers:
      - name: zoo1
        image: digitalwonderland/zookeeper
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "1"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
---
apiVersion: v1
kind: Service
metadata:
  name: zoo1
  labels:
    app: zookeeper-1
spec:
  selector:
    app: zookeeper-1                # matches the pod(s) that run the actual ZooKeeper server
  ports:
  - protocol: TCP                   # TCP is the default (so not strictly needed)
    name: client
    port: 2181                      # client port used by the server inside the pod/container
  - protocol: TCP
    name: follower
    port: 2888                      # follower port
  - protocol: TCP
    port: 3888                      # leader-election port
    name: leader
Run the command:
sudo kubectl apply -f zookeeper_setup.yml
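To verify that ZooKeeper is actually answering before moving on to Kafka, one option (my own suggestion, not something this setup requires) is the kazoo client library. zoo1:2181 is the Service name and port defined above, so this works from inside the cluster; from outside you would need a published address instead:

from kazoo.client import KazooClient

zk = KazooClient(hosts="zoo1:2181")
zk.start(timeout=10)          # raises an exception if no server answers within 10s
print(zk.server_version())    # e.g. (3, 4, 13)
zk.stop()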
Next is the Kafka configuration. Remember: Kafka depends on ZooKeeper as infrastructure, so wait until ZooKeeper is up and running before starting Kafka.
The Kafka YAML:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: kafka-broker0
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka
      id: "0"
  template:
    metadata:
      labels:
        app: kafka
        id: "0"
    spec:
      containers:
      - name: kafka
        image: wurstmeister/kafka
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ADVERTISED_PORT
          value: "30001"
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: 129.114.25.68
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zookeeper-1:2181
        - name: KAFKA_BROKER_ID
          value: "0"
        - name: KAFKA_CREATE_TOPICS
          value: "utilization:1:1"   # topic:partitions:replicas -- change the topic name to your own
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    name: kafka
spec:
  ports:
  - port: 9092
    name: kafka-port
    protocol: TCP
    nodePort: 30001
  selector:
    app: kafka
    id: "0"
  type: NodePort
Apart from the public IP and the ports, my configuration should work for you as-is.
Then run:
sudo kubectl apply -f kafka_setup.yml
You can give this file any name you like, with either a .yaml or .yml suffix.
Once it has been applied, we can send and receive messages locally; a quick check is sketched below.
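One quick way to confirm the broker is reachable and that the topic from KAFKA_CREATE_TOPICS was created is to list the topics with kafka-python (the same library the producer and consumer below use):

from kafka import KafkaConsumer

# bootstrap address = advertised host + NodePort from the YAML above
consumer = KafkaConsumer(bootstrap_servers="129.114.25.68:30001")
print(consumer.topics())   # should contain 'utilization'
consumer.close()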
You can see that after applying, one kafka/zookeeper pair is running. The other broker reports "no broker available" because I haven't configured ZooKeeper for it yet; I should have chosen 1 as the replica count, but no matter.
Then we create a producer and a consumer, written in Python. These are just bare-bones examples; a quick read-through is enough.
The code that dumps data to the third machine is already written; the machine whose IP ends in 157 does the dumping.
producer:
import os      # need this for popen
import time    # for sleep
from kafka import KafkaProducer   # producer of events

# We can make this more sophisticated/elegant but for now it is just
# hardcoded to the setup I have on my local VMs.

# acquire the producer
# (you will need to change this to your bootstrap server's IP addr)
producer = KafkaProducer(bootstrap_servers="129.114.25.68:30001",
                         acks=1)   # wait for leader to write to log

for i in range(100):
    # get the output of the top command
    process = os.popen("top -n 1 -b")
    print("here from Yuanhan")
    # read the contents that we wish to send as topic content
    contents = process.read()

    # send the contents under topic utilization. Note that send expects
    # the contents in bytes so we convert accordingly.
    #
    # Note that here I am not serializing the contents into JSON or anything
    # as such but just taking the output as received and sending it as bytes.
    # You will need to modify it to send a JSON structure, say something
    # like <timestamp, contents of top>
    producer.send("utilization", value=bytes(contents, 'ascii'))
    producer.flush()   # try to empty the sending buffer

    # sleep a second
    time.sleep(1)

# we are done
producer.close()
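The comments above note that the payload should really be a JSON structure like <timestamp, contents of top> rather than raw bytes. A minimal sketch of that change (my suggestion, not part of the original code) replaces the send call in the loop with:

import json
import time

# wrap the top output in a small JSON document before sending
payload = json.dumps({"timestamp": time.time(), "top": contents})
producer.send("utilization", value=payload.encode("ascii"))
producer.flush()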
And the consumer:
import os      # need this for popen
import time    # for sleep
from kafka import KafkaConsumer   # consumer of events
import couchdb

# We can make this more sophisticated/elegant but for now it is just
# hardcoded to the setup I have on my local VMs.

# acquire the consumer
# (you will need to change this to your bootstrap server's IP addr)
consumer = KafkaConsumer(bootstrap_servers="129.114.25.68:30001")

# subscribe to topic
consumer.subscribe(topics=["utilization"])

user = "root"
password = "root"
couchserver = couchdb.Server("http://%s:%s@129.114.27.157:30005" % (user, password))
#db = couchserver['mydb']
db = couchserver.create('newDB')

# we keep reading and saving
it = 0
for msg in consumer:
    # What we get is a record. From this record, we are interested in the
    # contents of the value field. We are sure that we only get the
    # utilization topic because that is the only topic we subscribed to;
    # otherwise we would need to demultiplex the incoming data by topic.
    #
    # Convert the value field into a string (ASCII) and dump it to CouchDB.
    doc = {'foo': str(msg.value, 'ascii')}
    db.save(doc)
    if it > 10:
        break
    it += 1
    print(str(msg.value, 'ascii'))

# we are done
consumer.close()
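If the producer is changed to send JSON as suggested above, the consumer's save step would decode the document before writing it to CouchDB. A matching sketch, assuming the {"timestamp", "top"} structure from the producer example:

import json

for msg in consumer:
    # decode the JSON document produced by the JSON-sending variant above
    record = json.loads(msg.value.decode("ascii"))
    db.save({"timestamp": record["timestamp"], "top": record["top"]})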
Now, from the local Kafka installation, run:
sudo bin/kafka-topics.sh --describe --topic utilization --bootstrap-server 129.114.25.68:30001
And there it is, right?
Then start consumer.py on the machine hosting the Kafka pod.
Ta-da!
We can see the data in the database!
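To double-check from code rather than the Fauxton UI, you can iterate over the documents the consumer wrote to newDB; a quick sketch using the same couchdb library and the root/root credentials from earlier ('foo' is the field the consumer saved):

import couchdb

couch = couchdb.Server("http://root:root@129.114.27.157:30005/")
db = couch["newDB"]                 # the database the consumer created
for doc_id in db:                   # iterating a Database yields document ids
    print(db[doc_id]["foo"][:80])   # first 80 chars of each saved top snapshot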
This assignment took two weeks to figure out. Documentation was extremely scarce, so I could only fumble around on the few impressions I had. The whole time I kept thinking: if only there were working code and references for this. Unfortunately there were only fragmentary materials. In the end I worked it out, but honestly, knowledge sharing still has a very long way to go.
Learning something yourself is worth far less than putting that knowledge in the service of more people. Onward. I also have an Ansible playbook version that automates all of this; message me and I'll send it to you.
Now I can finally get back to Codeforces. I want to reach purple before I retire!