Advanced scheduling can be grouped into four mechanisms:
- Node selectors: nodeSelector, nodeName
- Node affinity scheduling: nodeAffinity
- Pod affinity scheduling: podAffinity
- Pod anti-affinity scheduling: podAntiAffinity
nodeSelector, nodeName
cd; mkdir schedule; cd schedule/
vi pod-demo.yaml
# Content:
apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  nodeSelector:
    disktype: harddisk
kubectl apply -f pod-demo.yaml
kubectl get pods
kubectl describe pod pod-demo
# Output:
Warning FailedScheduling 2m3s (x25 over 3m15s) default-scheduler 0/3 nodes are available: 3 node(s) didn't match node selector.
# Label a node so the selector can match
kubectl label node node2 disktype=harddisk
# The pod now starts normally
kubectl get pods
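The overview above also lists nodeName, which this demo does not exercise: setting spec.nodeName bypasses the scheduler entirely and binds the Pod directly to the named node. A minimal sketch, reusing this cluster's node1 (the file and Pod names are illustrative):
# pod-nodename-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodename-demo
spec:
  nodeName: node1              # the scheduler is skipped; the kubelet on node1 runs the Pod directly
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1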
nodeAffinity
requiredDuringSchedulingIgnoredDuringExecution: hard affinity; the rule must be satisfied, otherwise the Pod stays unscheduled.
preferredDuringSchedulingIgnoredDuringExecution: soft affinity; matching nodes are preferred, but the Pod is still scheduled if none match.
Hard affinity:
matchExpressions: match expressions on node labels, e.g. key zone with operator In (the label value is in the listed set) and values foo and bar schedules the Pod onto nodes whose zone label is foo or bar.
matchFields: like matchExpressions, but matches node fields rather than labels; the only field currently supported is metadata.name.
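For reference, a minimal matchFields sketch; the node name here is illustrative:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchFields:
        - key: metadata.name     # a node field, not a label
          operator: In
          values:
          - node1                # illustrative node name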
# Run the Pod on a node whose zone label is foo or bar
vi pod-nodeaffinity-demo.yaml
# Content:
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
            - bar
kubectl apply -f pod-nodeaffinity-demo.yaml
kubectl describe pod pod-node-affinity-demo
# Output:
Warning FailedScheduling 2s (x8 over 20s) default-scheduler 0/3 nodes are available: 3 node(s) didn't match node selector.
# Label one of the nodes with zone=foo
kubectl label node node1 zone=foo
# The pod now starts normally
kubectl get pods
Soft affinity:
cp pod-nodeaffinity-demo.yaml pod-nodeaffinity-demo-2.yaml
vi pod-nodeaffinity-demo-2.yaml
# Content:
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo-2
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
            - bar
        weight: 60
kubectl apply -f pod-nodeaffinity-demo-2.yaml
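# Unlike the hard-affinity demo, this Pod schedules even if no node carries a zone label; the weight (1-100) only raises the score of nodes that do match
kubectl get pods -o wide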
podAffinity
Pod affinity scenario: the nodes of our k8s cluster are spread across different zones or server rooms, and service A and service B must be deployed in the same zone or room; that calls for pod affinity scheduling.
labelSelector: selects the group of Pods to be co-located with
namespaces: the namespaces whose Pods the labelSelector applies to
topologyKey: the node label key that defines a topology domain; with kubernetes.io/hostname each node is its own domain, so "affinity" means "same node"
kubectl get pods
kubectl delete pod pod-node-affinity-demo pod-node-affinity-demo-2 pod-demo
cd ~/schedule/
vi pod-required-affinity-demo.yaml
# Content:
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: db
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: kubernetes.io/hostname
kubectl apply -f pod-required-affinity-demo.yaml
kubectl get pods -o wide
# Output: both pods land on the same node
NAME READY STATUS RESTARTS AGE IP NODE
pod-first 1/1 Running 0 11s 10.244.1.6 node1
pod-second 1/1 Running 0 11s 10.244.1.5 node1
podAntiAffinity
Pod anti-affinity scenario: application service A and database service B should, as far as possible, not run on the same node.
kubectl delete -f pod-required-affinity-demo.yaml
cp pod-required-affinity-demo.yaml pod-required-anti-affinity-demo.yaml
vi pod-required-anti-affinity-demo.yaml
# Content:
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: backend
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: kubernetes.io/hostname
kubectl apply -f pod-required-anti-affinity-demo.yaml
kubectl get pods -o wide
# Output: the two pods are on different nodes
NAME READY STATUS RESTARTS AGE IP NODE
pod-first 1/1 Running 0 5s 10.244.2.4 node2
pod-second 1/1 Running 0 5s 10.244.1.7 node1
kubectl delete -f pod-required-anti-affinity-demo.yaml
# If every node carries the label used as the hard anti-affinity topologyKey, the second Pod cannot be scheduled anywhere, as shown below with zone=foo
# Give both nodes the same zone=foo label
kubectl label nodes node2 zone=foo
kubectl label nodes node1 zone=foo
vi pod-required-anti-affinity-demo.yaml
# Content:
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: backend
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: zone
kubectl apply -f pod-required-anti-affinity-demo.yaml
kubectl get pods -o wide
# Result: pod-second cannot start
NAME READY STATUS RESTARTS AGE IP NODE
pod-first 1/1 Running 0 12s 10.244.1.8 node1
pod-second 0/1 Pending 0 12s <none> <none>
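# Describing the Pending pod shows a FailedScheduling event: both nodes are in the same zone topology domain as pod-first, so the anti-affinity rule excludes them all
kubectl describe pod pod-second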
kubectl delete -f pod-required-anti-affinity-demo.yaml
Taint and toleration scheduling (Taint and Toleration)
Taints and tolerations let you mark a node so that no Pods are scheduled onto it; a Pod that explicitly specifies a matching toleration, however, can still be scheduled onto the marked node.
# Add a taint to a node from the command line:
kubectl taint nodes node1 key=value:NoSchedule
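# The taint can be removed later by appending '-' to the key and effect:
kubectl taint nodes node1 key:NoSchedule-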
A toleration's operator can be:
Equal: matches when both key and value are equal (the default)
Exists: matches when the key exists; no value needs to be defined
A taint's effect defines how Pods are repelled:
NoSchedule: affects only scheduling; existing Pods on the node are untouched
NoExecute: affects both scheduling and existing Pods; Pods that do not tolerate the taint are evicted
PreferNoSchedule: a soft NoSchedule; the scheduler tries to avoid the node but may still use it
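A NoExecute toleration can also set tolerationSeconds, which delays eviction for a bounded time after the taint appears. A minimal sketch, reusing the node-type taint from the demos below:
tolerations:
- key: "node-type"
  operator: "Equal"
  value: "dev"
  effect: "NoExecute"
  tolerationSeconds: 3600      # evicted one hour after the taint is applied, instead of immediately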
# View a node's taints
kubectl describe node master
kubectl get pods -n kube-system
kubectl describe pods kube-apiserver-master -n kube-system
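# Typically the master carries a node-role.kubernetes.io/master:NoSchedule taint; the system Pods seen above either tolerate it or are static Pods managed directly by the kubelet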
# Taint node1
kubectl taint node node1 node-type=production:NoSchedule
vi deploy-demo.yaml
# Content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
kubectl apply -f deploy-demo.yaml
kubectl get pods -o wide
# Output: both replicas land on the untainted node2
NAME READY STATUS RESTARTS AGE IP NODE
myapp-deploy-69b47bc96d-cwt79 1/1 Running 0 5s 10.244.2.6 node2
myapp-deploy-69b47bc96d-qqrwq 1/1 Running 0 5s 10.244.2.5 node2
# Taint node2
kubectl taint node node2 node-type=dev:NoExecute
# NoExecute evicts Pods that do not tolerate the taint; with both nodes now tainted and no tolerations defined, no node can run the Pods
kubectl get pods -o wide
# Output:
NAME READY STATUS RESTARTS AGE IP NODE
myapp-deploy-69b47bc96d-psl8f 0/1 Pending 0 14s <none> <none>
myapp-deploy-69b47bc96d-q296k 0/1 Pending 0 14s <none> <none>
# Define a Toleration
vi deploy-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v2
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "production"
        effect: "NoSchedule"
kubectl apply -f deploy-demo.yaml
# The pods tolerate node1's taint, so they can run on node1
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
myapp-deploy-65cc47f858-tmpnz 1/1 Running 0 10s 10.244.1.10 node1
myapp-deploy-65cc47f858-xnklh 1/1 Running 0 13s 10.244.1.9 node1
# Toleration: match any taint whose key is node-type and whose effect is NoSchedule; with Exists the value is ignored
vi deploy-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v2
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Exists"
        value: ""
        effect: "NoSchedule"
kubectl apply -f deploy-demo.yaml
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
myapp-deploy-559f559bcc-6jfqq 1/1 Running 0 10s 10.244.1.11 node1
myapp-deploy-559f559bcc-rlwp2 1/1 Running 0 9s 10.244.1.12 node1
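# node2's taint uses effect NoExecute, which this toleration does not cover, so both replicas again land on node1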
# Toleration: match any taint whose key is node-type; leaving effect empty matches all effects
vi deploy-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v2
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Exists"
        value: ""
        effect: ""
kubectl apply -f deploy-demo.yaml
# The two pods are now spread across both nodes
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
myapp-deploy-5d9c6985f5-hn4k2 1/1 Running 0 2m 10.244.1.13 node1
myapp-deploy-5d9c6985f5-lkf9q 1/1 Running 0 2m 10.244.2.7 node2
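# Remove the demo taints when finished:
kubectl taint nodes node1 node-type:NoSchedule-
kubectl taint nodes node2 node-type:NoExecute-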