K8S Advanced Scheduling

Published by klvchen on 2018-11-29

Advanced scheduling can be divided into:

  • Node selectors: nodeSelector, nodeName
  • Node affinity scheduling: nodeAffinity
  • Pod affinity scheduling: podAffinity
  • Pod anti-affinity scheduling: podAntiAffinity

nodeSelector, nodeName

cd; mkdir schedule; cd schedule/

vi pod-demo.yaml
# Contents:
apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  nodeSelector:
    disktype: harddisk


kubectl apply -f pod-demo.yaml 
kubectl get pods

kubectl describe pod  pod-demo
# Output:
Warning  FailedScheduling  2m3s (x25 over 3m15s)  default-scheduler  0/3 nodes are available: 3 node(s) didn't match node selector.

# Label the node
kubectl label node node2 disktype=harddisk

# Now the pod starts normally
kubectl get pods
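The section heading also mentions nodeName, which is not demonstrated above: setting spec.nodeName binds the pod directly to the named node and bypasses the scheduler entirely. A minimal sketch, assuming a node called node1 exists in the cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodename-demo
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  # nodeName skips the scheduler: the kubelet on node1 runs the pod directly,
  # ignoring nodeSelector, affinity rules, and unschedulable markings
  nodeName: node1
```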

nodeAffinity

requiredDuringSchedulingIgnoredDuringExecution — hard affinity: the rules must be satisfied, or the pod stays Pending.
preferredDuringSchedulingIgnoredDuringExecution — soft affinity: the rules are satisfied if possible, but the pod is scheduled anyway if they are not.

Hard affinity:
matchExpressions: matches label expressions. For example, with key zone, operator In (contained in), and values foo and bar, the pod is scheduled only onto nodes whose zone label is foo or bar.
matchFields: like the above, but matches node fields (such as metadata.name) instead of labels, so no label value needs to be defined.
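As a sketch of matchFields (the only node field supported for it is metadata.name), a hard affinity term that pins the pod to a node by name rather than by label might look like the following; the node name is illustrative:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchFields:
        # matchFields operates on node fields, not labels;
        # metadata.name is the supported key
        - key: metadata.name
          operator: In
          values:
          - node1    # hypothetical node name
```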

# Run the pod on nodes whose zone label is foo or bar
vi pod-nodeaffinity-demo.yaml 
# Contents:
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
            - bar

kubectl apply -f pod-nodeaffinity-demo.yaml

kubectl describe pod pod-node-affinity-demo
# Output:
Warning  FailedScheduling  2s (x8 over 20s)  default-scheduler  0/3 nodes are available: 3 node(s) didn't match node selector.

# Label one of the nodes with zone=foo
kubectl label node node1 zone=foo

# Now the pod starts normally
kubectl get pods

Soft affinity:

cp pod-nodeaffinity-demo.yaml pod-nodeaffinity-demo-2.yaml 

vi pod-nodeaffinity-demo-2.yaml 
# Contents:
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo-2
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
            - bar
        weight: 60

kubectl apply -f pod-nodeaffinity-demo-2.yaml

podAffinity

Pod affinity scenario: the nodes of a k8s cluster may be spread across different zones or server rooms. When service A and service B must be deployed in the same zone or the same room, pod affinity scheduling is what we need.

labelSelector: selects the group of pods to be affine with
namespaces: the namespace(s) from which to select those pods
topologyKey: the node label key that defines the topology domain
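The namespaces field is not exercised in the demos below; a fragment combining all three fields might look like the following (the namespace value is illustrative):

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - {key: app, operator: In, values: ["myapp"]}
      # restrict the label selector to pods in these namespaces
      # (defaults to the pod's own namespace when omitted)
      namespaces: ["default"]
      # co-locate on nodes sharing the same value for this label key
      topologyKey: kubernetes.io/hostname
```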

kubectl get pods
kubectl delete pod pod-node-affinity-demo pod-node-affinity-demo-2 pod-demo

cd ~/schedule/

vi pod-required-affinity-demo.yaml 
# Contents:
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: db
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: kubernetes.io/hostname


kubectl apply -f pod-required-affinity-demo.yaml 

kubectl get pods -o wide
# Output: both pods land on the same node
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE
pod-first    1/1     Running   0          11s   10.244.1.6   node1
pod-second   1/1     Running   0          11s   10.244.1.5   node1

podAntiAffinity

Pod anti-affinity scenario: application service A and database service B should preferably not run on the same node.

kubectl delete -f pod-required-affinity-demo.yaml 

cp pod-required-affinity-demo.yaml pod-required-anti-affinity-demo.yaml 

vi pod-required-anti-affinity-demo.yaml 
# Contents:
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: backend
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: kubernetes.io/hostname

kubectl apply -f pod-required-anti-affinity-demo.yaml 

kubectl get pods -o wide
# Output: the two pods are on different nodes
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE
pod-first    1/1     Running   0          5s    10.244.2.4   node2
pod-second   1/1     Running   0          5s    10.244.1.7   node1

kubectl delete -f pod-required-anti-affinity-demo.yaml 


# If both nodes carry the label used by the hard anti-affinity rule, the second pod cannot be scheduled, as with zone=foo below
# Put the same label zone=foo on both nodes
kubectl label nodes node2 zone=foo
kubectl label nodes node1 zone=foo

vi pod-required-anti-affinity-demo.yaml 
# Contents:
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: backend
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: zone


kubectl apply -f pod-required-anti-affinity-demo.yaml 

kubectl get pods -o wide
# Output: pod-second cannot start
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE
pod-first    1/1     Running   0          12s   10.244.1.8   node1
pod-second   0/1     Pending   0          12s   <none>       <none>

kubectl delete -f pod-required-anti-affinity-demo.yaml

Taint and Toleration Scheduling

Taints and tolerations allow a node to be marked so that no pods are scheduled onto it. A pod that explicitly specifies a matching toleration, however, can still be scheduled onto the tainted node.

# Taints can be added to a Node from the command line:
kubectl taint nodes node1 key=value:NoSchedule

The toleration operator can be:
Equal: the toleration's key and value must equal the taint's (the default)
Exists: only the key must exist; no value needs to be defined

The taint's effect defines how pods are repelled:
NoSchedule: affects only scheduling; existing Pod objects on the node are untouched
NoExecute: affects both scheduling and existing Pod objects; pods that do not tolerate the taint are evicted
PreferNoSchedule: try to avoid scheduling onto the node, but allow it if necessary
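For NoExecute taints, a toleration can also carry tolerationSeconds, which lets the pod stay on the tainted node for a grace period before being evicted. A sketch, with an illustrative grace period:

```yaml
tolerations:
- key: "node-type"
  operator: "Equal"
  value: "production"
  effect: "NoExecute"
  # the pod tolerates the taint for 3600 s after it appears, then is evicted;
  # without tolerationSeconds it would tolerate the taint forever
  tolerationSeconds: 3600
```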

# View a node's taints
kubectl describe node master
kubectl get pods -n kube-system
kubectl describe pods kube-apiserver-master -n kube-system

# Taint node1
kubectl taint node node1 node-type=production:NoSchedule

vi deploy-demo.yaml 
# Contents:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80



kubectl apply -f deploy-demo.yaml 

kubectl get pods -o wide
# Output:
NAME                            READY   STATUS    RESTARTS   AGE   IP           NODE
myapp-deploy-69b47bc96d-cwt79   1/1     Running   0          5s    10.244.2.6   node2
myapp-deploy-69b47bc96d-qqrwq   1/1     Running   0          5s    10.244.2.5   node2


# Taint node2
kubectl taint node node2 node-type=dev:NoExecute

# NoExecute evicts pods that do not tolerate the taint; since both nodes are now tainted and the pods define no tolerations, no node can run them
kubectl get pods -o wide
# Output:
NAME                            READY   STATUS    RESTARTS   AGE   IP       NODE
myapp-deploy-69b47bc96d-psl8f   0/1     Pending   0          14s   <none>   <none>
myapp-deploy-69b47bc96d-q296k   0/1     Pending   0          14s   <none>   <none>


# Define a Toleration
vi deploy-demo.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v2
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "production"
        effect: "NoSchedule"


kubectl apply -f deploy-demo.yaml

# The pods tolerate node1's taint and can run on node1
kubectl get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE
myapp-deploy-65cc47f858-tmpnz   1/1     Running   0          10s   10.244.1.10   node1
myapp-deploy-65cc47f858-xnklh   1/1     Running   0          13s   10.244.1.9    node1


# Toleration with operator Exists: tolerate any taint whose key is node-type and whose effect is NoSchedule
vi deploy-demo.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v2
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Exists"
        value: ""
        effect: "NoSchedule"

kubectl apply -f deploy-demo.yaml

kubectl get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE
myapp-deploy-559f559bcc-6jfqq   1/1     Running   0          10s   10.244.1.11   node1
myapp-deploy-559f559bcc-rlwp2   1/1     Running   0          9s    10.244.1.12   node1


## Toleration with operator Exists and an empty effect: the key node-type is tolerated for all effects
vi deploy-demo.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v2
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Exists"
        value: ""
        effect: ""

kubectl apply -f deploy-demo.yaml

# The two pods are spread evenly across the two nodes
kubectl get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE
myapp-deploy-5d9c6985f5-hn4k2   1/1     Running   0          2m    10.244.1.13   node1
myapp-deploy-5d9c6985f5-lkf9q   1/1     Running   0          2m    10.244.2.7    node2
