Embracing Blue-Green Deployment for Smooth AKS Cluster Version Upgrades (Part 2)

Published by 微軟技術棧 (Microsoft Tech Stack) on 2021-12-11

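Part 1 of this series covered the basic idea behind blue-green deployment on AKS, and showed how to deploy the required resources and integrate the application gateway with AKS. If you missed it, please read Part 1 first.

This part builds on that setup and covers how to deploy an application, how to deploy a new AKS cluster, and how to switch between AKS versions.

Without further ado, let's get started!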

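Deploying the Application

Let's deploy a demo application to verify that the application gateway and the AKS cluster are integrated correctly. Save the following YAML as deployment_aspnet.yaml: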

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aspnetapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: aspnetapp
  template:
    metadata:
      labels:
        app: aspnetapp
    spec:
      containers:
        - name: aspnetapp
          # Sample ASP.Net application from Microsoft which can get private IP.
          image: mcr.microsoft.com/dotnet/core/samples:aspnetapp
          ports:
          - containerPort: 80
---

apiVersion: v1
kind: Service
metadata:
  name: aspnetapp
spec:
  selector:
    app: aspnetapp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

---

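# extensions/v1beta1 Ingress is still served by the Kubernetes 1.19 and 1.20 clusters used here.
# It was removed in Kubernetes 1.22, where networking.k8s.io/v1 is required instead.
apiVersion: extensions/v1beta1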
kind: Ingress
metadata:
  name: aspnetapp
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
spec:
  rules:
  - http:
      paths:
      - path: /
        backend:
          serviceName: aspnetapp
          servicePort: 80

Run the following command to deploy the application:

kubectl apply -f deployment_aspnet.yaml

List the Pods to confirm that the application is running:


kubectl get po -o wide
NAME                                    READY   STATUS    RESTARTS   AGE    IP            NODE                                NOMINATED NODE   READINESS GATES
aad-pod-identity-mic-787c5958fd-kmx9b   1/1     Running   0          177m   10.240.0.33   aks-nodepool1-94448771-vmss000000   <none>           <none>
aad-pod-identity-mic-787c5958fd-nkpv4   1/1     Running   0          177m   10.240.0.63   aks-nodepool1-94448771-vmss000001   <none>           <none>
aad-pod-identity-nmi-mhp86              1/1     Running   0          177m   10.240.0.4    aks-nodepool1-94448771-vmss000000   <none>           <none>
aad-pod-identity-nmi-sjpvw              1/1     Running   0          177m   10.240.0.35   aks-nodepool1-94448771-vmss000001   <none>           <none>
aad-pod-identity-nmi-xnfxh              1/1     Running   0          177m   10.240.0.66   aks-nodepool1-94448771-vmss000002   <none>           <none>
agic-ingress-azure-84967fc5b6-cqcn4     1/1     Running   0          111m   10.240.0.79   aks-nodepool1-94448771-vmss000002   <none>           <none>
aspnetapp-68784d6544-j99qg              1/1     Running   0          96    10.240.0.75   aks-nodepool1-94448771-vmss000002   <none>           <none>
aspnetapp-68784d6544-v9449              1/1     Running   0          96    10.240.0.13   aks-nodepool1-94448771-vmss000000   <none>           <none>
aspnetapp-68784d6544-ztbd9              1/1     Running   0          96    10.240.0.50   aks-nodepool1-94448771-vmss000001   <none>           <none>

The application Pods are all running. Note their IPs: 10.240.0.13, 10.240.0.50 and 10.240.0.75.

The application gateway's backend pool shows the same IPs:

az network application-gateway show-backend-health \
 -g $RESOURCE_GROUP \
 -n $APP_GATEWAY \
 --query 'backendAddressPools[].backendHttpSettingsCollection[].servers[][address,health]' \
 -o tsv
10.240.0.13     Healthy
10.240.0.50     Healthy
10.240.0.75     Healthy
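
Rather than comparing the two lists by eye, the check can be scripted. The following is a minimal sketch and not part of the original walkthrough: `same_ip_set` is a hypothetical helper name, and the usage lines assume the `app=aspnetapp` label from the manifest above.

```shell
# Hypothetical helper: succeeds when two whitespace-separated lists
# contain the same items, ignoring order and duplicates.
same_ip_set() {
  # $1 and $2 are left unquoted on purpose so each item lands on its own line.
  left=$(printf '%s\n' $1 | sort -u)
  right=$(printf '%s\n' $2 | sort -u)
  [ "$left" = "$right" ]
}

# Example usage (needs kubectl and az access to the cluster; not run here):
#   pods=$(kubectl get pods -l app=aspnetapp -o jsonpath='{.items[*].status.podIP}')
#   backends=$(az network application-gateway show-backend-health \
#     -g "$RESOURCE_GROUP" -n "$APP_GATEWAY" \
#     --query 'backendAddressPools[].backendHttpSettingsCollection[].servers[].address' -o tsv)
#   same_ip_set "$pods" "$backends" && echo "backend pool matches the Pods"
```

The comparison is order-independent, so it does not matter in which order the gateway or kubectl list the addresses.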

Run the following command to look up the frontend IP address:

az network public-ip show -g $RESOURCE_GROUP -n $APPGW_IP --query ipAddress -o tsv

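Open this IP address in a browser and the sample page appears:

[Screenshot: the sample ASP.NET page, showing the Host name and Server IP address fields]

Refresh the page a few times and the Host name and Server IP address fields cycle through three host names and IP addresses, which are exactly the names and internal IPs of the three Pods deployed earlier. This shows that the application gateway and the Pods in AKS are integrated successfully.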

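Deploying the New AKS Cluster

Creating an AKS Cluster with the New Version

Create a new AKS cluster in the second AKS subnet. The existing cluster runs 1.19.11, the default version at the time; the new cluster uses 1.20.7, with all other parameters unchanged.

Declare a variable for the new AKS cluster's name: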

AKS_NEW=new

Get the ID of the subnet that will host the new cluster:

NEW_AKS_SUBNET_ID=$(az network vnet subnet show -g $RESOURCE_GROUP --vnet-name $VNET_NAME --name $NEW_AKS_SUBNET --query id -o tsv)

Create the new AKS cluster:

az aks create -n $AKS_NEW \
-g $RESOURCE_GROUP \
-l $AZ_REGION \
--generate-ssh-keys \
--network-plugin azure \
--enable-managed-identity \
--vnet-subnet-id $NEW_AKS_SUBNET_ID \
--kubernetes-version 1.20.7
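
Before creating the cluster, it can be worth confirming that the target version really is newer than the current one (and, with `az aks get-versions --location $AZ_REGION -o table`, that the region offers it). The following is a small sketch that relies on `sort -V` from GNU coreutils; `version_lt` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when version $1 is strictly older than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# The two versions used in this article.
if version_lt 1.19.11 1.20.7; then
  echo "1.20.7 is newer than 1.19.11"
fi
```

A plain string comparison would get cases such as 1.9 versus 1.10 wrong, which is why the sketch uses a version-aware sort.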

The new AKS cluster also uses Helm to install application-gateway-kubernetes-ingress.

Installing Pod Identity on the New AKS Cluster

Connect to the AKS cluster:

az aks get-credentials --resource-group $RESOURCE_GROUP --name $AKS_NEW

Install AAD Pod Identity:

kubectl create serviceaccount --namespace kube-system tiller-sa
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller-sa
helm repo add aad-pod-identity https://raw.githubusercontent.com/Azure/aad-pod-identity/master/charts
helm install aad-pod-identity aad-pod-identity/aad-pod-identity

Add the Helm repository for the Application Gateway Ingress Controller (AGIC itself is installed later, at switch-over time):

helm repo add application-gateway-kubernetes-ingress https://appgwingress.blob.core.windows.net/ingress-azure-helm-package/
helm repo update

Deploying the Application on the New AKS Cluster

First, install the same application on the new AKS cluster:

kubectl apply -f deployment_aspnet.yaml

Once the application is deployed, list the Pods:

kubectl get po -o=custom-columns=NAME:.metadata.name,\
podIP:.status.podIP,NODE:.spec.nodeName,\
READY-true:.status.containerStatuses[*].ready

NAME                                    podIP          NODE                                READY-true
aad-pod-identity-mic-644c7c9f6-cqkxr   10.241.0.25   aks-nodepool1-20247409-vmss000000   true
aad-pod-identity-mic-644c7c9f6-xpwlt   10.241.0.67   aks-nodepool1-20247409-vmss000002   true
aad-pod-identity-nmi-k2c8s             10.241.0.35   aks-nodepool1-20247409-vmss000001   true
aad-pod-identity-nmi-vqqzq             10.241.0.66   aks-nodepool1-20247409-vmss000002   true
aad-pod-identity-nmi-xvcxm             10.241.0.4    aks-nodepool1-20247409-vmss000000   true
aspnetapp-5844845bdc-82lcw             10.241.0.33   aks-nodepool1-20247409-vmss000000   true
aspnetapp-5844845bdc-hskvg             10.241.0.43   aks-nodepool1-20247409-vmss000001   true
aspnetapp-5844845bdc-qzt7f             10.241.0.84   aks-nodepool1-20247409-vmss000002   true

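In a real production workflow, do not attach the new deployment to the existing application gateway straight away. First log in remotely and test it through the internal IPs: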

kubectl run -it --rm aks-ssh --image=mcr.microsoft.com/aks/fundamental/base-ubuntu:v0.0.11

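Once the container starts, you are dropped straight into its shell. From there, access the three internal IPs listed above (10.241.0.33, 10.241.0.43 and 10.241.0.84), for example: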

root@aks-ssh:/# curl http://10.241.0.33
root@aks-ssh:/# curl http://10.241.0.43
root@aks-ssh:/# curl http://10.241.0.84

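All three return content normally. This stands in for the new environment having passed its tests, so the final step is to associate the new AKS cluster with the existing application gateway.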
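
The three manual `curl` calls can also be wrapped in a loop that stops at the first Pod that does not answer correctly. The following is a minimal sketch, meant to be run from a shell that can reach the Pod network (such as the `aks-ssh` container above); `http_status` and `check_pods` are hypothetical helper names.

```shell
# Hypothetical helper: print only the HTTP status code returned by a URL.
# curl prints 000 when the connection fails or times out.
http_status() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Hypothetical helper: check every IP passed as an argument and
# return non-zero as soon as one of them does not answer with 200.
check_pods() {
  for ip in "$@"; do
    code=$(http_status "http://$ip")
    echo "$ip -> $code"
    [ "$code" = "200" ] || return 1
  done
}

# Example usage from inside the test container (not run here):
#   check_pods 10.241.0.33 10.241.0.43 10.241.0.84 && echo "new cluster OK"
```

Checking the status code instead of reading the page body makes the result usable as a gate in a script: the switch-over only proceeds when the function succeeds.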

Switching Between AKS Cluster Versions

Switching the Application Gateway to the New AKS Cluster

Run the following command to install AGIC:

helm install agic application-gateway-kubernetes-ingress/ingress-azure -f helm_agic.yaml

Wait a few seconds, then run:

kubectl get po -o=custom-columns=NAME:.metadata.name,podIP:.status.podIP,NODE:.spec.nodeName,READY-true:.status.containerStatuses[*].ready
NAME                                    podIP          NODE                                READY-true
aad-pod-identity-mic-644c7c9f6-cqkxr   10.241.0.25   aks-nodepool1-20247409-vmss000000   true
aad-pod-identity-mic-644c7c9f6-xpwlt   10.241.0.67   aks-nodepool1-20247409-vmss000002   true
aad-pod-identity-nmi-k2c8s             10.241.0.35   aks-nodepool1-20247409-vmss000001   true
aad-pod-identity-nmi-vqqzq             10.241.0.66   aks-nodepool1-20247409-vmss000002   true
aad-pod-identity-nmi-xvcxm             10.241.0.4    aks-nodepool1-20247409-vmss000000   true
agic-ingress-azure-84967fc5b6-6x4dd    10.241.0.79   aks-nodepool1-20247409-vmss000002   true
aspnetapp-5844845bdc-82lcw             10.241.0.33   aks-nodepool1-20247409-vmss000000   true
aspnetapp-5844845bdc-hskvg             10.241.0.43   aks-nodepool1-20247409-vmss000001   true
aspnetapp-5844845bdc-qzt7f             10.241.0.84   aks-nodepool1-20247409-vmss000002   true

The agic-ingress-azure-* Pod is now up and running.

First, check from the command line that the application gateway's backend has been updated to the new Pods:

az network application-gateway show-backend-health \
-g $RESOURCE_GROUP \
-n $APP_GATEWAY \
--query 'backendAddressPools[].backendHttpSettingsCollection[].servers[][address,health]' \
-o tsv
10.241.0.33     Healthy
10.241.0.43     Healthy
10.241.0.84     Healthy
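
Because the text above suggests waiting a few seconds after installing AGIC, a scripted switch would typically confirm that every backend reports `Healthy` before moving on. The following is a minimal sketch that parses the TSV output shown above; `all_backends_healthy` is a hypothetical helper name.

```shell
# Hypothetical helper: read "address<TAB>health" lines on stdin and succeed
# only if there is at least one server and every server is Healthy.
all_backends_healthy() {
  awk 'NF >= 2 { total++; if ($2 != "Healthy") bad++ }
       END { if (total > 0 && bad == 0) exit 0; exit 1 }'
}

# Example usage (needs az; not run here):
#   az network application-gateway show-backend-health \
#     -g "$RESOURCE_GROUP" -n "$APP_GATEWAY" \
#     --query 'backendAddressPools[].backendHttpSettingsCollection[].servers[][address,health]' \
#     -o tsv | all_backends_healthy && echo "all backends healthy"
```

An empty backend pool is treated as a failure on purpose, since a gateway with no servers should not count as a successful switch.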

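Then go back to the browser and refresh the application gateway's public IP. The Host name and IP shown on the page have switched to the new backend:

[Screenshot: the sample ASP.NET page, now showing a host name and IP from the new cluster]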

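Rolling Back

If the new AKS cluster has a problem, we need to switch back to the old AKS cluster. To do that, simply return to the old cluster and reinstall AGIC so that the application gateway is associated with the application Pods in the old cluster again.

To do so, first run: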

az aks get-credentials --resource-group $RESOURCE_GROUP --name $AKS_OLD

Then run:

helm uninstall agic
helm install agic application-gateway-kubernetes-ingress/ingress-azure -f helm_agic.yaml
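
Since the same three commands are used for both switching and rolling back, they can be collected into one function. The following is a sketch under stated assumptions, not the author's script: `switch_agic_to` is a hypothetical name, `RESOURCE_GROUP` is expected to be set, and `helm_agic.yaml` is expected in the current directory as elsewhere in this article.

```shell
# Hypothetical helper: point the application gateway at the given AKS cluster
# by (re)installing AGIC there.
switch_agic_to() {
  cluster="${1:-}"
  [ -n "$cluster" ] || { echo "usage: switch_agic_to <cluster-name>" >&2; return 2; }
  # Fetch credentials for the target cluster, replacing any stale kubeconfig entry.
  az aks get-credentials --resource-group "$RESOURCE_GROUP" --name "$cluster" --overwrite-existing || return 1
  # Remove a previous AGIC release in this cluster, if any; ignore "not found".
  helm uninstall agic 2>/dev/null || true
  helm install agic application-gateway-kubernetes-ingress/ingress-azure -f helm_agic.yaml
}

# Example usage (not run here):
#   switch_agic_to "$AKS_OLD"   # roll back
#   switch_agic_to "$AKS_NEW"   # switch forward again
```

Keeping the cluster name as the only argument makes the roll-back path identical to the roll-forward path, which reduces the chance of a mistake under pressure.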

The AGIC Pod is up and running shortly afterwards:

kubectl get po -o wide
NAME                                    READY   STATUS    RESTARTS   AGE    IP            NODE                                NOMINATED NODE   READINESS GATES
aad-pod-identity-mic-787c5958fd-kmx9b   1/1     Running   0          2d1h   10.240.0.33   aks-nodepool1-94448771-vmss000000   <none>           <none>
aad-pod-identity-mic-787c5958fd-nkpv4   1/1     Running   1          2d1h   10.240.0.63   aks-nodepool1-94448771-vmss000001   <none>           <none>
aad-pod-identity-nmi-mhp86              1/1     Running   0          2d1h   10.240.0.4    aks-nodepool1-94448771-vmss000000   <none>           <none>
aad-pod-identity-nmi-sjpvw              1/1     Running   0          2d1h   10.240.0.35   aks-nodepool1-94448771-vmss000001   <none>           <none>
aad-pod-identity-nmi-xnfxh              1/1     Running   0          2d1h   10.240.0.66   aks-nodepool1-94448771-vmss000002   <none>           <none>
agic-ingress-azure-84967fc5b6-nwbh4     1/1     Running   0          8s     10.240.0.70   aks-nodepool1-94448771-vmss000002   <none>           <none>
aspnetapp-68784d6544-j99qg              1/1     Running   0          2d     10.240.0.75   aks-nodepool1-94448771-vmss000002   <none>           <none>
aspnetapp-68784d6544-v9449              1/1     Running   0          2d     10.240.0.13   aks-nodepool1-94448771-vmss000000   <none>           <none>
aspnetapp-68784d6544-ztbd9              1/1     Running   0          2d     10.240.0.50   aks-nodepool1-94448771-vmss000001   <none>           <none>

Check the application gateway backend again:

az network application-gateway show-backend-health \
 -g $RESOURCE_GROUP \
 -n $APP_GATEWAY \
 --query 'backendAddressPools[].backendHttpSettingsCollection[].servers[][address,health]' \
 -o tsv
10.240.0.13     Healthy
10.240.0.50     Healthy
10.240.0.75     Healthy

The backend of the same application gateway is back on the old AKS cluster's IPs.
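
To tell at a glance which cluster the gateway currently points to, the backend addresses can be classified by prefix. The following is a minimal sketch: `backend_cluster` is a hypothetical helper name, and the prefixes 10.240. (old cluster) and 10.241. (new cluster) are assumptions taken from the sample output in this article.

```shell
# Hypothetical helper: read "address<TAB>health" lines on stdin and print
# "old", "new", "mixed" or "none" depending on the address prefixes.
# The prefixes are assumptions based on the sample output in this article.
backend_cluster() {
  awk -v old="10.240." -v new="10.241." '
    index($1, old) == 1 { o++; next }
    index($1, new) == 1 { n++; next }
    NF > 0 { other++ }
    END {
      if (o > 0 && n == 0 && other == 0) print "old"
      else if (n > 0 && o == 0 && other == 0) print "new"
      else if (o + n + other == 0) print "none"
      else print "mixed"
    }'
}

# Example usage (needs az; not run here):
#   az network application-gateway show-backend-health \
#     -g "$RESOURCE_GROUP" -n "$APP_GATEWAY" \
#     --query 'backendAddressPools[].backendHttpSettingsCollection[].servers[][address,health]' \
#     -o tsv | backend_cluster
```

With different subnet ranges, only the two prefixes passed to awk need to change.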

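Testing Application Availability During the Switch

Let's use a continuous stream of HTTP requests to verify that the service is not interrupted during the switch.

Open another terminal window and run: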

# ts comes from the moreutils package and prefixes each line with a timestamp.
while true; do
  curl -s http://139.217.117.86/ | ts '[%Y-%m-%d %H:%M:%S]' | grep 10.24
  sleep 0.1
done

[2021-08-03 16:35:09] 10.240.0.13                        <br />
[2021-08-03 16:35:10] 10.240.0.50                        <br />
[2021-08-03 16:35:11] 10.240.0.13                        <br />
[2021-08-03 16:35:12] 10.240.0.75                        <br />
[2021-08-03 16:35:12] 10.240.0.50                        <br />
[2021-08-03 16:35:13] 10.240.0.13                        <br />
[2021-08-03 16:35:14] 10.240.0.75                        <br />

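The output cycles through the private IPs of the Pods in the old AKS cluster.

Go back to the window used for the AKS operations, switch to the new AKS cluster, and run the commands to uninstall and install AGIC again: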

az aks get-credentials --resource-group $RESOURCE_GROUP --name $AKS_NEW

Then run:

helm uninstall agic

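Watch the second window: the responses still come from the old AKS cluster's IPs, because the uninstall was performed only on the new cluster, while the application gateway and the old AKS cluster keep running normally.

Next, on the new AKS cluster, run: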

helm install agic application-gateway-kubernetes-ingress/ingress-azure -f helm_agic.yaml

In the second window, the output switches to the new cluster's IP addresses from a certain line onward, with no interruption:

[2021-08-03 16:42:08] 10.240.0.13                        <br />
[2021-08-03 16:42:09] 10.240.0.50                        <br />
[2021-08-03 16:42:09] 10.240.0.75                        <br />
[2021-08-03 16:42:10] 10.240.0.13                        <br />
[2021-08-03 16:42:11] 10.240.0.50                        <br />
[2021-08-03 16:42:11] 10.240.0.75                        <br />
[2021-08-03 16:42:12] 10.241.0.33                        <br />
[2021-08-03 16:42:13] 10.241.0.33                        <br />
[2021-08-03 16:42:13] 10.241.0.43                        <br />
[2021-08-03 16:42:15] 10.241.0.43                        <br />
[2021-08-03 16:42:15] 10.241.0.84                        <br />
[2021-08-03 16:42:16] 10.241.0.84                        <br />
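
When the log is long, the switch-over point can be located with a short awk summary instead of scrolling. The following is a minimal sketch that assumes the loop output was saved to a file; `summarize_switch` is a hypothetical helper name, and the 10.240. and 10.241. prefixes are assumptions taken from the sample output in this article.

```shell
# Hypothetical helper: summarise a log produced by the curl loop.
# Each input line looks like: [2021-08-03 16:42:12] 10.241.0.33   <br />
# Prefix 10.240. is counted as the old cluster and 10.241. as the new one.
summarize_switch() {
  awk '
    $3 ~ /^10\.240\./ { old++ }
    $3 ~ /^10\.241\./ { new++; if (first == "") first = $1 " " $2 }
    END { printf "old=%d new=%d first_new=%s\n", old, new, (first == "" ? "never" : first) }'
}

# Example usage (switch.log is a hypothetical file holding the loop output):
#   summarize_switch < switch.log
```

Because the loop filters its output with grep, a failed request produces no line at all, so an outage would show up as a gap between timestamps rather than as an error message.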

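This confirms that the service exposed by the application gateway stayed available throughout the switch. With this approach, both the old and the new AKS clusters can be kept side by side and traffic can be switched between them in real time.

Summary

The above used a typical web application to demonstrate how building a new AKS cluster and applying blue-green deployment allows a safe version upgrade.

Beyond web applications, other application types and scenarios can follow the same approach: perform the switch at the point where the AKS cluster integrates with its upstream, which gives real-time switching and rollback.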
