Getting a pod's CPU usage with Python and Kubernetes

Posted by ponponon on 2022-12-20
This tutorial uses the Kubernetes Python client SDK to get a pod's CPU usage, rather than the kubectl command!

kubernetes python client sdk

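Motivation

What am I trying to do?

I recently had a pod, a RabbitMQ consumer, that kept hanging, so I needed a way to detect that the pod had hung and then restart it.

An ordinary health check cannot detect this kind of hang.

Criterion: if the pod's CPU usage is below 20m (20 millicores), treat it as hung and delete it (after the pod is deleted, k8s creates a new one, assuming the pod is managed by a controller such as a Deployment).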

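Technical options

Option 1: shell + kubectl. I don't like shell, and I don't like parsing unstructured output, so this option was ruled out.

Option 2: Python + the Kubernetes SDK. I like Python, and this way I get structured data, such as JSON, that is easy to parse. Good.

So I went with option 2!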

Listing all pods in a namespace

First, we need to list all the pods in a namespace.

This is similar to kubectl get pod -n vddb

vddb is the name of the namespace.
from kubernetes import client, config
from kubernetes.client.models.v1_object_meta import V1ObjectMeta
from kubernetes.client.models.v1_pod import V1Pod
from kubernetes.client.models.v1_pod_list import V1PodList
from loguru import logger

# Load credentials from the local kubeconfig (by default ~/.kube/config)
config.load_kube_config()

v1 = client.CoreV1Api()

# Name of the namespace to inspect
namespaced_name = 'vddb'

pod_list: V1PodList = v1.list_namespaced_pod(namespaced_name)

for pod in pod_list.items:
    pod: V1Pod
    metadata: V1ObjectMeta = pod.metadata
    pod_name = metadata.name
    logger.info(pod_name)

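Getting a pod's metrics

Once we have the pod names, we fetch the metrics for each pod, such as its CPU and memory usage. The metrics API is only available when metrics-server (or an equivalent) is running in the cluster.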

import json

from kubernetes import client, config
from kubernetes.client import ApiClient
from kubernetes.client.models.v1_object_meta import V1ObjectMeta
from kubernetes.client.models.v1_pod import V1Pod
from kubernetes.client.models.v1_pod_list import V1PodList
from kubernetes.client.rest import ApiException, RESTResponse

config.load_kube_config()

v1 = client.CoreV1Api()
api_client = ApiClient()

namespaced_name = 'vddb'

pod_list: V1PodList = v1.list_namespaced_pod(namespaced_name)

for pod in pod_list.items:
    pod: V1Pod
    metadata: V1ObjectMeta = pod.metadata
    pod_name = metadata.name

    try:
        # Query the metrics API for this pod
        rest_response: RESTResponse = api_client.request(
            url=api_client.configuration.host +
            f'/apis/metrics.k8s.io/v1beta1/namespaces/{namespaced_name}/pods/{pod_name}',
            method='GET'
        )
    except ApiException as e:
        # A pod that has just started may have no metrics yet; skip it
        if e.status == 404:
            continue
        raise

    data: dict = json.loads(rest_response.data)

    # Assumes a single-container pod whose usage is reported in nanocores ('n' suffix)
    _cpu: str = data['containers'][0]['usage']['cpu']

    # nanocores -> millicores (str.removesuffix needs Python 3.9+)
    cpu = int(_cpu.removesuffix('n')) // 1_000_000
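As an alternative to building the URL by hand, the SDK's CustomObjectsApi can read the same metrics.k8s.io resource. The following is only a sketch of how that might be wrapped, not the approach used in this tutorial: fetch_pod_metrics is a hypothetical helper name, and the API object is passed in as a parameter so the function can be tried without a cluster.

```python
from typing import Any, Dict


def fetch_pod_metrics(custom_api: Any, namespace: str, pod_name: str) -> Dict[str, Any]:
    """Read one PodMetrics object through the generic custom-objects endpoint.

    custom_api is expected to be a kubernetes.client.CustomObjectsApi instance.
    fetch_pod_metrics itself is a hypothetical helper, not part of the SDK.
    """
    return custom_api.get_namespaced_custom_object(
        group="metrics.k8s.io",
        version="v1beta1",
        namespace=namespace,
        plural="pods",
        name=pod_name,
    )


# Typical wiring (needs a reachable cluster with metrics-server installed):
#   from kubernetes import client, config
#   config.load_kube_config()
#   metrics = fetch_pod_metrics(client.CustomObjectsApi(), "vddb", pod_name)
#   print(metrics["containers"][0]["usage"]["cpu"])
```

The call returns the same JSON document as the raw request, already parsed into a dict, and it goes through the client's normal authentication handling.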

The response body looks like this:

{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "svddb-servixxxxxxxxxxxxxxx4b4-bzs84",
    "namespace": "vddb",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/vddb/pods/svddbxxxxxxxxxxxxxxxx-bzs84",
    "creationTimestamp": "2022-12-16T14:40:46Z"
  },
  "timestamp": "2022-12-16T14:40:09Z",
  "window": "30s",
  "containers": [
    {
      "name": "svdxxxxxxxxrators",
      "usage": { "cpu": "2575748239n", "memory": "1257180Ki" }
    }
  ]
}
Note that containers here is a list: a pod with several containers has one entry per container.
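Because containers is a list and the cpu value is a Kubernetes quantity string, a slightly more defensive parser can help. The sketch below assumes the value carries an 'n', 'u' or 'm' suffix or no suffix at all (whole cores); cpu_to_cores and pod_cpu_millicores are hypothetical helper names, and other suffixes are deliberately not handled.

```python
from decimal import Decimal
from typing import Any, Dict

# Cores per unit for the decimal suffixes handled here (an assumption:
# nano, micro and milli are the only suffixes expected for CPU usage).
_CORES_PER_UNIT = {
    "n": Decimal("1e-9"),
    "u": Decimal("1e-6"),
    "m": Decimal("1e-3"),
}


def cpu_to_cores(quantity: str) -> Decimal:
    """Convert a CPU quantity such as '2575748239n', '250m' or '0' to cores."""
    suffix = quantity[-1:]
    if suffix in _CORES_PER_UNIT:
        return Decimal(quantity[:-1]) * _CORES_PER_UNIT[suffix]
    # No recognised suffix: treat the value as a plain number of cores.
    return Decimal(quantity)


def pod_cpu_millicores(pod_metrics: Dict[str, Any]) -> int:
    """Sum CPU usage over every container of a PodMetrics dict, in whole millicores."""
    total_cores = sum(
        (cpu_to_cores(c["usage"]["cpu"]) for c in pod_metrics["containers"]),
        Decimal(0),
    )
    # Truncate toward zero, matching the int(...) conversion used in this article.
    return int(total_cores * 1000)


sample = {"containers": [{"name": "app", "usage": {"cpu": "2575748239n", "memory": "1257180Ki"}}]}
print(pod_cpu_millicores(sample))  # 2575
```

Summing in cores before truncating means several containers that each use a fraction of a millicore are still counted together.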

Deleting the pod

import json

from kubernetes import client, config
from kubernetes.client import ApiClient
from kubernetes.client.models.v1_object_meta import V1ObjectMeta
from kubernetes.client.models.v1_pod import V1Pod
from kubernetes.client.models.v1_pod_list import V1PodList
from kubernetes.client.rest import ApiException, RESTResponse
from loguru import logger

config.load_kube_config()

v1 = client.CoreV1Api()
api_client = ApiClient()

namespaced_name = 'vddb'

pod_list: V1PodList = v1.list_namespaced_pod(namespaced_name)

for pod in pod_list.items:
    pod: V1Pod
    metadata: V1ObjectMeta = pod.metadata
    pod_name = metadata.name

    try:
        # Query the metrics API for this pod
        rest_response: RESTResponse = api_client.request(
            url=api_client.configuration.host +
            f'/apis/metrics.k8s.io/v1beta1/namespaces/{namespaced_name}/pods/{pod_name}',
            method='GET'
        )
    except ApiException as e:
        # A pod that has just started may have no metrics yet; skip it
        if e.status == 404:
            continue
        raise

    data: dict = json.loads(rest_response.data)

    # Assumes a single-container pod whose usage is reported in nanocores ('n' suffix)
    _cpu: str = data['containers'][0]['usage']['cpu']

    # nanocores -> millicores (str.removesuffix needs Python 3.9+)
    cpu = int(_cpu.removesuffix('n')) // 1_000_000

    # Only touch the consumer pods, and only when usage is below 20m
    if 'svddb-service-generators-server-prod' in pod_name and cpu < 20:
        logger.warning(f'{pod_name} is using {cpu}m CPU, deleting it')
        v1.delete_namespaced_pod(pod_name, namespaced_name)
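The decision of which pods count as hung can also be kept separate from the API calls, which makes it easy to dry-run before anything is deleted. This is a minimal sketch under that assumption; select_hung_pods and its parameters are hypothetical names, and the input is taken to be a mapping from pod name to CPU usage in millicores.

```python
from typing import Dict, List


def select_hung_pods(
    cpu_millicores_by_pod: Dict[str, int],
    name_fragment: str,
    threshold_millicores: int = 20,
) -> List[str]:
    """Return the names of matching pods whose CPU usage is below the threshold."""
    return sorted(
        name
        for name, cpu in cpu_millicores_by_pod.items()
        if name_fragment in name and cpu < threshold_millicores
    )


# Made-up usage figures for illustration only.
usage = {
    "svddb-service-generators-server-prod-a": 3,
    "svddb-service-generators-server-prod-b": 2575,
    "some-other-pod": 0,
}
print(select_hung_pods(usage, "svddb-service-generators-server-prod"))
# ['svddb-service-generators-server-prod-a']
```

Each returned name can then be passed to v1.delete_namespaced_pod(name, namespaced_name), or simply logged while the threshold is being tuned.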

References:
Get cpu and memory usage through in cluster config
Does the library support "kubectl top pod" api?
https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/
