21-pod-podLifecycle

Posted by cucytoman on 2019-09-30

concepts/workloads/pods/pod-lifecycle/

This page describes the lifecycle of a Pod:

Pod phase

A Pod’s status field is a PodStatus object, which has a phase field.

The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. The phase is not intended to be a comprehensive rollup of observations of Container or Pod state, nor is it intended to be a comprehensive state machine.

The number and meanings of Pod phase values are tightly guarded. Other than what is documented here, nothing should be assumed about Pods that have a given phase value.

Here are the possible values for phase:

  • Pending: The Pod has been accepted by the Kubernetes system, but one or more of the Container images has not been created. This includes time before being scheduled as well as time spent downloading images over the network, which could take a while.
  • Running: The Pod has been bound to a node, and all of the Containers have been created. At least one Container is still running, or is in the process of starting or restarting.
  • Succeeded: All Containers in the Pod have terminated in success, and will not be restarted.
  • Failed: All Containers in the Pod have terminated, and at least one Container has terminated in failure. That is, the Container either exited with non-zero status or was terminated by the system.
  • Unknown: For some reason the state of the Pod could not be obtained, typically due to an error in communicating with the host of the Pod.

Pod conditions

A Pod has a PodStatus, which has an array of PodConditions through which the Pod has or has not passed. Each element of the PodCondition array has six possible fields:

  • The lastProbeTime field provides a timestamp for when the Pod condition was last probed.
  • The lastTransitionTime field provides a timestamp for when the Pod last transitioned from one status to another.
  • The message field is a human-readable message indicating details about the transition.
  • The reason field is a unique, one-word, CamelCase reason for the condition’s last transition.
  • The status field is a string, with possible values “True”, “False”, and “Unknown”.
  • The type field is a string with the following possible values:
    • PodScheduled: the Pod has been scheduled to a node;
    • Ready: the Pod is able to serve requests and should be added to the load balancing pools of all matching Services;
    • Initialized: all init containers have completed successfully;
    • Unschedulable: the scheduler cannot schedule the Pod right now, for example due to lack of resources or other constraints;
    • ContainersReady: all containers in the Pod are ready.
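
As a rough illustration of how the conditions array might be read, the sketch below looks up one condition by type in a PodStatus that has been loaded as a plain dictionary (for example from the JSON printed by kubectl get pod -o json). The helper name get_condition and the sample data are invented for this example; only the field names come from the list above.

```python
from typing import Optional


def get_condition(pod_status: dict, condition_type: str) -> Optional[dict]:
    """Return the first condition whose type matches, or None if it is absent."""
    for condition in pod_status.get("conditions", []):
        if condition.get("type") == condition_type:
            return condition
    return None


# Sample data invented for illustration; keys follow the PodCondition fields.
sample_status = {
    "phase": "Running",
    "conditions": [
        {"type": "PodScheduled", "status": "True"},
        {"type": "Initialized", "status": "True"},
        {"type": "ContainersReady", "status": "False", "reason": "ContainersNotReady"},
        {"type": "Ready", "status": "False", "reason": "ContainersNotReady"},
    ],
}

ready = get_condition(sample_status, "Ready")
print(ready["status"], ready.get("reason"))
```

Note that status is a string, so a check such as ready["status"] == "True" is needed rather than a plain truthiness test.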

Container probes

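A Probe is a diagnostic performed periodically by the kubelet on a Container. To perform a diagnostic, the kubelet calls a Handler implemented by the Container. There are three types of handlers: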

  • ExecAction: Executes a specified command inside the Container. The diagnostic is considered successful if the command exits with a status code of 0.
  • TCPSocketAction: Performs a TCP check against the Container’s IP address on a specified port. The diagnostic is considered successful if the port is open.
  • HTTPGetAction: Performs an HTTP Get request against the Container’s IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.
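
The success criteria for the three handlers can be written down as small predicates. This is only a sketch of the rules stated above, not the kubelet's code; the function names are made up, and the TCP check simply tries to open a connection with Python's socket module.

```python
import socket


def exec_probe_succeeded(exit_code: int) -> bool:
    """ExecAction: the command must exit with status code 0."""
    return exit_code == 0


def http_probe_succeeded(status_code: int) -> bool:
    """HTTPGetAction: status code must be >= 200 and < 400."""
    return 200 <= status_code < 400


def tcp_probe_succeeded(host: str, port: int, timeout: float = 1.0) -> bool:
    """TCPSocketAction: the port counts as open if a connection can be made."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```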

Each probe has one of three results:

  • Success: The Container passed the diagnostic.
  • Failure: The Container failed the diagnostic.
  • Unknown: The diagnostic failed, so no action should be taken.

The kubelet can optionally perform and react to three kinds of probes on running Containers:

  • livenessProbe: Indicates whether the Container is running. If the liveness probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.
  • readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.
  • startupProbe: Indicates whether the application within the Container is started. All other probes are disabled if a startup probe is provided, until it succeeds. If the startup probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a startup probe, the default state is Success.
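
One way to keep the three probe kinds apart is to tabulate what happens on Failure. The mapping below is a simplified model of the descriptions above; the reaction function and its return strings are invented for illustration and do not correspond to any kubelet API.

```python
# What follows a Failure result, per probe kind (paraphrased from the text).
FAILURE_REACTIONS = {
    "livenessProbe": "kill container, then apply restart policy",
    "readinessProbe": "remove Pod IP from endpoints of matching Services",
    "startupProbe": "kill container, then apply restart policy",
}


def reaction(kind: str, result: str, configured: bool = True) -> str:
    """Describe the reaction to a probe result in this simplified model."""
    if kind not in FAILURE_REACTIONS:
        raise ValueError(f"unknown probe kind: {kind}")
    if not configured:
        # A probe that is not provided defaults to Success.
        result = "Success"
    if result == "Failure":
        return FAILURE_REACTIONS[kind]
    # Success needs no correction; Unknown means no action should be taken.
    return "no action"


print(reaction("readinessProbe", "Failure"))
```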

When should you use a liveness probe?

FEATURE STATE: Kubernetes v1.0 stable

If the process in your Container is able to crash on its own whenever it encounters an issue or becomes unhealthy, you do not necessarily need a liveness probe; the kubelet will automatically perform the correct action in accordance with the Pod’s restartPolicy.

If you’d like your Container to be killed and restarted if a probe fails, then specify a liveness probe, and specify a restartPolicy of Always or OnFailure.

When should you use a readiness probe?

FEATURE STATE: Kubernetes v1.0 stable

If you’d like to start sending traffic to a Pod only when a probe succeeds, specify a readiness probe. In this case, the readiness probe might be the same as the liveness probe, but the existence of the readiness probe in the spec means that the Pod will start without receiving any traffic and only start receiving traffic after the probe starts succeeding. If your Container needs to work on loading large data, configuration files, or migrations during startup, specify a readiness probe.

If you want your Container to be able to take itself down for maintenance, you can specify a readiness probe that checks an endpoint specific to readiness that is different from the liveness probe.

Note that if you just want to be able to drain requests when the Pod is deleted, you do not necessarily need a readiness probe; on deletion, the Pod automatically puts itself into an unready state regardless of whether the readiness probe exists. The Pod remains in the unready state while it waits for the Containers in the Pod to stop.

When should you use a startup probe?

FEATURE STATE: Kubernetes v1.16 alpha

If your Container usually takes longer than initialDelaySeconds + failureThreshold × periodSeconds to start, you should specify a startup probe that checks the same endpoint as the liveness probe. The default for periodSeconds is 10s. You should then set the startup probe's failureThreshold high enough to allow the Container to start, without changing the default values of the liveness probe. This helps to protect against deadlocks.
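
The arithmetic can be made concrete with a short calculation. The sketch below computes the window a liveness probe allows before the Container is killed, and the failureThreshold a startup probe would need to cover a slow start. The function names and the numbers passed in are illustrative assumptions, not defaults taken from any cluster.

```python
import math


def liveness_window_seconds(initial_delay_seconds: int,
                            failure_threshold: int,
                            period_seconds: int) -> int:
    """Time a Container has to start before liveness failures kill it."""
    return initial_delay_seconds + failure_threshold * period_seconds


def startup_failure_threshold(worst_case_startup_seconds: int,
                              period_seconds: int) -> int:
    """Smallest failureThreshold whose total probing time covers the startup."""
    return math.ceil(worst_case_startup_seconds / period_seconds)


# Illustrative numbers only.
window = liveness_window_seconds(initial_delay_seconds=0,
                                 failure_threshold=3,
                                 period_seconds=10)
needed = startup_failure_threshold(worst_case_startup_seconds=120,
                                   period_seconds=10)
print(window, needed)
```

With these example values the liveness window is 30 seconds, so an application that may need 120 seconds to start would want a startup probe with a failureThreshold of at least 12 at the same period.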

For more information about how to set up a liveness, readiness, or startup probe, see Configure Liveness, Readiness and Startup Probes.

Pod and Container status

For detailed information about Pod and Container status, see PodStatus and ContainerStatus. Note that the information reported as Pod status depends on the current ContainerState.

Container States

Once a Pod is assigned to a node by the scheduler, the kubelet starts creating containers using the container runtime. There are three possible states of containers: Waiting, Running, and Terminated. To check the state of a container, you can use kubectl describe pod [POD_NAME]. The state is displayed for each container within that Pod.

  • Waiting: Default state of a container. If a container is not in either the Running or Terminated state, it is in the Waiting state. A container in the Waiting state still runs its required operations, like pulling images, applying Secrets, etc. Along with this state, a message and reason about the state are displayed to provide more information.

    ...
      State:          Waiting
       Reason:       ErrImagePull
    ...
  • Running: Indicates that the container is executing without issues. Once a container enters Running, the postStart hook (if any) is executed. This state also displays the time when the container entered the Running state.

    ...
      State:          Running
       Started:      Wed, 30 Jan 2019 16:46:38 +0530
    ...
  • Terminated: Indicates that the container completed its execution and has stopped running. A container enters this state when it has successfully completed execution or when it has failed for some reason. Regardless, a reason and exit code are displayed, as well as the container’s start and finish time. Before a container enters Terminated, the preStop hook (if any) is executed.

    ...
      State:          Terminated
        Reason:       Completed
        Exit Code:    0
        Started:      Wed, 30 Jan 2019 11:45:26 +0530
        Finished:     Wed, 30 Jan 2019 11:45:26 +0530
    ...
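
The same information is available in machine-readable form. Assuming the Pod has been fetched as JSON (for example with kubectl get pod [POD_NAME] -o json), each entry of status.containerStatuses carries a state object with one of the keys waiting, running, or terminated. The helper below is a hypothetical sketch that turns such an entry into a one-line summary; the sample dictionaries are invented to mirror the output shown above.

```python
def summarize_state(container_status: dict) -> str:
    """Summarize one containerStatuses entry as Waiting, Running, or Terminated."""
    state = container_status.get("state") or {}
    if "running" in state:
        return f"Running (started {state['running'].get('startedAt')})"
    if "terminated" in state:
        terminated = state["terminated"]
        return (f"Terminated ({terminated.get('reason')}, "
                f"exit code {terminated.get('exitCode')})")
    # Anything that is neither Running nor Terminated is treated as Waiting.
    waiting = state.get("waiting", {})
    return f"Waiting ({waiting.get('reason')})"


# Sample entries invented for illustration.
waiting_example = {"name": "app", "state": {"waiting": {"reason": "ErrImagePull"}}}
terminated_example = {
    "name": "job",
    "state": {"terminated": {"reason": "Completed", "exitCode": 0}},
}

print(summarize_state(waiting_example))
print(summarize_state(terminated_example))
```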

Pod readiness gate

FEATURE STATE: Kubernetes v1.14 stable

To make Pod readiness extensible by allowing extra feedback or signals to be injected into PodStatus, Kubernetes 1.11 introduced a feature named Pod ready++. You can use the new field ReadinessGate in the PodSpec to specify additional conditions to be evaluated for Pod readiness. If Kubernetes cannot find such a condition in the status.conditions field of a Pod, the status of the condition defaults to “False”. Below is an example:

kind: Pod
...
spec:
  readinessGates:
    - conditionType: "www.example.com/feature-1"
status:
  conditions:
    - type: Ready  # this is a builtin PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
    - type: "www.example.com/feature-1"   # an extra PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
  containerStatuses:
    - containerID: docker://abcd...
      ready: true
...

The new Pod conditions must comply with the Kubernetes label key format. Since the kubectl patch command still doesn’t support patching object status, the new Pod conditions have to be injected through the PATCH action using one of the KubeClient libraries.

With the introduction of new Pod conditions, a Pod is evaluated to be ready only when both of the following statements are true:

  • All containers in the Pod are ready.
  • All conditions specified in ReadinessGates are “True”.

To facilitate this change to Pod readiness evaluation, a new Pod condition, ContainersReady, is introduced to capture the old Pod Ready condition.
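
The readiness rule can be sketched as a small function. This is a simplified model of the two statements above, not the kubelet's implementation: pod_is_ready is a made-up name, and the inputs are plain lists shaped like spec.readinessGates and status.conditions.

```python
def pod_is_ready(containers_ready: bool,
                 readiness_gates: list,
                 conditions: list) -> bool:
    """Ready only if all containers are ready and every gate condition is "True"."""
    status_by_type = {c["type"]: c["status"] for c in conditions}
    gates_pass = all(
        # A gate whose condition is missing from status counts as "False".
        status_by_type.get(gate["conditionType"], "False") == "True"
        for gate in readiness_gates
    )
    return containers_ready and gates_pass


gates = [{"conditionType": "www.example.com/feature-1"}]
conditions = [{"type": "www.example.com/feature-1", "status": "False"}]

# Containers are ready, but the extra condition is still "False".
print(pod_is_ready(True, gates, conditions))
```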

In K8s 1.11, as an alpha feature, the “Pod Ready++” feature has to be explicitly enabled by setting the PodReadinessGates feature gate to true.

In K8s 1.12, the feature is enabled by default.

Restart policy

A PodSpec has a restartPolicy field with possible values Always, OnFailure, and Never. The default value is Always. restartPolicy applies to all Containers in the Pod. restartPolicy only refers to restarts of the Containers by the kubelet on the same node. Exited Containers that are restarted by the kubelet are restarted with an exponential back-off delay (10s, 20s, 40s …) capped at five minutes; the delay is reset after ten minutes of successful execution. As discussed in the Pods document, once bound to a node, a Pod will never be rebound to another node.
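
The back-off sequence is easy to reproduce. The generator below is a minimal model of the doubling delay with a five-minute cap; it does not model the reset after ten minutes of successful execution, and the name restart_delays and its parameters are assumptions made for this sketch.

```python
from itertools import islice


def restart_delays(base_seconds: int = 10, cap_seconds: int = 300):
    """Yield successive restart delays: doubling from base, capped at cap."""
    delay = base_seconds
    while True:
        yield min(delay, cap_seconds)
        delay = min(delay * 2, cap_seconds)


# First seven delays in seconds.
first_delays = list(islice(restart_delays(), 7))
print(first_delays)
```

Under this model the cap is reached on the sixth restart, after which every further restart waits the full five minutes.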

Pod lifetime

In general, Pods do not disappear until someone destroys them. This might be a human or a controller. The only exception to this rule is that Pods with a phase of Succeeded or Failed for more than some duration (determined by terminated-pod-gc-threshold in the master) will expire and be automatically destroyed.

Three types of controllers are available:

  • Use a Job for Pods that are expected to terminate, for example, batch computations. Jobs are appropriate only for Pods with restartPolicy equal to OnFailure or Never.
  • Use a ReplicationController, ReplicaSet, or Deployment for Pods that are not expected to terminate, for example, web servers. ReplicationControllers are appropriate only for Pods with a restartPolicy of Always.
  • Use a DaemonSet for Pods that need to run one per machine, because they provide a machine-specific system service.

All three types of controllers contain a PodTemplate. It is recommended to create the appropriate controller and let it create Pods, rather than directly create Pods yourself. That is because Pods alone are not resilient to machine failures, but controllers are.

If a node dies or is disconnected from the rest of the cluster, Kubernetes applies a policy for setting the phase of all Pods on the lost node to Failed.

Examples

Advanced liveness probe example

Liveness probes are executed by the kubelet, so all requests are made in the kubelet network namespace.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - args:
    - /server
    image: k8s.gcr.io/liveness
    livenessProbe:
      httpGet:
        # when "host" is not defined, "PodIP" will be used
        # host: my-host
        # when "scheme" is not defined, "HTTP" scheme will be used. Only "HTTP" and "HTTPS" are allowed
        # scheme: HTTPS
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 15
      timeoutSeconds: 1
    name: liveness

Example states

  • Pod is running and has one Container. Container exits with success.
    • Log completion event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Pod phase becomes Succeeded.
      • Never: Pod phase becomes Succeeded.
  • Pod is running and has one Container. Container exits with failure.
    • Log failure event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Pod phase becomes Failed.
  • Pod is running and has two Containers. Container 1 exits with failure.
    • Log failure event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Do not restart Container; Pod phase stays Running.
    • If Container 1 is not running, and Container 2 exits:
      • Log failure event.
      • If restartPolicy is:
        • Always: Restart Container; Pod phase stays Running.
        • OnFailure: Restart Container; Pod phase stays Running.
        • Never: Pod phase becomes Failed.
  • Pod is running and has one Container. Container runs out of memory.
    • Container terminates in failure.
    • Log OOM event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Log failure event; Pod phase becomes Failed.
  • Pod is running, and a disk dies.
    • Kill all Containers.
    • Log appropriate event.
    • Pod phase becomes Failed.
    • If running under a controller, Pod is recreated elsewhere.
  • Pod is running, and its node is segmented out.
    • Node controller waits for timeout.
    • Node controller sets Pod phase to Failed.
    • If running under a controller, Pod is recreated elsewhere.
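
The container-exit cases in this list follow one pattern, which can be captured in a few lines. The function below is a simplified model written for these examples only: it covers container exits under the three restart policies and ignores node or disk failures. The name pod_phase and the convention that None means "still running" are assumptions of this sketch.

```python
def pod_phase(restart_policy: str, exit_codes: list) -> str:
    """Model the Pod phase from per-container exit codes (None = still running)."""

    def will_restart(code: int) -> bool:
        if restart_policy == "Always":
            return True
        if restart_policy == "OnFailure":
            return code != 0
        return False  # Never

    # The Pod stays Running while any container runs or is going to be restarted.
    if any(code is None or will_restart(code) for code in exit_codes):
        return "Running"
    # All containers have terminated for good.
    if all(code == 0 for code in exit_codes):
        return "Succeeded"
    return "Failed"


print(pod_phase("OnFailure", [0]))       # one Container, exits with success
print(pod_phase("Never", [1, None]))     # two Containers, Container 1 failed
print(pod_phase("Never", [1, 0]))        # both Containers have exited
```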


This work is licensed under a CC license; reposts must credit the author and link to this article.