In-depth analysis of kube-batch (4): actions

Posted by weixin_34292287 on 2018-10-18

Original URL: https://blog.csdn.net/weixin_34292287/article/details/86839970

Actions are where the actual scheduling happens. They run in the order reclaim -> allocate -> backfill -> preempt.

interface

// Action is the interface of scheduler action.
type Action interface {
   // The unique name of Action.
   Name() string

   // Initialize initializes the allocator plugins.
   Initialize()

   // Execute allocates the cluster's resources into each queue.
   Execute(ssn *Session)

   // UnInitialize un-initializes the allocator plugins.
   UnInitialize()
}

The part to focus on is each action's Execute implementation.
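To make the interface concrete, here is a minimal, self-contained sketch of a loop driving a list of actions in a fixed order. It is not kube-batch code: `Session` is an empty stand-in for `framework.Session`, and `recordingAction` and `runActions` are hypothetical names used only for illustration. The Initialize/UnInitialize lifecycle is left out of the loop for brevity.

```go
package main

import "fmt"

// Session is a stand-in for kube-batch's framework.Session (assumption:
// the real type carries jobs, nodes, queues and plugin callbacks).
type Session struct {
	Log []string // records which actions ran, in order
}

// Action mirrors the interface shown above.
type Action interface {
	Name() string
	Initialize()
	Execute(ssn *Session)
	UnInitialize()
}

// recordingAction is a hypothetical action that only records its name.
type recordingAction struct{ name string }

func (a *recordingAction) Name() string  { return a.name }
func (a *recordingAction) Initialize()   {}
func (a *recordingAction) UnInitialize() {}
func (a *recordingAction) Execute(ssn *Session) {
	ssn.Log = append(ssn.Log, a.name)
}

// runActions calls Execute on every action, in the order given,
// against the same session.
func runActions(ssn *Session, actions []Action) {
	for _, a := range actions {
		a.Execute(ssn)
	}
}

func main() {
	ssn := &Session{}
	runActions(ssn, []Action{
		&recordingAction{"reclaim"},
		&recordingAction{"allocate"},
		&recordingAction{"backfill"},
		&recordingAction{"preempt"},
	})
	fmt.Println(ssn.Log)
}
```

Because every action receives the same session, an earlier action (for example reclaim) can change the state that a later one (for example allocate) sees within the same cycle.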

reclaim

kube-batch\pkg\scheduler\actions\reclaim\reclaim.go

func (alloc *reclaimAction) Execute(ssn *framework.Session) {
   queues := util.NewPriorityQueue(ssn.QueueOrderFn)

   preemptorsMap := map[api.QueueID]*util.PriorityQueue{}
   preemptorTasks := map[api.JobID]*util.PriorityQueue{}

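   for _, job := range ssn.Jobs {
      // Abridged: the original first looks up the job's queue and pushes
      // each queue only once; that lookup is elided in this excerpt.
      queues.Push(queue)

      if len(job.TaskStatusIndex[api.Pending]) != 0 {
         // Create the per-queue job heap on first use so Push never hits a nil queue.
         if _, found := preemptorsMap[job.Queue]; !found {
            preemptorsMap[job.Queue] = util.NewPriorityQueue(ssn.JobOrderFn)
         }
         preemptorsMap[job.Queue].Push(job)

         // Pending tasks of this job, ordered by task priority.
         preemptorTasks[job.UID] = util.NewPriorityQueue(ssn.TaskOrderFn)
         for _, task := range job.TaskStatusIndex[api.Pending] {
            preemptorTasks[job.UID].Push(task)
         }
      }
   }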

Sort the queues by priority
Record the jobs that have pending tasks, and those pending tasks, as preemptors
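The two steps above rely on a priority queue ordered by a comparison function. The sketch below shows one way such a queue can be built on Go's `container/heap`, together with the "create on first use" pattern for the per-job map. `PriorityQueue`, `LessFn`, `higherFirst` and `groupPending` are stand-ins written for this example; the signature of the real `util.PriorityQueue` and of `ssn.QueueOrderFn` is assumed, not quoted.

```go
package main

import (
	"container/heap"
	"fmt"
)

// LessFn reports whether l should be popped before r (assumed shape of
// the ordering callbacks used in the excerpt).
type LessFn func(l, r interface{}) bool

// items adapts a slice plus a LessFn to heap.Interface.
type items struct {
	data []interface{}
	less LessFn
}

func (it *items) Len() int           { return len(it.data) }
func (it *items) Less(i, j int) bool { return it.less(it.data[i], it.data[j]) }
func (it *items) Swap(i, j int)      { it.data[i], it.data[j] = it.data[j], it.data[i] }
func (it *items) Push(x interface{}) { it.data = append(it.data, x) }
func (it *items) Pop() interface{} {
	last := len(it.data) - 1
	x := it.data[last]
	it.data = it.data[:last]
	return x
}

// PriorityQueue is a hypothetical stand-in for util.PriorityQueue.
type PriorityQueue struct{ it *items }

func NewPriorityQueue(less LessFn) *PriorityQueue {
	return &PriorityQueue{it: &items{less: less}}
}

func (q *PriorityQueue) Push(x interface{}) { heap.Push(q.it, x) }
func (q *PriorityQueue) Pop() interface{}   { return heap.Pop(q.it) }
func (q *PriorityQueue) Empty() bool        { return q.it.Len() == 0 }

// Task is a simplified pending task.
type Task struct {
	Job      string
	Name     string
	Priority int
}

// higherFirst orders tasks so that the highest priority is popped first.
func higherFirst(l, r interface{}) bool {
	return l.(*Task).Priority > r.(*Task).Priority
}

// groupPending builds one priority queue per job, creating each queue
// the first time a task of that job is seen.
func groupPending(tasks []*Task) map[string]*PriorityQueue {
	byJob := map[string]*PriorityQueue{}
	for _, t := range tasks {
		if _, found := byJob[t.Job]; !found {
			byJob[t.Job] = NewPriorityQueue(higherFirst)
		}
		byJob[t.Job].Push(t)
	}
	return byJob
}

func main() {
	byJob := groupPending([]*Task{
		{Job: "j1", Name: "low", Priority: 1},
		{Job: "j1", Name: "high", Priority: 5},
		{Job: "j1", Name: "mid", Priority: 3},
	})
	q := byJob["j1"]
	for !q.Empty() {
		fmt.Println(q.Pop().(*Task).Name) // high, then mid, then low
	}
}
```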

for {
   if queues.Empty() {
      break
   }

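   queue := queues.Pop().(*api.QueueInfo)

   // Abridged: take the highest-priority job of this queue,
   // skipping the queue when nothing is pending.
   jobs, found := preemptorsMap[queue.UID]
   if !found || jobs.Empty() {
      continue
   }
   job := jobs.Pop().(*api.JobInfo)

   // Then take the highest-priority pending task of that job.
   tasks, found := preemptorTasks[job.UID]
   if !found || tasks.Empty() {
      continue
   }
   task := tasks.Pop().(*api.TaskInfo)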

   resreq := task.Resreq.Clone()
   reclaimed := api.EmptyResource()

   assigned := false

   for _, n := range ssn.Nodes {
      if err := ssn.PredicateFn(task, n); err != nil {
         continue
      }

      var reclaimees []*api.TaskInfo
      for _, task := range n.Tasks {
         if task.Status != api.Running {
            continue
         }

         reclaimees = append(reclaimees, task.Clone())
      }
      victims := ssn.Reclaimable(task, reclaimees)

      if len(victims) == 0 {
         continue
      }

      // If not enough resource, continue
      allRes := api.EmptyResource()
      for _, v := range victims {
         allRes.Add(v.Resreq)
      }
      if allRes.Less(resreq) {
         continue
      }

      // Reclaim victims for tasks.
      for _, reclaimee := range victims {
         ssn.Evict(reclaimee, "reclaim")
         
         reclaimed.Add(reclaimee.Resreq)
         if resreq.LessEqual(reclaimee.Resreq) {
            break
         }
         resreq.Sub(reclaimee.Resreq)
      }       
      break
   }

}

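Pick the highest-priority queue, job, and task
Iterate over the nodes and run the predicate function first. Oddly, there is no PodFitsResources check; presumably kube-batch manages resources itself
Collect the pods currently running on the node
Select the victims
If the victims' total resources are less than what the pod requests, skip this node
Evict the victims by calling the delete API
Once enough resources have been freed, stop evicting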
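The victim-handling steps can be sketched in isolation as follows. This is a simplified model, not the kube-batch implementation: `Resource` has only two dimensions, `pickVictims` is a hypothetical helper, and it accumulates the freed total instead of subtracting from the remaining request as the excerpt does.

```go
package main

import "fmt"

// Resource is a simplified two-dimensional resource vector
// (assumption: the real api.Resource tracks more than this).
type Resource struct {
	MilliCPU float64
	Memory   float64
}

// Add accumulates o into r.
func (r *Resource) Add(o Resource) {
	r.MilliCPU += o.MilliCPU
	r.Memory += o.Memory
}

// LessEqual reports whether r fits within o in every dimension.
func (r Resource) LessEqual(o Resource) bool {
	return r.MilliCPU <= o.MilliCPU && r.Memory <= o.Memory
}

// Task is a running task that may be evicted.
type Task struct {
	Name string
	Req  Resource
}

// pickVictims returns the candidates that have to be evicted, in order,
// until the request fits. It returns nil when evicting every candidate
// would still not free enough resources.
func pickVictims(req Resource, candidates []Task) []Task {
	total := Resource{}
	for _, c := range candidates {
		total.Add(c.Req)
	}
	if !req.LessEqual(total) {
		return nil // not enough on this node; the caller tries the next one
	}

	freed := Resource{}
	var victims []Task
	for _, c := range candidates {
		victims = append(victims, c)
		freed.Add(c.Req)
		if req.LessEqual(freed) {
			break // enough has been freed; stop evicting
		}
	}
	return victims
}

func main() {
	running := []Task{
		{Name: "a", Req: Resource{MilliCPU: 1000, Memory: 2}},
		{Name: "b", Req: Resource{MilliCPU: 1000, Memory: 2}},
		{Name: "c", Req: Resource{MilliCPU: 1000, Memory: 2}},
	}
	for _, v := range pickVictims(Resource{MilliCPU: 2000, Memory: 4}, running) {
		fmt.Println("evict", v.Name)
	}
}
```

In this example only the first two tasks are evicted, because together they already cover the request.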

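I have not yet seen reclaim trigger in practice, and I do not fully understand the reclaim functions. I doubt reclaim is necessary: eviction logic arguably does not belong here, since kubelet already has its own eviction logic. Reclaim should also not run on every cycle; it should first check whether nodes are actually short of resources. I will remove the reclaim action from my configuration, which should also improve performance.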
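Dropping reclaim amounts to leaving its name out of the configured action list. The sketch below assumes the list is given as a comma-separated string; the exact configuration format depends on the kube-batch version and is not shown in this article. `parseActions` and the `known` set are hypothetical and only cover the four action names mentioned here.

```go
package main

import (
	"fmt"
	"strings"
)

// known lists the action names mentioned in this article.
var known = map[string]bool{
	"reclaim":  true,
	"allocate": true,
	"backfill": true,
	"preempt":  true,
}

// parseActions splits a comma-separated list, trims whitespace, and
// rejects any name that is not a known action. Order is preserved.
func parseActions(conf string) ([]string, error) {
	var out []string
	for _, name := range strings.Split(conf, ",") {
		name = strings.TrimSpace(name)
		if name == "" {
			continue
		}
		if !known[name] {
			return nil, fmt.Errorf("unknown action %q", name)
		}
		out = append(out, name)
	}
	return out, nil
}

func main() {
	// The same order as before, with reclaim left out.
	actions, err := parseActions("allocate, backfill, preempt")
	fmt.Println(actions, err)
}
```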

allocate

        for !tasks.Empty() {
            task := tasks.Pop().(*api.TaskInfo)

            for _, node := range ssn.Nodes {
                if err := ssn.PredicateFn(task, node); err != nil {
                    continue
                }

                // Allocate idle resource to the task.
                if task.Resreq.LessEqual(node.Idle) {
                    ssn.Allocate(task, node.Name)
                    break
                }
            }
        }

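Only the core of the allocation process is shown here:

Run the predicate function
Compare the pod's resource request with the node's idle resources
bind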
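These three steps amount to a first-fit placement. The following self-contained sketch models them with simplified types; `predicate` is a stand-in for `ssn.PredicateFn` that only checks a label selector, and the `Selector` field and the `allocate` helper are invented for this example.

```go
package main

import "fmt"

// Resource is a simplified two-dimensional resource vector.
type Resource struct {
	MilliCPU float64
	Memory   float64
}

// LessEqual reports whether r fits within o in every dimension.
func (r Resource) LessEqual(o Resource) bool {
	return r.MilliCPU <= o.MilliCPU && r.Memory <= o.Memory
}

// Sub removes o from r.
func (r *Resource) Sub(o Resource) {
	r.MilliCPU -= o.MilliCPU
	r.Memory -= o.Memory
}

type Node struct {
	Name   string
	Idle   Resource
	Labels map[string]string
}

type Task struct {
	Name     string
	Req      Resource
	Selector map[string]string // hypothetical hard constraint
}

// predicate returns an error when the node's labels do not satisfy the
// task's selector.
func predicate(t *Task, n *Node) error {
	for k, v := range t.Selector {
		if n.Labels[k] != v {
			return fmt.Errorf("node %s does not match %s=%s", n.Name, k, v)
		}
	}
	return nil
}

// allocate places the task on the first node that passes the predicate
// and has enough idle resources. It returns the node name, or "" when
// no node fits.
func allocate(t *Task, nodes []*Node) string {
	for _, n := range nodes {
		if err := predicate(t, n); err != nil {
			continue
		}
		if t.Req.LessEqual(n.Idle) {
			n.Idle.Sub(t.Req) // account for the placement
			return n.Name
		}
	}
	return ""
}

func main() {
	nodes := []*Node{
		{Name: "n1", Idle: Resource{MilliCPU: 500, Memory: 1}},
		{Name: "n2", Idle: Resource{MilliCPU: 2000, Memory: 4}},
	}
	task := &Task{Name: "t1", Req: Resource{MilliCPU: 1000, Memory: 2}}
	fmt.Println(allocate(task, nodes))
}
```

Note that the first node that fits wins; no node is ranked against another.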

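This resolves the earlier question: the predicates leave out PodFitsResources because an equivalent check is implemented here. It raises a new question, though: if a node satisfies the pod's requirements, is the pod bound to it right away? Is there no priority (scoring) phase? What happens to soft affinity then?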
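To show what is at stake in that question, here is a hypothetical comparison of first-fit placement with a filter-then-score placement, in which a soft preference only affects the ranking among feasible nodes. None of this is kube-batch code; `PreferredZone`, `score`, `firstFit` and `highestScore` are invented for the illustration.

```go
package main

import "fmt"

type Node struct {
	Name    string
	IdleCPU int
	Zone    string
}

type Task struct {
	Name          string
	CPU           int
	PreferredZone string // soft preference, hypothetical
}

// feasible is the hard filter: the node must have enough idle CPU.
func feasible(t Task, n Node) bool { return t.CPU <= n.IdleCPU }

// score ranks a feasible node; matching the preferred zone earns points
// but is never required.
func score(t Task, n Node) int {
	if t.PreferredZone != "" && n.Zone == t.PreferredZone {
		return 10
	}
	return 0
}

// firstFit returns the first feasible node and ignores preferences.
func firstFit(t Task, nodes []Node) string {
	for _, n := range nodes {
		if feasible(t, n) {
			return n.Name
		}
	}
	return ""
}

// highestScore returns the feasible node with the highest score;
// ties go to the earlier node.
func highestScore(t Task, nodes []Node) string {
	best, top := "", -1
	for _, n := range nodes {
		if !feasible(t, n) {
			continue
		}
		if s := score(t, n); s > top {
			best, top = n.Name, s
		}
	}
	return best
}

func main() {
	nodes := []Node{
		{Name: "a", IdleCPU: 4, Zone: "x"},
		{Name: "b", IdleCPU: 4, Zone: "y"},
	}
	task := Task{Name: "t", CPU: 2, PreferredZone: "y"}
	fmt.Println(firstFit(task, nodes), highestScore(task, nodes))
}
```

With first-fit the task lands on node a even though node b matches its preference, which is the soft-affinity concern raised above.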

backfill

func (alloc *backfillAction) Execute(ssn *framework.Session) {

   for _, job := range ssn.Jobs {
      for _, task := range job.TaskStatusIndex[api.Pending] {
         
         if task.Resreq.IsEmpty() {
            for _, node := range ssn.Nodes {
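               // Skip nodes that fail the predicates instead of ignoring the result.
               if err := ssn.PredicateFn(task, node); err != nil {
                  continue
               }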

               ssn.Allocate(task, node.Name)
               break
            }
         }
      }
   }
}

backfill is an action that was added later to handle BestEffort pods (see the related issue).
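The backfill idea can be sketched as follows: tasks that request no resources are placed on the first node that passes the predicates, since there is nothing to compare against idle capacity. The types are simplified, `predicate` is a stand-in that only checks a hypothetical `Unschedulable` flag, and `IsEmpty` here tests for exact zero, which may differ from the real implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Resource is a simplified two-dimensional resource vector.
type Resource struct {
	MilliCPU float64
	Memory   float64
}

// IsEmpty reports whether nothing is requested, which is how a
// BestEffort task is recognised in this sketch.
func (r Resource) IsEmpty() bool { return r.MilliCPU == 0 && r.Memory == 0 }

type Task struct {
	Name string
	Req  Resource
}

type Node struct {
	Name          string
	Unschedulable bool // hypothetical reason for a predicate to fail
}

// predicate rejects nodes that are marked unschedulable.
func predicate(t Task, n Node) error {
	if n.Unschedulable {
		return errors.New("node is unschedulable")
	}
	return nil
}

// backfill maps each BestEffort task to the first node that passes the
// predicate. Tasks with a resource request are left to allocate.
func backfill(pending []Task, nodes []Node) map[string]string {
	placed := map[string]string{}
	for _, t := range pending {
		if !t.Req.IsEmpty() {
			continue
		}
		for _, n := range nodes {
			if err := predicate(t, n); err != nil {
				continue
			}
			placed[t.Name] = n.Name
			break
		}
	}
	return placed
}

func main() {
	pending := []Task{
		{Name: "best-effort"},
		{Name: "normal", Req: Resource{MilliCPU: 100}},
	}
	nodes := []Node{
		{Name: "n1", Unschedulable: true},
		{Name: "n2"},
	}
	fmt.Println(backfill(pending, nodes))
}
```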

preempt

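I still do not fully understand the preemption logic, so I will skip it for now and cover it in a separate article. Note, though, that preemption in kube-batch differs somewhat from preemption in K8S.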

Summary

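Studying Ma Da's kube-batch opened up a new line of thinking for me. For my project, however, it may be somewhat heavyweight: I have little need for the reclaim and preemption logic. It also does not appear to support a priority (scoring) phase, and without scoring there is no soft affinity, which is a fatal problem for me. So the plan is to write my own scheduler later, drawing on both default-scheduler and kube-batch.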
