Implement Leader Election Algorithm With Go

Posted by suqinglee on 2021-10-04

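Preface: I have just finished 6.824 Lab 2A, so I am writing this post to organize what I learned. You should first read Section 5.2 of the Raft paper. If the English is hard to follow, you can read this translation; if the Chinese translation still does not make it clear, you can watch this video; and if the video does not help either, you can go straight to the code below. Sometimes reading the code directly is the easier way to understand.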

Besides the paper, you need some knowledge of Go, or at least a background in another programming language. You should also know what RPC is.

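This post steps outside the context of the MIT 6.824 course and of the Raft algorithm, and covers only the implementation of Leader Election. If you do not see what Leader Election is used for, read up on disaster recovery and primary/backup setups first.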

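The code in this post shows only the skeleton. For functions or fields that are not defined here, the name or the comment should tell you what they mean.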

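The code in this post does not show any concurrency synchronization. A simple approach is to guard each function with one coarse-grained lock; you can also choose other synchronization mechanisms to get better performance.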
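The coarse-grained locking mentioned above could look roughly like the sketch below. It is only an illustration: the trimmed-down `Server` and the helper methods `SetRole`, `BumpTerm`, and `Snapshot` are hypothetical names that do not appear in the rest of this post.

```go
package main

import (
	"fmt"
	"sync"
)

// Server is a trimmed-down, hypothetical version of the node in this post.
// Only the fields needed to show the locking pattern are kept.
type Server struct {
	mu   sync.Mutex // one coarse-grained lock guarding every field below
	role string
	term int
}

// SetRole holds the lock for the whole function body.
func (srv *Server) SetRole(role string) {
	srv.mu.Lock()
	defer srv.mu.Unlock()
	srv.role = role
}

// BumpTerm increments the term under the same lock and returns the new value.
func (srv *Server) BumpTerm() int {
	srv.mu.Lock()
	defer srv.mu.Unlock()
	srv.term++
	return srv.term
}

// Snapshot returns a consistent copy of role and term.
func (srv *Server) Snapshot() (string, int) {
	srv.mu.Lock()
	defer srv.mu.Unlock()
	return srv.role, srv.term
}

func main() {
	srv := &Server{role: "Follower"}

	// 100 goroutines bump the term concurrently; the lock keeps the count exact.
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			srv.BumpTerm()
		}()
	}
	wg.Wait()

	srv.SetRole("Candidate")
	role, term := srv.Snapshot()
	fmt.Println(role, term) // prints "Candidate 100"
}
```

One thing to watch with this pattern: if a function holds the lock while it waits on an RPC to a peer, every other handler on the node is blocked for the duration of that call, so it is usually worth releasing the lock around the network call.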

// Server is the current node.
type Server struct {
    // Leader | Candidate | Follower
    role    string
    // Logical clock (the term)
    term    int
    // ID of the node this server currently supports; -1 means none
    support int
    // All nodes, including the current one
    peers   []*rpcClient
    // ID of the current node (its index in peers)
    me      int
}

// Server's main loop
func (srv *Server) ticker() {}

// RPC handler | called by a Candidate to ask this node for its vote
func (srv *Server) Canvass(args *cArgs, reply *cReply) {}

// RPC handler | called by the Leader to assert its leadership (heartbeat)
func (srv *Server) Dominate(args *dArgs, reply *dReply) {}

func main() {
    srv := MakeServer("Follower")
    go srv.ticker()
    // Block forever; otherwise main returns and the ticker goroutine dies with it
    select {}
}

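The code above shows the implementation of a single node in a distributed system. At any moment a node has exactly one role (Leader, Candidate, or Follower), where Candidate is an intermediate role on the way from Follower to Leader.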

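Initially every node is a Follower. When a Follower has not heard from a Leader for a certain period (the Election Timeout), it becomes a Candidate and canvasses the other nodes for votes. Once it has enough votes (more than half), the Candidate becomes the Leader.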

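The Election Timeout should be randomized so that multiple Candidates are unlikely to appear at the same time; otherwise none of them may collect enough votes, and another round of election is needed.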
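A randomized timeout in the 300 to 400 ms range can be drawn as in the sketch below. The function name `randomElectionTimeout` and the two constants are assumptions made for this illustration; the listings in this post only reference an undefined `srv.electionTimeout()`.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Bounds taken from the "300 ~ 400ms" range used in this post.
const (
	minElectionTimeout = 300 * time.Millisecond
	maxElectionTimeout = 400 * time.Millisecond
)

// randomElectionTimeout returns a duration in [300ms, 400ms).
// Hypothetical helper, not part of the original listings.
func randomElectionTimeout() time.Duration {
	span := int64(maxElectionTimeout - minElectionTimeout)
	return minElectionTimeout + time.Duration(rand.Int63n(span))
}

func main() {
	// Each node would draw its own value, so timeouts rarely coincide.
	for i := 0; i < 3; i++ {
		fmt.Println(randomElectionTimeout())
	}
}
```

If the timeout is checked on every tick of a polling loop, it is probably better to draw the value once each time the timer is reset and store it, rather than drawing a fresh value on every check; otherwise the effective timeout drifts toward the lower bound.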

func (srv *Server) ticker() {
    for srv.alive() {
        time.Sleep(1 * time.Millisecond)
        if srv.role == "Leader" {
            // 10 times per second
            if time.Since(srv.lastBroadcastHeartbeat) > srv.broadcastInterval() {
                srv.broadcast()
            }
        } else {
            // random election timeout about 300 ~ 400ms
            if time.Since(srv.lastReceiveHeartbeat) > srv.electionTimeout() {
                srv.elect()
            }
        }
    }
}

func (srv *Server) broadcast() {
    srv.lastBroadcastHeartbeat = time.Now()
    for id, peer := range srv.peers {
        if id == srv.me {
            continue
        }
        reply := dReply{}
        peer.Call("Server.Dominate", &dArgs{}, &reply)
    }
}

func (srv *Server) Dominate(args *dArgs, reply *dReply) {
    srv.lastReceiveHeartbeat = time.Now()
}

func (srv *Server) elect() {
    // Reset the timer so that we do not start another election right away
    srv.lastReceiveHeartbeat = time.Now()
    srv.role = "Candidate"

    voteCount := 1    // vote for me
    for id, peer := range srv.peers {
        if id == srv.me {
            continue
        }
        reply := cReply{}
        peer.Call("Server.Canvass", &cArgs{CandidateId: srv.me}, &reply)
        if reply.VoteGranted {
            voteCount++
        }
    }

    if voteCount > len(srv.peers)/2 {
        srv.role = "Leader"
        srv.lastBroadcastHeartbeat = time.Unix(0, 0)
    }
}

func (srv *Server) Canvass(args *cArgs, reply *cReply) {
    reply.VoteGranted = false
    // only have one vote
    if srv.support == -1 {
        srv.support = args.CandidateId
        reply.VoteGranted = true
        srv.lastReceiveHeartbeat = time.Now()
    }
}

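You have probably noticed that Server.term is not used. I deliberately removed the term-related code so that the basic framework of the Leader Election algorithm is easier to understand.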

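The system as a whole should have exactly one Leader, but there are exceptions. When a network partition occurs, each partition acts as its own system: the partition holding more than half of the nodes elects a new Leader, while the old Leader lives on in its own partition. When the partition heals, two Leaders exist at the same time.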
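To make the partition scenario concrete, the sketch below checks which side of a split can still elect a Leader, using the same "more than half" rule as the vote count in `elect()`. The helper `hasQuorum` and the 5-node cluster are assumptions made for this example.

```go
package main

import "fmt"

// hasQuorum reports whether a partition that can reach `reachable` nodes
// (itself included) out of a `clusterSize`-node cluster can collect a
// strict majority of votes. Hypothetical helper for illustration.
func hasQuorum(reachable, clusterSize int) bool {
	return reachable > clusterSize/2
}

func main() {
	const clusterSize = 5

	// The old Leader is cut off together with one Follower.
	fmt.Println(hasQuorum(2, clusterSize)) // prints "false"

	// The other three nodes can still elect a new Leader.
	fmt.Println(hasQuorum(3, clusterSize)) // prints "true"
}
```

The minority side cannot elect anyone new, but nothing in it tells the old Leader to step down, which is why it keeps its role until the partition heals.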

Terms solve this problem. The new Leader's term is larger than the old Leader's, so when the two Leaders meet, the one with the smaller term simply converts to Follower.
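The rule for two Leaders meeting can be isolated into a few lines. The sketch below uses a hypothetical `node` type and `observeTerm` method that are not part of this post's listings; they only demonstrate the comparison.

```go
package main

import "fmt"

// node keeps only the two pieces of state that matter for this rule.
// Hypothetical type for illustration.
type node struct {
	role string
	term int
}

// observeTerm applies the rule "whoever sees a larger term adopts it and
// becomes a Follower". It reports whether the node stepped down.
func (n *node) observeTerm(remoteTerm int) bool {
	if remoteTerm > n.term {
		n.term = remoteTerm
		n.role = "Follower"
		return true
	}
	return false
}

func main() {
	oldLeader := &node{role: "Leader", term: 3}
	newLeader := &node{role: "Leader", term: 4}

	// Capture both terms first, as if the two messages crossed on the wire.
	oldTerm, newTerm := oldLeader.term, newLeader.term
	oldLeader.observeTerm(newTerm)
	newLeader.observeTerm(oldTerm)

	fmt.Println(oldLeader.role, oldLeader.term) // prints "Follower 4"
	fmt.Println(newLeader.role, newLeader.term) // prints "Leader 4"
}
```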

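A term is a logical clock, represented here by a monotonically increasing integer. The term is incremented only when the Leader is considered dead, that is, when a node's election timeout fires: every Candidate takes a new term (the term is created), and after taking office the winner propagates the new term to all nodes (the term takes effect).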

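While a term is being created, that is, during the election phase, each node, whatever its role, can support only one Candidate. Once the term takes effect, that is, once the Leader takes office, support is reset to -1 on every node.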

func (srv *Server) broadcast() {
    srv.lastBroadcastHeartbeat = time.Now()
    for id, peer := range srv.peers {
        if id == srv.me {
            continue
        }
        reply := dReply{}
        peer.Call("Server.Dominate", &dArgs{Term: srv.term}, &reply)
        if reply.Term > srv.term {
            // A newer term exists, so this node is a stale Leader: step down
            srv.role = "Follower"
            srv.term = reply.Term
            srv.support = -1
            // Stop broadcasting; only a Leader may send heartbeats
            return
        }
    }
}

func (srv *Server) Dominate(args *dArgs, reply *dReply) {
    reply.Term = srv.term
    if args.Term < srv.term {
        return    // ignore old Leader
    }
    // A new Leader appeared, update self state
    if args.Term > srv.term {
        srv.term = args.Term
        srv.role = "Follower"
        srv.support = -1
    }
    srv.lastReceiveHeartbeat = time.Now()
}

func (srv *Server) elect() {
    // Reset the timer so that we do not start another election right away
    srv.lastReceiveHeartbeat = time.Now()
    srv.role = "Candidate"
    // The old Leader is presumed dead, so start a new term
    srv.term++
    // Vote for self
    srv.support = srv.me
    voteCount := 1

    maxTerm := 0
    for id, peer := range srv.peers {
        if id == srv.me {
            continue
        }
        reply := cReply{}
        peer.Call("Server.Canvass", &cArgs{Term: srv.term, CandidateId: srv.me}, &reply)
        if reply.VoteGranted {
            voteCount++
        }
        if reply.Term > maxTerm {
            maxTerm = reply.Term
        }
    }

    // The role may have changed to Follower while canvassing,
    // which means another Leader has appeared
    if srv.role != "Candidate" {
        return
    }
    // A reply carried a larger term: step down and adopt it
    if maxTerm > srv.term {
        srv.role = "Follower"
        srv.term = maxTerm
        srv.support = -1
        return
    }
    if voteCount > len(srv.peers)/2 {
        srv.role = "Leader"
        srv.support = -1
        srv.lastBroadcastHeartbeat = time.Unix(0, 0)
    }
}

func (srv *Server) Canvass(args *cArgs, reply *cReply) {
    reply.VoteGranted = false
    reply.Term = srv.term

    if args.Term < srv.term {
        return    // ignore old Candidate
    }
    // A newer term appeared (a new election is under way), update own state
    if args.Term > srv.term {
        srv.term = args.Term
        srv.role = "Follower"
        srv.support = -1
    }
    // only have one vote
    if srv.support == -1 {
        srv.support = args.CandidateId
        reply.VoteGranted = true
        srv.lastReceiveHeartbeat = time.Now()
    }
}

Term is the core concept of Leader Election. Term and Role always appear together, and the algorithm as a whole runs on transitions of these two pieces of state. I hope this post helps.

This work is licensed under a CC license. Reposts must credit the author and link to this article.
