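In the previous post, 《當我們談論鎖,我們談什麼》 ("What We Talk About When We Talk About Locks"), I mentioned that a lock, or more precisely a semaphore (a mutex is one kind of semaphore), can be implemented in two ways: wait either busy-waits or blocks the caller.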
```
// Busy waiting
wait(S) {
    while (S <= 0)
        ;   // no-op
    S--;
}

// Blocking
wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block();
    }
}
```
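Busy waiting and blocking each have their pros and cons:
- Busy waiting makes the CPU spin without doing useful work. The upside is that if another process releases the lock within the current time slice, the waiting process gets the lock immediately, with no process scheduling needed. It suits locks that are held only briefly, and it is a poor fit for single-processor machines.
- Blocking does not waste CPU cycles, but switching processes has its own cost, such as the context switch itself and CPU cache misses.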
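To make the two strategies concrete, here is a minimal Go sketch. Neither type is how the Go standard library implements locking: `spinLock`, `blockingLock`, and `countWith` are names I made up for illustration. The spin lock busy-waits on a CAS, while the blocking lock parks the goroutine on a one-slot channel.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// spinLock is a hypothetical busy-waiting lock: 0 = free, 1 = held.
type spinLock struct{ state int32 }

func (l *spinLock) Lock() {
	for !atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		runtime.Gosched() // yield so the sketch also makes progress on one CPU
	}
}

func (l *spinLock) Unlock() { atomic.StoreInt32(&l.state, 0) }

// blockingLock is a hypothetical blocking lock built on a one-slot channel.
type blockingLock struct{ ch chan struct{} }

func newBlockingLock() *blockingLock { return &blockingLock{ch: make(chan struct{}, 1)} }

func (l *blockingLock) Lock()   { l.ch <- struct{}{} } // parks the goroutine while the slot is full
func (l *blockingLock) Unlock() { <-l.ch }

// countWith increments a shared counter from several goroutines under l.
func countWith(l sync.Locker, goroutines, perGoroutine int) int {
	var wg sync.WaitGroup
	n := 0
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				l.Lock()
				n++
				l.Unlock()
			}
		}()
	}
	wg.Wait()
	return n
}

func main() {
	fmt.Println(countWith(&spinLock{}, 4, 1000))       // 4000
	fmt.Println(countWith(newBlockingLock(), 4, 1000)) // 4000
}
```

Both variants protect the counter correctly. They differ only in what a waiter does while the lock is held, which is exactly the trade-off listed above.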
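Now let's look at how the Go source code implements locks. Locks in Go have two properties:
1. They are not reentrant (nested locking is not supported)
2. One goroutine can lock and a different goroutine can unlock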
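The second property is easy to try out. The sketch below uses only `sync.Mutex` from the standard library; `handOff` is a helper name I chose for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// handOff locks mu in the calling goroutine and lets a different goroutine
// unlock it. It returns true once the caller has re-acquired the mutex.
func handOff() bool {
	var mu sync.Mutex
	mu.Lock()
	go func() {
		mu.Unlock() // legal: a Mutex is not tied to the goroutine that locked it
	}()
	mu.Lock() // blocks until the other goroutine has unlocked
	mu.Unlock()
	return true
}

func main() {
	fmt.Println(handOff()) // true
	// Not reentrant: calling mu.Lock() twice in a row from one goroutine,
	// with nobody else to unlock, would block forever.
}
```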
Mutex
The Go mutex is defined in src/sync/mutex.go:
```go
// A Mutex is a mutual exclusion lock.
// Mutexes can be created as part of other structures;
// the zero value for a Mutex is an unlocked mutex.
//
// A Mutex must not be copied after first use.
type Mutex struct {
	state int32
	sema  uint32
}

const (
	mutexLocked = 1 << iota // mutex is locked
	mutexWoken
	mutexWaiterShift = iota
)
```
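This also looks like a semaphore-based implementation. sema is the semaphore, a non-negative number; state holds the status of the Mutex. mutexLocked (= 1) is the bit that says whether the lock is held (0 means free, 1 means held by some goroutine). mutexWoken (= 2) is the bit that says whether a waiting goroutine has already been woken. mutexWaiterShift (= 2) is how far state must be shifted to get the number of goroutines blocked on the mutex. Mapping the three constants onto state gives: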
```
state:   |32|31|...|3|2|1|
          \__________/ | |
               |       | |
               |       | lock bit of the mutex (1 = held, 0 = free)
               |       |
               |       whether a goroutine waiting on the mutex has been woken
               |
               number of goroutines currently blocked on the mutex
```
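As a quick check of the layout, here is a small sketch that splits a state word into its three fields. The constants are copied from the excerpt above; `decode` is a hypothetical helper, not part of package sync.

```go
package main

import "fmt"

// Constants copied from the sync.Mutex excerpt.
const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexWaiterShift = iota      // 2
)

// decode is a hypothetical helper that splits a state word into its fields.
func decode(state int32) (locked, woken bool, waiters int32) {
	return state&mutexLocked != 0, state&mutexWoken != 0, state >> mutexWaiterShift
}

func main() {
	// 13 = 0b1101: locked, not woken, 3 waiters.
	fmt.Println(decode(13)) // true false 3
}
```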
1. Lock
Let's look at the mutex's Lock first.
```go
func (m *Mutex) Lock() {
	// Fast path: grab unlocked mutex.
	if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
		if race.Enabled {
			race.Acquire(unsafe.Pointer(m))
		}
		return
	}

	awoke := false
	iter := 0
	for {
		old := m.state
		new := old | mutexLocked
		if old&mutexLocked != 0 {
			if runtime_canSpin(iter) {
				// Active spinning makes sense.
				// Try to set mutexWoken flag to inform Unlock
				// to not wake other blocked goroutines.
				if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
					atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
					awoke = true
				}
				runtime_doSpin()
				iter++
				continue
			}
			new = old + 1<<mutexWaiterShift
		}
		if awoke {
			// The goroutine has been woken from sleep,
			// so we need to reset the flag in either case.
			if new&mutexWoken == 0 {
				panic("sync: inconsistent mutex state")
			}
			new &^= mutexWoken
		}
		if atomic.CompareAndSwapInt32(&m.state, old, new) {
			if old&mutexLocked == 0 {
				break
			}
			runtime_Semacquire(&m.sema)
			awoke = true
			iter = 0
		}
	}

	if race.Enabled {
		race.Acquire(unsafe.Pointer(m))
	}
}
```
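atomic.CompareAndSwapInt32() needs some explanation. The atomic package is Go's wrapper around low-level atomic operations, mainly used to solve synchronization problems. The official documentation does not recommend using it directly except for special low-level code, and suggests channels or the sync package instead. As I said in the previous post, locks are built by having the hardware provide atomic operations, and essentially everything lock-related is implemented on top of them. CompareAndSwapInt32() is compare-and-swap (CAS) for int32 values. cas(addr, old, new) means: if *addr == old, set *addr = new and report success; otherwise leave *addr unchanged and report failure, all as one atomic step. Most modern processors support CAS; on x86 the instruction is CMPXCHG (used with the LOCK prefix on multiprocessor systems). Now let's continue with the Lock function above.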
```go
if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
	if race.Enabled {
		race.Acquire(unsafe.Pointer(m))
	}
	return
}
```
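First, ignore the race.Enabled code. Go uses it for data race detection: race.Enabled is true only when the program is built with the -race flag. At its entry, Lock calls CAS to try to take the lock: if m.state == 0, it sets it to 1 (mutexLocked) and returns.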
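Moving on: each loop iteration first saves the value of m.state in old and sets new = old|mutexLocked. Jump straight to the third if, the only one that can exit the for loop. It calls CAS to try to change m.state to new. Inside it, if the previous value (old) did not have the locked bit set, the current goroutine has obtained the lock and breaks out. Now look at how new is built. In the first if, new = old + 1<<mutexWaiterShift; given the meaning of the bits in state described above, this adds 1 to the count of goroutines waiting on the mutex. When awoke is true, the woken flag in m.state must be cleared, which is what new &^= mutexWoken does. Back in the third if: if the inner check fails (the lock was still held when the CAS succeeded), execution reaches runtime_Semacquire().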
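To see the numbers, here is a sketch that mirrors how one loop iteration builds `new` from `old`, with spinning and the consistency panic left out. `nextLockState` is a hypothetical pure function I wrote for illustration; the constants are copied from the Mutex excerpt.

```go
package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexWaiterShift = iota      // 2
)

// nextLockState computes the state Lock would try to install with CAS,
// given the observed state and whether this goroutine was just woken.
func nextLockState(old int32, awoke bool) int32 {
	next := old | mutexLocked
	if old&mutexLocked != 0 {
		// Lock is held: register as one more waiter instead.
		next = old + 1<<mutexWaiterShift
	}
	if awoke {
		next &^= mutexWoken // clear the woken flag
	}
	return next
}

func main() {
	fmt.Println(nextLockState(0, false)) // 1: free mutex becomes locked
	fmt.Println(nextLockState(1, false)) // 5: still locked, waiter count goes 0 -> 1
	fmt.Println(nextLockState(2, true))  // 1: woken goroutine takes the lock and clears the flag
}
```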
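Now look at runtime_Semacquire(). Since Go 1.5 removed the C code that the runtime used to contain, even this very low-level code is now written in Go. Its doc comment tells us that it is the semaphore wait operation: wait until *s > 0, then decrement it. Package sync only declares the function without a body; the implementation is the runtime function sync_runtime_Semacquire(), bound to it with a //go:linkname directive.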
```go
// Semacquire waits until *s > 0 and then atomically decrements it.
// It is intended as a simple sleep primitive for use by the synchronization
// library and should not be used directly.
func runtime_Semacquire(s *uint32)

//go:linkname sync_runtime_Semacquire sync.runtime_Semacquire
func sync_runtime_Semacquire(addr *uint32) {
	semacquire(addr, true)
}

func semacquire(addr *uint32, profile bool) {
	gp := getg()
	if gp != gp.m.curg {
		throw("semacquire not on the G stack")
	}

	// Easy case.
	if cansemacquire(addr) {
		return
	}

	// Harder case:
	//	increment waiter count
	//	try cansemacquire one more time, return if succeeded
	//	enqueue itself as a waiter
	//	sleep
	//	(waiter descriptor is dequeued by signaler)
	s := acquireSudog()
	root := semroot(addr)
	t0 := int64(0)
	s.releasetime = 0
	if profile && blockprofilerate > 0 {
		t0 = cputicks()
		s.releasetime = -1
	}
	for {
		lock(&root.lock)
		// Add ourselves to nwait to disable "easy case" in semrelease.
		atomic.Xadd(&root.nwait, 1)
		// Check cansemacquire to avoid missed wakeup.
		if cansemacquire(addr) {
			atomic.Xadd(&root.nwait, -1)
			unlock(&root.lock)
			break
		}
		// Any semrelease after the cansemacquire knows we're waiting
		// (we set nwait above), so go to sleep.
		root.queue(addr, s)
		goparkunlock(&root.lock, "semacquire", traceEvGoBlockSync, 4)
		if cansemacquire(addr) {
			break
		}
	}
	if s.releasetime > 0 {
		blockevent(s.releasetime-t0, 3)
	}
	releaseSudog(s)
}
```
That is a lot of code, so let's look only at the parts related to the lock.
```go
root := semroot(addr)       // seg 1
atomic.Xadd(&root.nwait, 1) // seg 2
root.queue(addr, s)         // seg 3
```
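In seg 1, semroot() returns a pointer to a semaRoot struct. The lookup shifts the semaphore's address right by 3 bits and then hashes it by taking it modulo 251. I am not entirely sure why it is a 3-bit shift and 251; the source comment only says that 251 is a prime chosen so as not to correlate with any user patterns, and the shift presumably drops low address bits that carry little information because of alignment. In effect the semaRoot is bound to mutex.sema. Look at the structure of semaRoot: a linked list of sudog entries and an integer field nwait. nwait is the number of goroutines waiting on the semaphore. head and tail are the two ends of the list, and a mutex protects the list for thread safety.

Careful readers will notice a problem here: didn't we get here from Mutex? Does the implementation of Mutex use Mutex itself? In fact the mutex inside semaRoot is a simple version used internally by the runtime, not the same thing as sync.Mutex. Working backwards from all this, runtime_Semacquire() is simply the semaphore's wait(&s): if *s is 0 and cannot be decremented, the current goroutine is put on the goroutine waiting list associated with semaphore s and goes to sleep; otherwise *s is decremented and the call returns.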
```go
func semroot(addr *uint32) *semaRoot {
	return &semtable[(uintptr(unsafe.Pointer(addr))>>3)%semTabSize].root
}

type semaRoot struct {
	lock  mutex
	head  *sudog
	tail  *sudog
	nwait uint32 // Number of waiters. Read w/o the lock.
}

// Prime to not correlate with any user patterns.
const semTabSize = 251

var semtable [semTabSize]struct {
	root semaRoot
	pad  [sys.CacheLineSize - unsafe.Sizeof(semaRoot{})]byte
}
```
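The index expression in semroot can be tried in isolation. In the sketch below, `bucketIndex` and `bucketOf` are hypothetical helpers that reuse the same shift and modulus; the sample addresses are made up.

```go
package main

import (
	"fmt"
	"unsafe"
)

const semTabSize = 251 // same prime as in the runtime excerpt

// bucketIndex mirrors the index expression in semroot for a raw address.
func bucketIndex(addr uintptr) uintptr { return (addr >> 3) % semTabSize }

// bucketOf applies it to a real semaphore word.
func bucketOf(sema *uint32) uintptr {
	return bucketIndex(uintptr(unsafe.Pointer(sema)))
}

func main() {
	// 0x1000 and 0x1004 fall in the same 8-byte group, so they share a bucket.
	fmt.Println(bucketIndex(0x1000), bucketIndex(0x1004), bucketIndex(0x1008)) // 10 10 11

	var a, b uint32
	fmt.Println(bucketOf(&a) < semTabSize, bucketOf(&b) < semTabSize) // true true
}
```

Because different semaphores can land in the same bucket, each queued waiter also has to record which address it is waiting on; in the runtime's queue method that address is stored in s.elem.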
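What remains of mutex.Lock() is the runtime_canSpin(iter) part, which is effectively the spinning version of the lock. Go only spins when several conditions hold: 1. the machine is multicore; 2. GOMAXPROCS > 1; 3. there is at least one other running P and the local run queue of the current P is empty. The spin is attempted only a few times, not forever. If you are curious, follow the source.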
```go
func sync_runtime_canSpin(i int) bool {
	// sync.Mutex is cooperative, so we are conservative with spinning.
	// Spin only few times and only if running on a multicore machine and
	// GOMAXPROCS>1 and there is at least one other running P and local runq is empty.
	// As opposed to runtime mutex we don't do passive spinning here,
	// because there can be work on global runq or on other Ps.
	if i >= active_spin || ncpu <= 1 || gomaxprocs <= int32(sched.npidle+sched.nmspinning)+1 {
		return false
	}
	if p := getg().m.p.ptr(); !runqempty(p) {
		return false
	}
	return true
}

func sync_runtime_doSpin() {
	procyield(active_spin_cnt)
}
```
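The general idea of "try a few times, then go to sleep" can be sketched in user code. This is not the runtime's algorithm: `hybridLock`, `hybridCount`, and the bound `maxSpin` are assumptions of mine, and the lock itself is just a one-slot channel.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

const maxSpin = 4 // hypothetical bound on non-blocking attempts

// hybridLock is a one-slot channel; holding the slot means holding the lock.
type hybridLock struct{ slot chan struct{} }

func newHybridLock() *hybridLock { return &hybridLock{slot: make(chan struct{}, 1)} }

// Lock makes up to maxSpin non-blocking attempts, then blocks.
// It returns how many attempts failed before the lock was acquired.
func (l *hybridLock) Lock() (spins int) {
	for spins = 0; spins < maxSpin; spins++ {
		select {
		case l.slot <- struct{}{}:
			return spins
		default:
			runtime.Gosched() // let the holder run before trying again
		}
	}
	l.slot <- struct{}{} // give up spinning and block
	return spins
}

func (l *hybridLock) Unlock() { <-l.slot }

// hybridCount increments a shared counter from several goroutines.
func hybridCount(goroutines, perGoroutine int) int {
	l := newHybridLock()
	var wg sync.WaitGroup
	n := 0
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				l.Lock()
				n++
				l.Unlock()
			}
		}()
	}
	wg.Wait()
	return n
}

func main() {
	l := newHybridLock()
	fmt.Println(l.Lock()) // 0: uncontended, the first attempt succeeds
	l.Unlock()
	fmt.Println(hybridCount(4, 1000)) // 4000
}
```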
2. Unlock
The Mutex Unlock function is defined as follows:
```go
// Unlock unlocks m.
// It is a run-time error if m is not locked on entry to Unlock.
//
// A locked Mutex is not associated with a particular goroutine.
// It is allowed for one goroutine to lock a Mutex and then
// arrange for another goroutine to unlock it.
func (m *Mutex) Unlock() {
	if race.Enabled {
		_ = m.state
		race.Release(unsafe.Pointer(m))
	}

	// Fast path: drop lock bit.
	new := atomic.AddInt32(&m.state, -mutexLocked)
	if (new+mutexLocked)&mutexLocked == 0 {
		panic("sync: unlock of unlocked mutex")
	}

	old := new
	for {
		// If there are no waiters or a goroutine has already
		// been woken or grabbed the lock, no need to wake anyone.
		if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken) != 0 {
			return
		}
		// Grab the right to wake someone.
		new = (old - 1<<mutexWaiterShift) | mutexWoken
		if atomic.CompareAndSwapInt32(&m.state, old, new) {
			runtime_Semrelease(&m.sema)
			return
		}
		old = m.state
	}
}
```
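The four lines at the function entry are related to race detection and can be ignored for now. The next four lines drop the lock bit and check that the mutex really was locked: new is the value of m.state after subtracting mutexLocked (that is, 1), and if the lock bit was not set before the subtraction, Unlock panics with "unlock of unlocked mutex". Let's focus on the code inside the for loop.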
```go
if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken) != 0 {
	return
}
```
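This check says: if no goroutine is blocked on the lock, or the mutex has already been locked again, or a waiter has already been woken, there is nobody to wake and Unlock returns.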
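A worked example may help here. The sketch below combines this early-return check with the state update that Unlock performs right after it (subtract one waiter, set the woken bit). `afterUnlock` is a hypothetical pure function, and the constants are copied from the Mutex excerpt.

```go
package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexWaiterShift = iota      // 2
)

// afterUnlock takes the state right after the lock bit was dropped. It reports
// whether Unlock should wake a waiter and the state it would install with CAS.
func afterUnlock(old int32) (next int32, wake bool) {
	if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken) != 0 {
		return old, false // nobody to wake, or someone else is already on it
	}
	return (old - 1<<mutexWaiterShift) | mutexWoken, true
}

func main() {
	fmt.Println(afterUnlock(0)) // 0 false: no waiters
	fmt.Println(afterUnlock(4)) // 2 true: one waiter; count drops to 0, woken bit set
	fmt.Println(afterUnlock(9)) // 9 false: another goroutine already re-locked
	fmt.Println(afterUnlock(6)) // 6 false: a waiter has already been woken
}
```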
```go
new = (old - 1<<mutexWaiterShift) | mutexWoken
if atomic.CompareAndSwapInt32(&m.state, old, new) {
	runtime_Semrelease(&m.sema)
	return
}
```
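Here the number of goroutines blocked on the mutex is first decreased by one, and then the mutex is marked as woken. runtime_Semrelease does exactly the opposite of runtime_Semacquire: it wakes up a goroutine blocked on the semaphore. You may ask which goroutine gets woken. To answer that, look at the enqueue and dequeue code of the goroutine wait list.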
```go
func (root *semaRoot) queue(addr *uint32, s *sudog) {
	s.g = getg()
	s.elem = unsafe.Pointer(addr)
	s.next = nil
	s.prev = root.tail
	if root.tail != nil {
		root.tail.next = s
	} else {
		root.head = s
	}
	root.tail = s
}

func (root *semaRoot) dequeue(s *sudog) {
	if s.next != nil {
		s.next.prev = s.prev
	} else {
		root.tail = s.prev
	}
	if s.prev != nil {
		s.prev.next = s.next
	} else {
		root.head = s.next
	}
	s.elem = nil
	s.next = nil
	s.prev = nil
}
```
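As shown above, the wait list enqueues at the tail and dequeues from the head, so waiters are woken in first-in, first-out order.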
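The FIFO behaviour can be checked with a stripped-down copy of the list logic. `waiter` and `waitQueue` are hypothetical stand-ins for the runtime's sudog and semaRoot, and `wakeOrder` assumes the release side always takes the head of the list.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for the runtime's sudog.
type waiter struct {
	id         int
	prev, next *waiter
}

// waitQueue is a hypothetical stand-in for the list part of semaRoot.
type waitQueue struct{ head, tail *waiter }

// enqueue appends w at the tail, like semaRoot.queue.
func (q *waitQueue) enqueue(w *waiter) {
	w.next = nil
	w.prev = q.tail
	if q.tail != nil {
		q.tail.next = w
	} else {
		q.head = w
	}
	q.tail = w
}

// dequeue unlinks w from wherever it sits, like semaRoot.dequeue.
func (q *waitQueue) dequeue(w *waiter) {
	if w.next != nil {
		w.next.prev = w.prev
	} else {
		q.tail = w.prev
	}
	if w.prev != nil {
		w.prev.next = w.next
	} else {
		q.head = w.next
	}
	w.next, w.prev = nil, nil
}

// wakeOrder enqueues the ids in order, then repeatedly removes the head.
func wakeOrder(ids ...int) []int {
	var q waitQueue
	for _, id := range ids {
		q.enqueue(&waiter{id: id})
	}
	var order []int
	for q.head != nil {
		w := q.head
		q.dequeue(w)
		order = append(order, w.id)
	}
	return order
}

func main() {
	fmt.Println(wakeOrder(1, 2, 3)) // [1 2 3]
}
```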
References
- 《Go語言學習筆記》 (Go Language Study Notes)