A Distributed Read-Write Mutex in Go

Posted by oschina on 2015-05-05

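Go's default sync.RWMutex implementation does not scale well across multiple cores, because all readers contend on the same memory address when they atomically increment the reader count. This article explores an n-way RWMutex, also known as a "big reader" lock, which gives each CPU core its own RWMutex. A reader only takes the read lock for its current core, while a writer must take every core's lock in turn.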

Finding the current CPU

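Readers use the CPUID instruction to decide which lock to take. The instruction simply returns the APIC ID of the currently active CPU, without issuing a system call or modifying the runtime. This works on both Intel and AMD processors; ARM processors have a CPU ID register that can be used instead. On systems with more than 256 processors, x2APIC must be used, and the ID should be read from the EDX register after executing CPUID with EAX=0xb. When the program starts, a mapping from APIC ID to CPU index is built (using the CPU affinity system calls); the mapping stays fixed for the lifetime of the process. Because the CPUID instruction can be fairly expensive, a goroutine only periodically refreshes its estimate of which core it is running on. More frequent refreshes reduce inter-core lock traffic, but they also increase the share of locking time spent executing CPUID.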
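The periodic refresh described above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: `cpuEstimate`, `newCPUEstimate`, and `next` are hypothetical names, and `probe` is an injected stand-in for the CPUID-based lookup (`cpus[cpu()]` in the benchmark listing) so that the sketch runs on any platform. Each goroutine is assumed to own its own estimate, so no synchronization is used.

```go
package main

import "fmt"

// cpuEstimate caches one goroutine's guess of the core it is running on.
// It is not safe for concurrent use; each goroutine keeps its own value.
type cpuEstimate struct {
	probe func() int // hypothetical stand-in for the CPUID-based lookup
	every uint64     // refresh interval in lock operations; 0 disables refreshes
	n     uint64     // lock operations performed so far
	slot  int        // most recent estimate
}

// newCPUEstimate takes an initial reading so slot is valid before first use.
func newCPUEstimate(probe func() int, every uint64) *cpuEstimate {
	return &cpuEstimate{probe: probe, every: every, slot: probe()}
}

// next returns the slot to use for the next lock operation, re-probing
// only on every e.every-th call (including the very first one).
func (e *cpuEstimate) next() int {
	if e.every != 0 && e.n%e.every == 0 {
		e.slot = e.probe()
	}
	e.n++
	return e.slot
}

func main() {
	calls := 0
	probe := func() int { calls++; return calls % 4 } // fake "current core"
	est := newCPUEstimate(probe, 100)
	for i := 0; i < 1000; i++ {
		_ = est.next()
	}
	// One initial probe plus one per 100 operations.
	fmt.Println("probe calls:", calls) // prints "probe calls: 11"
}
```

A smaller `every` keeps the estimate fresher at the cost of more probe calls, which is the trade-off the paragraph above describes.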

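Stale CPU information. The information about which CPU a goroutine is running on may already be stale by the time the lock is taken (the goroutine may have moved to another core). As long as the reader remembers which lock it took, this only affects performance, not correctness. Such moves are also unlikely, because the OS kernel tries to keep a thread on the same core to improve cache hit rates.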
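One way for a reader to "remember which lock it took" is to have the read-lock call hand back a handle bound to the slot that was actually locked. The sketch below assumes this approach; `shardedRWMutex` and its `RLock(slot)` signature are hypothetical names for illustration, and the slot index is passed in explicitly (assumed non-negative) rather than derived from CPUID.

```go
package main

import (
	"fmt"
	"sync"
)

// shardedRWMutex is a hypothetical n-way reader/writer lock:
// one sync.RWMutex per slot (per CPU core in the article's scheme).
type shardedRWMutex []sync.RWMutex

// Lock takes every slot's write lock, always in index order.
func (mx shardedRWMutex) Lock() {
	for i := range mx {
		mx[i].Lock()
	}
}

// Unlock releases every slot's write lock.
func (mx shardedRWMutex) Unlock() {
	for i := range mx {
		mx[i].Unlock()
	}
}

// RLock read-locks the given slot and returns a handle bound to it.
// Calling Unlock on the handle releases that same slot, even if the
// caller's CPU estimate has changed in the meantime.
func (mx shardedRWMutex) RLock(slot int) sync.Locker {
	l := mx[slot%len(mx)].RLocker()
	l.Lock() // RLocker's Lock performs RLock on the underlying mutex
	return l
}

func main() {
	mx := make(shardedRWMutex, 4)

	estimate := 1
	held := mx.RLock(estimate)
	estimate = 3 // pretend the goroutine migrated to another core
	_ = estimate
	held.Unlock() // still releases slot 1, the slot that was locked

	mx.Lock() // would block forever if slot 1 were still read-locked
	fmt.Println("writer acquired all slots")
	mx.Unlock()
}
```

With a stale estimate, the reader merely contends on a lock that belongs to a different core; the writer still has to acquire that slot, so exclusion is preserved.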

Performance

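The performance characteristics of this scheme depend on a large number of parameters. In particular, the CPUID checking frequency, the number of readers, the ratio of readers to writers, and the time readers hold their locks are all important. Because only a single writer can be active at any one time, the duration for which a writer holds the lock does not affect the performance difference between sync.RWMutex and DRWMutex.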

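Experiments show that DRWMutex performs better the more cores the system has, particularly when writers make up less than 1% of operations and CPUID is called at most once every 10 lock operations (the exact threshold varies with how long locks are held). Even on a small number of cores, DRWMutex outperforms sync.RWMutex under these conditions, which are typical of applications that choose sync.RWMutex over sync.Mutex.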

The plot below shows mean performance across 10 runs as the number of cores in use increases, for the command:

drwmutex -i 5000 -p 0.0001 -w 1 -r 100 -c 100

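Error bars denote the 25th and 75th percentiles. Note the drop at every 10th core: on the machine the benchmarks were run on, 10 cores make up one NUMA node, so each time another NUMA node comes into use, cross-core traffic becomes more expensive. Performance still increases for DRWMutex, because more readers can work in parallel than with sync.RWMutex.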

See the go-nuts thread for further discussion.

cpu_amd64.s

#include "textflag.h"

// func cpu() uint64
TEXT ·cpu(SB),NOSPLIT,$0-8
	MOVL	$0x01, AX // version information
	MOVL	$0x00, BX // any leaf will do
	MOVL	$0x00, CX // any subleaf will do

	// call CPUID
	BYTE $0x0f
	BYTE $0xa2

	SHRQ	$24, BX // logical cpu id is put in EBX[31-24]
	MOVQ	BX, ret+0(FP)
	RET

main.go

package main

import (
	"flag"
	"fmt"
	"math/rand"
	"os"
	"runtime"
	"runtime/pprof"
	"sync"
	"syscall"
	"time"
	"unsafe"
)

func cpu() uint64 // implemented in cpu_amd64.s

var cpus map[uint64]int

// determine mapping from APIC ID to CPU index by pinning the entire process to
// one core at a time, and seeing what its APIC ID is.
func init() {
	cpus = make(map[uint64]int)

	var aff uint64
	syscall.Syscall(syscall.SYS_SCHED_GETAFFINITY, uintptr(0), unsafe.Sizeof(aff), uintptr(unsafe.Pointer(&aff)))

	n := 0
	start := time.Now()
	var mask uint64 = 1
Outer:
	for {
		for (aff & mask) == 0 {
			mask <<= 1
			if mask == 0 || mask > aff {
				break Outer
			}
		}

		ret, _, err := syscall.Syscall(syscall.SYS_SCHED_SETAFFINITY, uintptr(0), unsafe.Sizeof(mask), uintptr(unsafe.Pointer(&mask)))
		if ret != 0 {
			panic(err.Error())
		}

		// what CPU do we have?
		<-time.After(1 * time.Millisecond)
		c := cpu()

		if oldn, ok := cpus[c]; ok {
			fmt.Println("cpu", n, "==", oldn, "-- both have CPUID", c)
		}

		cpus[c] = n
		mask <<= 1
		n++
	}

	fmt.Printf("%d/%d cpus found in %v: %v\n", len(cpus), runtime.NumCPU(), time.Since(start), cpus)

	ret, _, err := syscall.Syscall(syscall.SYS_SCHED_SETAFFINITY, uintptr(0), unsafe.Sizeof(aff), uintptr(unsafe.Pointer(&aff)))
	if ret != 0 {
		panic(err.Error())
	}
}

type RWMutex2 []sync.RWMutex

func (mx RWMutex2) Lock() {
	for core := range mx {
		mx[core].Lock()
	}
}

func (mx RWMutex2) Unlock() {
	for core := range mx {
		mx[core].Unlock()
	}
}

func main() {
	cpuprofile := flag.Bool("cpuprofile", false, "enable CPU profiling")
	locks := flag.Uint64("i", 10000, "Number of iterations to perform")
	write := flag.Float64("p", 0.0001, "Probability of write locks")
	wwork := flag.Int("w", 1, "Amount of work for each writer")
	rwork := flag.Int("r", 100, "Amount of work for each reader")
	readers := flag.Int("n", runtime.GOMAXPROCS(0), "Total number of readers")
	checkcpu := flag.Uint64("c", 100, "Update CPU estimate every n iterations")
	flag.Parse()

	var o *os.File
	if *cpuprofile {
		// assign to the outer o (":=" would shadow it and leave it nil below)
		o, _ = os.Create("rw.out")
		pprof.StartCPUProfile(o)
	}

	readers_per_core := *readers / runtime.GOMAXPROCS(0)

	var wg sync.WaitGroup

	var mx1 sync.RWMutex

	start1 := time.Now()
	for n := 0; n < runtime.GOMAXPROCS(0); n++ {
		for r := 0; r < readers_per_core; r++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				r := rand.New(rand.NewSource(rand.Int63()))
				for n := uint64(0); n < *locks; n++ {
					if r.Float64() < *write {
						mx1.Lock()
						x := 0
						for i := 0; i < *wwork; i++ {
							x++
						}
						_ = x
						mx1.Unlock()
					} else {
						mx1.RLock()
						x := 0
						for i := 0; i < *rwork; i++ {
							x++
						}
						_ = x
						mx1.RUnlock()
					}
				}
			}()
		}
	}
	wg.Wait()
	end1 := time.Now()

	t1 := end1.Sub(start1)
	fmt.Println("mx1", runtime.GOMAXPROCS(0), *readers, *locks, *write, *wwork, *rwork, *checkcpu, t1.Seconds(), t1)

	if *cpuprofile {
		pprof.StopCPUProfile()
		o.Close()

		o, _ = os.Create("rw2.out")
		pprof.StartCPUProfile(o)
	}

	mx2 := make(RWMutex2, len(cpus))

	start2 := time.Now()
	for n := 0; n < runtime.GOMAXPROCS(0); n++ {
		for r := 0; r < readers_per_core; r++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				c := cpus[cpu()]
				r := rand.New(rand.NewSource(rand.Int63()))
				for n := uint64(0); n < *locks; n++ {
					if *checkcpu != 0 && n%*checkcpu == 0 {
						c = cpus[cpu()]
					}

					if r.Float64() < *write {
						mx2.Lock()
						x := 0
						for i := 0; i < *wwork; i++ {
							x++
						}
						_ = x
						mx2.Unlock()
					} else {
						mx2[c].RLock()
						x := 0
						for i := 0; i < *rwork; i++ {
							x++
						}
						_ = x
						mx2[c].RUnlock()
					}
				}
			}()
		}
	}
	wg.Wait()
	end2 := time.Now()

	// only stop and close the profile if one was actually started
	if *cpuprofile {
		pprof.StopCPUProfile()
		o.Close()
	}

	t2 := end2.Sub(start2)
	fmt.Println("mx2", runtime.GOMAXPROCS(0), *readers, *locks, *write, *wwork, *rwork, *checkcpu, t2.Seconds(), t2)
}
