Some Slick Tricks in VictoriaMetrics

Posted by charlieroro on 2022-04-29

Original article: https://www.cnblogs.com/charlieroro/p/16195044.html

Getting the current time quickly

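VictoriaMetrics has a fasttime library for quickly obtaining the current Unix time. The implementation is simple: a background goroutine refreshes the variable currentTimestamp, which holds the current time, once per second, and readers just load that variable atomically. Its performance is about 8 times that of time.Now().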

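The core idea is to move the main work into the background and hand the result over through an intermediate variable, gaining performance by making the update asynchronous. The price is that the caller must tolerate some loss of precision (here, up to about one second).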

func init() {
	go func() {
		ticker := time.NewTicker(time.Second)
		defer ticker.Stop()
		for tm := range ticker.C { 
			t := uint64(tm.Unix())
			atomic.StoreUint64(&currentTimestamp, t)
		}
	}()
}

var currentTimestamp = uint64(time.Now().Unix())

// UnixTimestamp returns the current unix timestamp in seconds.
//
// It is faster than time.Now().Unix()
func UnixTimestamp() uint64 {
	return atomic.LoadUint64(&currentTimestamp)
}
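
As a rough illustration of where a coarse clock like this is good enough, the sketch below repeats the background-refresh pattern and uses it for an expiry check. The names cachedUnix and isExpired are hypothetical and do not come from fasttime; the sketch also assumes Go 1.19 or later for atomic.Int64.

```go
package main

import (
	"sync/atomic"
	"time"
)

// cachedUnix holds the most recently observed Unix time in seconds.
// (Hypothetical name; fasttime itself uses currentTimestamp.)
var cachedUnix atomic.Int64

func init() {
	// Seed the value so that readers never observe zero.
	cachedUnix.Store(time.Now().Unix())
	go func() {
		ticker := time.NewTicker(time.Second)
		defer ticker.Stop()
		for tm := range ticker.C {
			cachedUnix.Store(tm.Unix())
		}
	}()
}

// isExpired reports whether deadline (Unix seconds) has passed according to
// the cached clock. The answer may lag real time by about one second.
func isExpired(deadline int64) bool {
	return cachedUnix.Load() >= deadline
}
```

A check like this suits cache eviction or coarse timeouts, where being off by a second is harmless; it is not a substitute for time.Now() when sub-second precision matters.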

Computing the hash of a struct

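The hashUint64 function uses xxhash.Sum64 to compute the hash of the struct Key. The pointer k is converted through unsafe.Pointer into a pointer to a byte array (*[N]byte, not *[]byte), where the array length N is unsafe.Sizeof(*k). unsafe.Sizeof() returns the size of the struct in bytes.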

If the value has a fixed length, for example h of type uint64, the length can be written directly as 8: bp := (*[8]byte)(unsafe.Pointer(&h))

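Note that unsafe.Sizeof() returns the size of the data structure itself, not the size of the data it refers to. In the example below it returns 24 for the slice (on a 64-bit platform), which is the size of the slice header SliceHeader, not of the referenced data (len returns the number of referenced elements). In addition, if the struct contains pointers, the resulting bytes hold the addresses stored in those pointers rather than the data they point to. The same applies to the interface field Part of Key.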
slice := []int{1,2,3,4,5,6,7,8,9,10}
fmt.Println(unsafe.Sizeof(slice)) //24

type Key struct {
	Part interface{}
	Offset uint64
}

func (k *Key) hashUint64() uint64 {
	buf := (*[unsafe.Sizeof(*k)]byte)(unsafe.Pointer(k))
	return xxhash.Sum64(buf[:])
}
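
The points above can be tried out with a small self-contained sketch. It is only an illustration under stated assumptions: the point type and the sizes helper are hypothetical, and FNV-1a from the standard library stands in for xxhash.Sum64 so that no third-party import is needed.

```go
package main

import (
	"hash/fnv"
	"unsafe"
)

// point is a hypothetical key type. Both fields are uint64, so the struct
// has no padding bytes and contains no pointers.
type point struct {
	X uint64
	Y uint64
}

// hash views the memory of p as a fixed-size byte array and hashes it.
// FNV-1a is used here as a stand-in for xxhash.Sum64.
func (p *point) hash() uint64 {
	buf := (*[unsafe.Sizeof(*p)]byte)(unsafe.Pointer(p))
	h := fnv.New64a()
	h.Write(buf[:])
	return h.Sum64()
}

// sizes returns the size of the slice header and the size in bytes of the
// elements the slice refers to.
func sizes(s []int) (header, data uintptr) {
	header = unsafe.Sizeof(s)
	data = uintptr(len(s)) * unsafe.Sizeof(s[0])
	return header, data
}
```

Two point values with equal fields hash to the same value because their bytes are identical. That would not hold for a struct whose fields are pointers to equal but distinct objects, since the bytes would then contain different addresses.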

Appending a string to an existing []byte

The following form is all that is needed:

str := "1231445"
arr := []byte{1, 2, 3}
arr = append(arr, str...)
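
The same form is handy when building output into a reusable buffer. The helper below is a hypothetical example (appendLabel is not a VictoriaMetrics function) that appends a name="value" pair without converting the strings to []byte first.

```go
package main

// appendLabel appends name="value" to dst and returns the extended slice.
// (Hypothetical helper for illustration.)
func appendLabel(dst []byte, name, value string) []byte {
	dst = append(dst, name...) // append accepts a string after a []byte
	dst = append(dst, '=', '"')
	dst = append(dst, value...)
	dst = append(dst, '"')
	return dst
}
```

Passing a buffer truncated with buf[:0] as dst lets repeated calls reuse the same backing array.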

Converting an int64 slice to a byte slice

This manipulates the underlying SliceHeader directly. The returned slice shares memory with a; no data is copied.

func int64ToByteSlice(a []int64) (b []byte) {
   sh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
   sh.Data = uintptr(unsafe.Pointer(&a[0]))
   sh.Len = len(a) * int(unsafe.Sizeof(a[0]))
   sh.Cap = sh.Len
   return
}
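
On Go 1.17 and later the same zero-copy view can be written with unsafe.Slice, which recent Go documentation recommends over reflect.SliceHeader. The sketch below is an alternative formulation, not the code used in VictoriaMetrics; the name int64sAsBytes is hypothetical. It also returns nil for an empty input, where taking &a[0] would panic.

```go
package main

import "unsafe"

// int64sAsBytes returns a byte slice that aliases the memory of a.
// No data is copied, so writes through either slice are visible in the other.
func int64sAsBytes(a []int64) []byte {
	if len(a) == 0 {
		return nil // &a[0] would panic on an empty slice
	}
	n := len(a) * int(unsafe.Sizeof(a[0]))
	return unsafe.Slice((*byte)(unsafe.Pointer(&a[0])), n)
}
```

The byte order of the result follows the machine's native endianness, so the bytes are only meaningful to a reader running on a compatible architecture.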

A sync.WaitGroup for concurrent access

The purpose of this concurrency-safe sync.WaitGroup wrapper is to allow goroutines that must be waited for to be added at runtime.

// WaitGroup wraps sync.WaitGroup and makes safe to call Add/Wait
// from concurrent goroutines.
//
// An additional limitation is that call to Wait prohibits further calls to Add
// until return.
type WaitGroup struct {
	sync.WaitGroup
	mu sync.Mutex
}

// Add registers n additional workers. Add may be called from concurrent goroutines.
func (wg *WaitGroup) Add(n int) {
	wg.mu.Lock()
	wg.WaitGroup.Add(n)
	wg.mu.Unlock()
}

// Wait waits until all the goroutines call Done.
//
// Wait may be called from concurrent goroutines.
//
// Further calls to Add are blocked until return from Wait.
func (wg *WaitGroup) Wait() {
	wg.mu.Lock()
	wg.WaitGroup.Wait()
	wg.mu.Unlock()
}

// WaitAndBlock waits until all the goroutines call Done and then prevents
// from new goroutines calling Add.
//
// Further calls to Add are always blocked. This is useful for graceful shutdown
// when other goroutines calling Add must be stopped.
//
// wg cannot be used after this call.
func (wg *WaitGroup) WaitAndBlock() {
	wg.mu.Lock()
	wg.WaitGroup.Wait()

	// Do not unlock wg.mu, so other goroutines calling Add are blocked.
}

// There is no need in wrapping WaitGroup.Done, since it is already goroutine-safe.
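
A minimal usage sketch follows. It repeats the Add and Wait wrappers so that it is self-contained; handle and serve are hypothetical names. Handlers call Add from their own goroutines, so Add may race with Wait, which a plain sync.WaitGroup does not allow when the counter is zero. Note that a goroutine must not call Add while it is itself being waited for, because Wait holds the mutex until it returns.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// WaitGroup repeats the wrapper from the article: Add and Wait share a mutex.
type WaitGroup struct {
	sync.WaitGroup
	mu sync.Mutex
}

func (wg *WaitGroup) Add(n int) {
	wg.mu.Lock()
	wg.WaitGroup.Add(n)
	wg.mu.Unlock()
}

func (wg *WaitGroup) Wait() {
	wg.mu.Lock()
	wg.WaitGroup.Wait()
	wg.mu.Unlock()
}

// handle is a hypothetical request handler that registers itself at runtime.
func handle(wg *WaitGroup, handled *int64) {
	wg.Add(1) // blocks while a Wait call is in progress
	defer wg.Done()
	atomic.AddInt64(handled, 1)
}

// serve starts n handlers, calls Wait while they may still be registering,
// and returns the number of handled requests.
func serve(n int) int64 {
	var wg WaitGroup
	var handled int64
	var exited sync.WaitGroup // plain WaitGroup: Add runs before the goroutines start
	exited.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer exited.Done()
			handle(&wg, &handled)
		}()
	}
	wg.Wait()     // waits only for handlers that have already called Add
	exited.Wait() // waits for every handler goroutine to return
	return atomic.LoadInt64(&handled)
}
```

Handlers that reach Add while Wait is running simply start a little later, which is the behavior the wrapper's comments describe.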

Timer pool

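Creating timers at high frequency has a performance cost. To reduce that cost in some situations, sync.Pool can be used to recycle timers that have already been created.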

// Get returns a timer for the given duration d from the pool.
//
// Return back the timer to the pool with Put.
func Get(d time.Duration) *time.Timer {
	if v := timerPool.Get(); v != nil {
		t := v.(*time.Timer)
		if t.Reset(d) {
			logger.Panicf("BUG: active timer trapped to the pool!")
		}
		return t
	}
	return time.NewTimer(d)
}

// Put returns t to the pool.
//
// t cannot be accessed after returning to the pool.
func Put(t *time.Timer) {
	if !t.Stop() {
		// Drain t.C if it wasn't obtained by the caller yet.
		select {
		case <-t.C:
		default:
		}
	}
	timerPool.Put(t)
}

var timerPool sync.Pool
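
The sketch below shows one way such a pool might be used for a receive with a timeout. getTimer and putTimer mirror Get and Put above, with panic standing in for logger.Panicf so that the example needs only the standard library; recvWithTimeout is a hypothetical helper.

```go
package main

import (
	"sync"
	"time"
)

var timerPool sync.Pool

// getTimer mirrors Get from the article, using panic instead of logger.Panicf.
func getTimer(d time.Duration) *time.Timer {
	if v := timerPool.Get(); v != nil {
		t := v.(*time.Timer)
		if t.Reset(d) {
			panic("BUG: active timer trapped to the pool")
		}
		return t
	}
	return time.NewTimer(d)
}

// putTimer mirrors Put: stop the timer, drain a pending tick, then recycle it.
func putTimer(t *time.Timer) {
	if !t.Stop() {
		select {
		case <-t.C:
		default:
		}
	}
	timerPool.Put(t)
}

// recvWithTimeout waits for a value on ch for at most d.
// (Hypothetical helper for illustration.)
func recvWithTimeout(ch <-chan int, d time.Duration) (int, bool) {
	t := getTimer(d)
	defer putTimer(t) // the timer must not be touched after this point
	select {
	case v := <-ch:
		return v, true
	case <-t.C:
		return 0, false
	}
}
```

Returning the timer with defer keeps the Get and Put calls paired on every return path, which matters because a timer that is put back while still active would trip the check in getTimer.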

Rate limiting

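In VictoriaMetrics, vminsert sits between vmagent and vmstorage: it receives traffic from vmagent and forwards it to vmstorage. If vmstorage hangs, processes data too slowly, or goes offline, the traffic may not be forwarded, which drives up vminsert's CPU and memory usage and can bring the component down. To prevent this, vminsert uses a limiter: when incoming traffic surges, it keeps the system stable at the cost of dropping some data.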

The VictoriaMetrics source code describes the limiter as follows:

Limit the number of concurrent f calls in order to prevent from excess memory usage and CPU thrashing

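The limiter takes two parameters, maxConcurrentInserts and maxQueueDuration. The former is the maximum number of requests that may be processed at the same time during a burst; the latter is the maximum time a request may wait in the queue. Note that Do(f func() error) is called concurrently by many requests while ch is global, so a call that finds ch full waits for another request to release a slot (a struct{} element).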

As the code shows, the limiter exposes metrics that indicate its current state. The default value of maxConcurrentInserts is cgroup.AvailableCPUs()*4 (that is, runtime.GOMAXPROCS(-1)*4).

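When the limiter is used for something like HTTP request handling, it cannot limit the requests arriving from the lower layers; what it limits is the processing of those requests. In high-traffic workloads this processing is also where most memory is consumed, since it typically involves reading data, allocating memory, and copying. Data at the lower layers is bounded by /proc/sys/net/core/somaxconn and the socket buffers.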

var (
	maxConcurrentInserts = flag.Int("maxConcurrentInserts", cgroup.AvailableCPUs()*4, "The maximum number of concurrent inserts. Default value should work for most cases, "+
		"since it minimizes the overhead for concurrent inserts. This option is tightly coupled with -insert.maxQueueDuration")
	maxQueueDuration = flag.Duration("insert.maxQueueDuration", time.Minute, "The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts")
)

// ch is the channel for limiting concurrent calls to Do.
var ch chan struct{}

// Init initializes concurrencylimiter.
//
// Init must be called after flag.Parse call.
func Init() {
	ch = make(chan struct{}, *maxConcurrentInserts) // initialize the limiter; at most maxConcurrentInserts requests run concurrently
}

// Do calls f with the limited concurrency.
func Do(f func() error) error {
	// Limit the number of concurrent f calls in order to prevent from excess
	// memory usage and CPU thrashing.
	select {
	case ch <- struct{}{}: // add an element to the channel to mark the start of a request
		err := f() // block until the request has been processed
		<-ch // release the element so that the slot can serve the next request
		return err
	default:
	}

	// The limit maxConcurrentInserts has been reached, so wait for another Do(f func() error) call to release a slot.
	// All the workers are busy.
	// Sleep for up to *maxQueueDuration.
	concurrencyLimitReached.Inc()
	t := timerpool.Get(*maxQueueDuration) // get a timer whose timeout is maxQueueDuration
	select {
	case ch <- struct{}{}: // a slot was released within maxQueueDuration: recycle the timer and process the request
		timerpool.Put(t)
		err := f()
		<-ch
		return err
	case <-t.C: // no slot was released within maxQueueDuration: recycle the timer, drop the request, and return an error
		timerpool.Put(t)
		concurrencyLimitTimeout.Inc()
		return &httpserver.ErrorWithStatusCode{
			Err: fmt.Errorf("cannot handle more than %d concurrent inserts during %s; possible solutions: "+
				"increase `-insert.maxQueueDuration`, increase `-maxConcurrentInserts`, increase server capacity", *maxConcurrentInserts, *maxQueueDuration),
			StatusCode: http.StatusServiceUnavailable,
		}
	}
}

var (
	concurrencyLimitReached = metrics.NewCounter(`vm_concurrent_insert_limit_reached_total`)
	concurrencyLimitTimeout = metrics.NewCounter(`vm_concurrent_insert_limit_timeout_total`)

	_ = metrics.NewGauge(`vm_concurrent_insert_capacity`, func() float64 {
		return float64(cap(ch))
	})
	_ = metrics.NewGauge(`vm_concurrent_insert_current`, func() float64 {
		return float64(len(ch))
	})
)
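
To experiment with the pattern outside VictoriaMetrics, the same two-phase select can be reduced to a standard-library sketch. The limiter type, newLimiter, do, and errBusy are hypothetical names; a plain time.NewTimer replaces timerpool, and the metrics and the HTTP error type are left out.

```go
package main

import (
	"errors"
	"time"
)

// errBusy is returned when no slot frees up within maxWait.
var errBusy = errors.New("all workers are busy")

// limiter bounds the number of concurrently running do calls.
type limiter struct {
	ch      chan struct{} // each buffered element is one occupied slot
	maxWait time.Duration // how long a caller may queue for a slot
}

func newLimiter(maxConcurrent int, maxWait time.Duration) *limiter {
	return &limiter{ch: make(chan struct{}, maxConcurrent), maxWait: maxWait}
}

// do runs f if a slot is free or becomes free within maxWait.
func (l *limiter) do(f func() error) error {
	select {
	case l.ch <- struct{}{}: // fast path: a slot is free right now
		err := f()
		<-l.ch
		return err
	default:
	}

	// Slow path: queue for a slot, but give up after maxWait.
	t := time.NewTimer(l.maxWait)
	defer t.Stop()
	select {
	case l.ch <- struct{}{}:
		err := f()
		<-l.ch
		return err
	case <-t.C:
		return errBusy
	}
}
```

With a capacity of one, a nested call such as l.do(func() error { return l.do(g) }) always times out in the inner call, because the outer call holds the only slot. This makes the queueing behavior easy to observe.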

Priority control

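The pacelimiter library in VictoriaMetrics implements priority control. Its main methods are Inc, Dec, and WaitIfNeeded. Low-priority tasks call WaitIfNeeded; if a high-priority task is running at that moment (it has called Inc), the low-priority task waits until the high-priority task finishes (calls Dec) before it continues.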

// PaceLimiter throttles WaitIfNeeded callers while the number of Inc calls is bigger than the number of Dec calls.
//
// It is expected that Inc is called before performing high-priority work,
// while Dec is called when the work is done.
// WaitIfNeeded must be called inside the work which must be throttled (i.e. lower-priority work).
// It may be called in the loop before performing a part of low-priority work.
type PaceLimiter struct {
	mu          sync.Mutex
	cond        *sync.Cond
	delaysTotal uint64
	n           int32
}

// New returns pace limiter that throttles WaitIfNeeded callers while the number of Inc calls is bigger than the number of Dec calls.
func New() *PaceLimiter {
	var pl PaceLimiter
	pl.cond = sync.NewCond(&pl.mu)
	return &pl
}

// Inc increments pl.
func (pl *PaceLimiter) Inc() {
	atomic.AddInt32(&pl.n, 1)
}

// Dec decrements pl.
func (pl *PaceLimiter) Dec() {
	if atomic.AddInt32(&pl.n, -1) == 0 {
		// Wake up all the goroutines blocked in WaitIfNeeded,
		// since the number of Dec calls equals the number of Inc calls.
		pl.cond.Broadcast()
	}
}

// WaitIfNeeded blocks while the number of Inc calls is bigger than the number of Dec calls.
func (pl *PaceLimiter) WaitIfNeeded() {
	if atomic.LoadInt32(&pl.n) <= 0 {
		// Fast path - there is no need in lock.
		return
	}
	// Slow path - wait until Dec is called.
	pl.mu.Lock()
	for atomic.LoadInt32(&pl.n) > 0 {
		pl.delaysTotal++
		pl.cond.Wait()
	}
	pl.mu.Unlock()
}

// DelaysTotal returns the number of delays inside WaitIfNeeded.
func (pl *PaceLimiter) DelaysTotal() uint64 {
	pl.mu.Lock()
	n := pl.delaysTotal
	pl.mu.Unlock()
	return n
}
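
The interaction between the three methods can be illustrated with a small sketch. To stay short it uses a simplified variant named gate that guards the counter with the mutex instead of atomics; this is an assumption made for the example, not how pacelimiter is written. runOrder is a hypothetical driver that records the order in which work completes.

```go
package main

import "sync"

// gate is a simplified, mutex-only variant of the PaceLimiter idea.
type gate struct {
	mu   sync.Mutex
	cond *sync.Cond
	n    int // number of Inc calls minus number of Dec calls
}

func newGate() *gate {
	g := &gate{}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// Inc marks the start of high-priority work.
func (g *gate) Inc() {
	g.mu.Lock()
	g.n++
	g.mu.Unlock()
}

// Dec marks the end of high-priority work and wakes waiters when none is left.
func (g *gate) Dec() {
	g.mu.Lock()
	g.n--
	if g.n == 0 {
		g.cond.Broadcast()
	}
	g.mu.Unlock()
}

// WaitIfNeeded blocks while any high-priority work is in progress.
func (g *gate) WaitIfNeeded() {
	g.mu.Lock()
	for g.n > 0 {
		g.cond.Wait()
	}
	g.mu.Unlock()
}

// runOrder returns the order in which high- and low-priority work completed.
func runOrder() []string {
	g := newGate()
	var mu sync.Mutex
	var order []string
	record := func(s string) {
		mu.Lock()
		order = append(order, s)
		mu.Unlock()
	}

	g.Inc() // high-priority work is about to start
	done := make(chan struct{})
	go func() {
		defer close(done)
		g.WaitIfNeeded() // low-priority work yields while the gate is raised
		record("low")
	}()
	record("high") // the high-priority work itself
	g.Dec()
	<-done
	return order
}
```

Because Inc is called before the low-priority goroutine starts, that goroutine cannot get past WaitIfNeeded until Dec has run, so the high-priority work is always recorded first.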
