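Getting Microservice Monitoring Done with Prometheus

Kevin Wan, published 2021-03-01

Original article: https://www.cnblogs.com/kevinwan/p/14463445.html

I have recently been adding monitoring to our services. Prometheus is currently the most popular monitoring system (at its core a time-series database), and it is also what go-zero integrates with by default. This article looks at how go-zero hooks into Prometheus and how developers can define their own metrics.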

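Enabling monitoring

The go-zero framework ships with Prometheus-based service metrics. They are not enabled by default, though; developers need to turn them on in config.yaml: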

Prometheus:
  Host: 127.0.0.1
  Port: 9091
  Path: /metrics
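
To make the three keys concrete, here is a minimal standard-library sketch of what such a config describes: an HTTP endpoint at Host:Port that answers on Path with metrics in the Prometheus text format. This is not go-zero's implementation. The type promConf, the metric name demo_requests_total and the helper functions are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// promConf mirrors the three YAML keys above. The type name is hypothetical.
type promConf struct {
	Host string
	Port int
	Path string
}

// addr returns the listen address implied by the config.
func (c promConf) addr() string { return fmt.Sprintf("%s:%d", c.Host, c.Port) }

// metricsMux serves one hypothetical counter in the Prometheus text format on c.Path.
func metricsMux(c promConf, requests *uint64) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc(c.Path, func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "# TYPE demo_requests_total counter\ndemo_requests_total %d\n",
			atomic.LoadUint64(requests))
	})
	return mux
}

// scrape starts a throwaway test server and fetches c.Path once, as Prometheus would.
func scrape(c promConf, requests *uint64) string {
	srv := httptest.NewServer(metricsMux(c, requests))
	defer srv.Close()
	resp, err := http.Get(srv.URL + c.Path)
	if err != nil {
		return ""
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	return string(body)
}

func main() {
	c := promConf{Host: "127.0.0.1", Port: 9091, Path: "/metrics"}
	var requests uint64 = 3
	fmt.Println("a real server would listen on", c.addr())
	fmt.Print(scrape(c, &requests))
}
```

The sketch uses a test server on a random port so it can run anywhere; a real service would listen on the address returned by addr().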

If you run Prometheus locally, add a scrape job for the service to the Prometheus configuration file prometheus.yaml:

- job_name: 'file_ds'
  static_configs:
    - targets: ['your-local-ip:9091']
      labels:
        job: activeuser
        app: activeuser-api
        env: dev
        instance: your-local-ip:service-port

I run Prometheus locally with Docker, so prometheus.yaml goes in the docker-prometheus directory:

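# Bind mounts need an absolute host path; run this from the directory that contains dockeryml.
# The image reads /etc/prometheus/prometheus.yml by default, so point it at prometheus.yaml explicitly.
docker run \
    -p 9090:9090 \
    -v "$(pwd)/dockeryml/docker-prometheus:/etc/prometheus" \
    prom/prometheus \
    --config.file=/etc/prometheus/prometheus.yaml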

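Open localhost:9090 to see the Prometheus web UI.

Open http://service-ip:9091/metrics to see the metrics exposed by the service.

That output contains two kinds of bucket series, plus count and sum metrics.

So how does go-zero integrate these metrics? Which metrics does it collect? And how do we define our own? The rest of this article answers those questions.

For the basic setup described above, see our other article: https://zeromicro.github.io/go-zero/service-monitor.html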

How it is integrated

The example above uses HTTP: metrics are collected continuously as requests hit the server. That naturally points to middleware. The code is at https://github.com/tal-tech/go-zero/blob/master/rest/handler/prometheushandler.go

var (
	metricServerReqDur = metric.NewHistogramVec(&metric.HistogramVecOpts{
		...
		// label names attached to each observation
		Labels: []string{"path"},
		// upper bounds of the histogram buckets, in milliseconds
		Buckets: []float64{5, 10, 25, 50, 100, 250, 500, 1000},
	})

	metricServerReqCodeTotal = metric.NewCounterVec(&metric.CounterVecOpts{
		...
		// label names; recording a sample is just a call to Inc()
		Labels: []string{"path", "code"},
	})
)

func PromethousHandler(path string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// time at which the request enters
			startTime := timex.Now()
			cw := &security.WithCodeResponseWriter{Writer: w}
			defer func() {
				// runs when the request returns: report latency and status code
				metricServerReqDur.Observe(int64(timex.Since(startTime)/time.Millisecond), path)
				metricServerReqCodeTotal.Inc(path, strconv.Itoa(cw.Code))
			}()
			// Pass the request on to the remaining middleware and the business logic.
			// Control then returns here, so the metrics cover the complete request
			// (the "onion model").
			next.ServeHTTP(cw, r)
		})
	}
}
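
The "onion model" mentioned in the comment can be seen with a small standard-library sketch: each middleware does its work on the way in, calls the next handler, and its deferred function runs on the way out. The names trace and runChain are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// trace returns a middleware that logs when a request enters and leaves it.
func trace(name string, log *[]string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*log = append(*log, name+" in")
			// The deferred call runs after next.ServeHTTP has returned.
			defer func() { *log = append(*log, name+" out") }()
			next.ServeHTTP(w, r)
		})
	}
}

// runChain sends one request through outer(inner(handler)) and returns the visit order.
func runChain() []string {
	var log []string
	biz := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log = append(log, "handler")
	})
	h := trace("outer", &log)(trace("inner", &log)(biz))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/ping", nil))
	return log
}

func main() {
	// Prints: [outer in inner in handler inner out outer out]
	fmt.Println(runChain())
}
```

Because the deferred function of the outermost layer runs last, a metrics middleware placed there measures the whole request, including every inner layer.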

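The logic is straightforward:

HistogramVec collects request latency:
- The buckets are the latency thresholds set in the options. Each request's duration is counted in the bucket it falls into.
- The result is the latency distribution of each route, which shows developers at a glance where there is room to optimize.
CounterVec counts by the specified labels:
- Labels: []string{"path", "code"}
- The labels act as a tuple. go-zero treats (path, code) as a unit and records how many times each route returned each status code. If there are too many 4xx or 5xx responses, it is time to check the health of your service.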
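
The two responsibilities above can be sketched with the standard library alone. This is not go-zero's code: stats is a hypothetical in-memory stand-in for HistogramVec and CounterVec, statusWriter is a hypothetical stand-in for the response writer that captures the status code, and the buckets here hold per-bucket (non-cumulative) counts to keep the sketch short.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync"
	"time"
)

// statusWriter remembers the status code the wrapped handler writes.
type statusWriter struct {
	http.ResponseWriter
	code int
}

func (w *statusWriter) WriteHeader(code int) {
	w.code = code
	w.ResponseWriter.WriteHeader(code)
}

// stats is a hypothetical in-memory stand-in for HistogramVec plus CounterVec.
type stats struct {
	mu      sync.Mutex
	bounds  []int64            // bucket upper bounds in milliseconds
	buckets map[string][]int64 // path -> per-bucket counts, last slot is +Inf
	codes   map[[2]string]int  // (path, code) -> number of responses
}

func newStats(bounds []int64) *stats {
	return &stats{bounds: bounds, buckets: map[string][]int64{}, codes: map[[2]string]int{}}
}

// record files one finished request under its latency bucket and its (path, code) pair.
func (s *stats) record(path string, code int, ms int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	counts, ok := s.buckets[path]
	if !ok {
		counts = make([]int64, len(s.bounds)+1)
		s.buckets[path] = counts
	}
	i := 0
	for i < len(s.bounds) && ms > s.bounds[i] {
		i++
	}
	counts[i]++
	s.codes[[2]string{path, strconv.Itoa(code)}]++
}

// total returns how many requests were observed for path.
func (s *stats) total(path string) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	var n int64
	for _, c := range s.buckets[path] {
		n += c
	}
	return n
}

// middleware times the request and records it once the inner handler returns.
func (s *stats) middleware(path string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			start := time.Now()
			sw := &statusWriter{ResponseWriter: w, code: http.StatusOK}
			defer func() { s.record(path, sw.code, time.Since(start).Milliseconds()) }()
			next.ServeHTTP(sw, r)
		})
	}
}

// demo sends two successful requests and one failing request through the middleware.
func demo() *stats {
	s := newStats([]int64{5, 10, 25, 50, 100, 250, 500, 1000})
	ok := s.middleware("/ok")(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hi")) // implicit 200
	}))
	bad := s.middleware("/bad")(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusInternalServerError)
	}))
	for i := 0; i < 2; i++ {
		ok.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/ok", nil))
	}
	bad.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/bad", nil))
	return s
}

func main() {
	s := demo()
	fmt.Println(s.codes, s.total("/ok"), s.total("/bad"))
}
```

A mutex keeps the sketch simple; a production metrics library would normally use lock-free counters instead.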

Defining custom metrics

go-zero also provides a thin wrapper around Prometheus metrics so that developers can build their own Prometheus middleware.

Code: https://github.com/tal-tech/go-zero/tree/master/core/metric

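Name	Purpose	Recording function
CounterVec	A plain counter. Use for: QPS statistics	`CounterVec.Inc()` increments the metric by 1
GaugeVec	Records a plain value that can go up or down. Suited to disk capacity and CPU/memory usage	`GaugeVec.Inc()/GaugeVec.Add()` increments by 1 / adds N, which may be negative
HistogramVec	Shows how values are distributed. Suited to request latency and response size	`HistogramVec.Observe(val, labels)` records the value and increments the bucket it falls into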

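A brief look at HistogramVec.Observe():

In the metrics output, each HistogramVec produces three kinds of series:

_count: the number of observations

_sum: the sum of all observed values

_bucket{le=a1}: the number of observations in (-inf, a1]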
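
Because each le bucket counts every observation at or below its bound, the bucket series are cumulative. The sketch below works through this for the bucket bounds used earlier and four made-up observations (3, 7, 30 and 2000 ms); the function name summarize and the sample values are assumptions for illustration only.

```go
package main

import "fmt"

// summarize returns the cumulative bucket counts (one per upper bound plus a
// final +Inf slot), the sum of all observations, and their count.
func summarize(bounds, obs []float64) (cum []uint64, sum float64, count uint64) {
	cum = make([]uint64, len(bounds)+1)
	for _, v := range obs {
		sum += v
		count++
		for i, ub := range bounds {
			if v <= ub {
				cum[i]++ // v belongs to every bucket whose bound is >= v
			}
		}
		cum[len(bounds)]++ // le="+Inf" counts every observation
	}
	return cum, sum, count
}

func main() {
	bounds := []float64{5, 10, 25, 50, 100, 250, 500, 1000}
	cum, sum, count := summarize(bounds, []float64{3, 7, 30, 2000})
	for i, ub := range bounds {
		fmt.Printf("_bucket{le=\"%g\"} %d\n", ub, cum[i])
	}
	fmt.Printf("_bucket{le=\"+Inf\"} %d\n_sum %g\n_count %d\n", cum[len(bounds)], sum, count)
}
```

For these inputs the cumulative counts are 1, 2, 2, 3, 3, 3, 3, 3 and 4 for +Inf, with _sum 2040 and _count 4. The +Inf bucket always equals _count.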

So we can guess that three kinds of data are updated on each observation:
// The Prometheus client mostly updates its counters with atomic operations (add and CAS),
// which is cheaper than taking a Mutex
func (h *histogram) observe(v float64, bucket int) {
	n := atomic.AddUint64(&h.countAndHotIdx, 1)
	hotCounts := h.counts[n>>63]

	if bucket < len(h.upperBounds) {
		// increment the bucket that v falls into
		atomic.AddUint64(&hotCounts.buckets[bucket], 1)
	}
	for {
		oldBits := atomic.LoadUint64(&hotCounts.sumBits)
		newBits := math.Float64bits(math.Float64frombits(oldBits) + v)
		// add v to the running sum
		if atomic.CompareAndSwapUint64(&hotCounts.sumBits, oldBits, newBits) {
			break
		}
	}
	// increment the observation count
	atomic.AddUint64(&hotCounts.count, 1)
}
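
The sum is updated with a compare-and-swap loop because sync/atomic has no float64 add: the float is stored as its raw bits in a uint64, and the update is retried until no other goroutine has changed the value in between. A minimal standalone sketch of that pattern follows; addFloat and parallelSum are hypothetical helper names, not part of the Prometheus client.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"sync/atomic"
)

// addFloat atomically adds delta to the float64 stored as raw bits in *bits.
func addFloat(bits *uint64, delta float64) {
	for {
		old := atomic.LoadUint64(bits)
		next := math.Float64bits(math.Float64frombits(old) + delta)
		// The swap only succeeds if nobody changed the value since the load.
		if atomic.CompareAndSwapUint64(bits, old, next) {
			return
		}
	}
}

// parallelSum adds delta perWorker times from each of workers goroutines.
func parallelSum(workers, perWorker int, delta float64) float64 {
	var bits uint64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				addFloat(&bits, delta)
			}
		}()
	}
	wg.Wait()
	return math.Float64frombits(atomic.LoadUint64(&bits))
}

func main() {
	// 8 goroutines x 1000 additions of 0.5 gives 4000 with no lost updates.
	fmt.Println(parallelSum(8, 1000, 0.5))
}
```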

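So, to define your own metrics:

When generating API code with goctl, specify the middleware to generate: https://zeromicro.github.io/go-zero/middleware.html
In the generated middleware file, write the logic for the metrics you want to collect.
You can also record metrics in your business logic in the same way.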
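
As an illustration of what such custom middleware logic might look like, here is a standard-library sketch of an "in-flight requests" gauge: the value goes up when a request enters and down when it leaves. It only shows the shape of the logic. The inFlight type and observe helper are hypothetical, and in a real go-zero service the counter would presumably be a GaugeVec from the metric package rather than a bare integer.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// inFlight is a hypothetical gauge tracking how many requests are being served.
type inFlight struct{ n int64 }

// middleware increments the gauge on entry and decrements it on exit.
func (g *inFlight) middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&g.n, 1)
		defer atomic.AddInt64(&g.n, -1)
		next.ServeHTTP(w, r)
	})
}

// value returns the current gauge reading.
func (g *inFlight) value() int64 { return atomic.LoadInt64(&g.n) }

// observe serves one request and reports the gauge during and after it.
func observe() (during, after int64) {
	g := &inFlight{}
	h := g.middleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		during = g.value() // read while the request is still in flight
	}))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return during, g.value()
}

func main() {
	during, after := observe()
	fmt.Println("during:", during, "after:", after)
}
```

With a single request the gauge reads 1 inside the handler and 0 once the request has finished.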

Everything above covers the HTTP side. The RPC side works similarly; you can see the design in the interceptors.

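Summary

This article walked through how go-zero collects service metrics. For infrastructure monitoring, Prometheus can do the job by adding the corresponding exporter.

Project address

https://github.com/tal-tech/go-zero

You are welcome to use go-zero and star the project to support us!

The go-zero article series is available on the WeChat official account 「微服務實踐」.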

