Prometheus: Docker Monitoring and Alerting Series (Part 1)

Posted by 2遠 on 2018-09-14

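This series shows how to build Docker monitoring with prometheus + cadvisor + alertmanager, with the main goal of detecting whether specified Docker containers have gone down.

This first part covers deploying Prometheus and its basic usage.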

1. Deployment environment

macOS

2. Download the macOS build of Prometheus

Go to the download page and select darwin as the operating system.

prometheus.io/download/

3. Extract the archive, enter the directory, and run

./prometheus --config.file=prometheus.yml

4. Browse the monitoring pages to see what Prometheus reports about itself

Graph page: http://localhost:9090/graph
Metrics page: http://localhost:9090/metrics
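
The metrics page serves plain text, one sample per line, with `# HELP` and `# TYPE` comment lines in between. As a rough illustration of how that text can be read, here is a minimal Python sketch. The function name `parse_metrics` and the sample values are made up for this example, and the parser is simplified (it does not unescape label values), so treat it as a sketch rather than a full implementation of the format.

```python
import re

# Matches lines such as: name{label="value",...} 1.5   (the label part is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse metrics text into (name, labels, value) tuples, skipping # lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no samples
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Hypothetical excerpt; real values on your machine will differ.
SAMPLE_TEXT = """\
# HELP prometheus_target_interval_length_seconds Actual intervals between scrapes.
# TYPE prometheus_target_interval_length_seconds summary
prometheus_target_interval_length_seconds{interval="15s",quantile="0.5"} 15.0
prometheus_target_interval_length_seconds_count{interval="15s"} 42
"""

for name, labels, value in parse_metrics(SAMPLE_TEXT):
    print(name, labels, value)
```

To try it against the running server, the text of http://localhost:9090/metrics could be fetched with `urllib.request.urlopen` and passed to `parse_metrics`.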

1) View the Console output

Switch to the Console tab, enter an expression, click Execute, and check the result.

prometheus_target_interval_length_seconds

count(prometheus_target_interval_length_seconds)

For more expressions, see the expression language documentation.
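
The same expressions can also be sent to Prometheus over its HTTP API at `/api/v1/query` instead of being typed into the web UI. The Python sketch below builds the request URL and reads an instant-query response. The helper names `build_query_url` and `vector_values` and the sample response are assumptions made for this illustration; the actual network call is left as a comment so the block runs without a server.

```python
import json
from urllib.parse import urlencode


def build_query_url(expr, base="http://localhost:9090"):
    """Return the instant-query URL for a PromQL expression."""
    return base + "/api/v1/query?" + urlencode({"query": expr})


def vector_values(response_body):
    """Extract (labels, value) pairs from an instant-vector JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed: %r" % payload.get("error"))
    # Each result item carries "metric" (labels) and "value" ([timestamp, "value"]).
    return [(item["metric"], float(item["value"][1]))
            for item in payload["data"]["result"]]


url = build_query_url("count(prometheus_target_interval_length_seconds)")
print(url)

# Hypothetical response body; with Prometheus running it could be fetched via
#   urllib.request.urlopen(url).read()
SAMPLE_RESPONSE = (
    '{"status":"success","data":{"resultType":"vector",'
    '"result":[{"metric":{},"value":[1536900000.0,"5"]}]}}'
)
print(vector_values(SAMPLE_RESPONSE))
```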

2) View the graph output

Switch to the Graph tab, enter an expression, click Execute, and check the result.

rate(prometheus_tsdb_head_chunks_created_total[1m])
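
To get a feel for what `rate(...[1m])` reports, the Python sketch below computes a per-second increase from counter samples inside a window. It is a simplification under stated assumptions: the function name `simple_rate` and the sample numbers are made up, and the real `rate()` additionally extrapolates toward the window boundaries, so its results will differ slightly from this calculation.

```python
def simple_rate(samples):
    """Approximate per-second rate of a counter.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Returns None when there are fewer than two samples or no time has elapsed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset (for example after a restart),
        # so the new value is counted as growth from zero.
        increase += cur if cur < prev else cur - prev
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical counter samples covering one minute.
window = [(0, 100), (15, 130), (30, 160), (45, 190), (60, 220)]
print(simple_rate(window))
```

Because a counter only ever increases (apart from resets), dividing its growth by the elapsed time is what turns a running total such as `prometheus_tsdb_head_chunks_created_total` into a "chunks per second" curve on the graph.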

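5. Run a few experiments

1) Make sure a Go development environment is installed with its environment variables configured, and that the Go protobuf dependency is up to date.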

# Fetch the client library code and compile example.
git clone https://github.com/prometheus/client_golang.git
cd client_golang/examples/random
go get -d
go build

# Start 3 example targets in separate terminals:
./random -listen-address=:8080
./random -listen-address=:8081
./random -listen-address=:8082

Next, open each of the following URLs to view its metrics:

http://localhost:8080/metrics
http://localhost:8081/metrics
http://localhost:8082/metrics

2) Edit prometheus.yml

scrape_configs:
  - job_name:       'example-random'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:8080', 'localhost:8081']
        labels:
          group: 'production'

      - targets: ['localhost:8082']
        labels:
          group: 'canary'

Open http://localhost:9090/, enter the following expression as a filter, and check the Console tab:

rpc_durations_seconds
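
Each series returned for this expression carries a `job` label, an `instance` label, and the `group` label from the configuration. The Python sketch below shows how the `static_configs` entries of the `example-random` job map to those label sets. The function name `expand_targets` is made up for this illustration, and the configuration is written as a Python dict because the standard library has no YAML parser.

```python
def expand_targets(job_name, static_configs):
    """Return one label dict per target, mirroring how scraped series are labelled."""
    series_labels = []
    for config in static_configs:
        for target in config["targets"]:
            # Prometheus sets job from job_name and, by default, instance from
            # the target address; the extra labels come from the config entry.
            labels = {"job": job_name, "instance": target}
            labels.update(config.get("labels", {}))
            series_labels.append(labels)
    return series_labels


# The static_configs section of the example-random job, as a dict.
STATIC_CONFIGS = [
    {"targets": ["localhost:8080", "localhost:8081"],
     "labels": {"group": "production"}},
    {"targets": ["localhost:8082"],
     "labels": {"group": "canary"}},
]

for labels in expand_targets("example-random", STATIC_CONFIGS):
    print(labels)
```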

3) Produce monitoring output from a given rule

Create prometheus.rules.yml:

groups:
- name: example
  rules:
  - record: job_service:rpc_durations_seconds_count:avg_rate5m
    expr: avg(rate(rpc_durations_seconds_count[5m])) by (job, service)

Edit prometheus.yml so that it references prometheus.rules.yml:

global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.

  # Attach these extra labels to all timeseries collected by this Prometheus instance.
  external_labels:
    monitor: 'codelab-monitor'

rule_files:
  - 'prometheus.rules.yml'

scrape_configs:
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']

  - job_name:       'example-random'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:8080', 'localhost:8081']
        labels:
          group: 'production'

      - targets: ['localhost:8082']
        labels:
          group: 'canary'

In the Console tab at localhost:9090, view the aggregated data produced by the new rule:

job_service:rpc_durations_seconds_count:avg_rate5m
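
The recording rule stores the result of `avg(...) by (job, service)`: the per-instance rates are grouped by the `job` and `service` labels, and every other label (such as `instance` and `group`) is dropped. The Python sketch below mimics that grouping step. The function name `avg_by` and the rate values are invented for this illustration; only the label names come from the configuration above.

```python
from collections import defaultdict


def avg_by(series, keys):
    """Average values per combination of the given label keys.

    series: list of (labels_dict, value) pairs.
    A missing label is treated as an empty string.
    """
    groups = defaultdict(list)
    for labels, value in series:
        group_key = tuple((key, labels.get(key, "")) for key in keys)
        groups[group_key].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical per-instance 5-minute rates of rpc_durations_seconds_count.
RATES = [
    ({"job": "example-random", "service": "uniform",
      "instance": "localhost:8080", "group": "production"}, 10.0),
    ({"job": "example-random", "service": "uniform",
      "instance": "localhost:8081", "group": "production"}, 20.0),
    ({"job": "example-random", "service": "uniform",
      "instance": "localhost:8082", "group": "canary"}, 30.0),
    ({"job": "example-random", "service": "normal",
      "instance": "localhost:8080", "group": "production"}, 4.0),
]

for key, average in avg_by(RATES, ["job", "service"]).items():
    print(dict(key), average)
```

Precomputing this aggregate as a recorded series means later queries can read `job_service:rpc_durations_seconds_count:avg_rate5m` directly instead of re-evaluating the full expression each time.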

Reference: prometheus.io/docs/promet…

Continue reading:

Prometheus: Docker Monitoring and Alerting Series (Part 2)

Prometheus: Docker Monitoring and Alerting Series (Part 3)