Preface
A simple integration of Prometheus and Grafana that covers reporting, collecting, and visualizing metrics.
Prometheus
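Prometheus is a monitoring platform that collects metrics from monitored targets by scraping their HTTP endpoints. In a microservice architecture, its multi-dimensional data collection is very powerful. We start by downloading and installing Prometheus and node_exporter; node_exporter is used to monitor CPU, memory, disk, I/O, and similar host information.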
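After the download finishes, extract the archive and run prometheus.exe as administrator.
Visit http://localhost:9090/
If the Prometheus web UI appears, it started successfully.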
Collecting Metrics in .NET Core
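With Prometheus running, we still need to give it an endpoint from which it can fetch monitoring data. Create a new WebApi project, add the prometheus-net.AspNetCore package, and register the UseMetricServer middleware in Configure.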
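// Requires the prometheus-net.AspNetCore package (using Prometheus;).
public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    // Exposes the /metrics endpoint that Prometheus scrapes.
    app.UseMetricServer();
}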
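Start the project and visit http://localhost:5000/metrics
to see some basic monitoring information, including the thread count, the handle count, and the collection counts for the three GC generations.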
# HELP process_num_threads Total number of threads
# TYPE process_num_threads gauge
process_num_threads 29
# HELP process_working_set_bytes Process working set
# TYPE process_working_set_bytes gauge
process_working_set_bytes 44441600
# HELP process_private_memory_bytes Process private memory size
# TYPE process_private_memory_bytes gauge
process_private_memory_bytes 69660672
# HELP dotnet_total_memory_bytes Total known allocated memory
# TYPE dotnet_total_memory_bytes gauge
dotnet_total_memory_bytes 2464584
# HELP dotnet_collection_count_total GC collection count
# TYPE dotnet_collection_count_total counter
dotnet_collection_count_total{generation="1"} 0
dotnet_collection_count_total{generation="0"} 0
dotnet_collection_count_total{generation="2"} 0
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1592448124.2853072
# HELP process_open_handles Number of open handles
# TYPE process_open_handles gauge
process_open_handles 413
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 2225187631104
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.171875
HELP is a description of the metric, and TYPE is the metric's type (for example gauge or counter).
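To make the role of the TYPE comment lines concrete, here is a minimal sketch that reads them out of a scrape. The class and method names (ExpositionDemo, ReadMetricTypes) are made up for illustration and are not part of prometheus-net; the sample text is copied from the output above.

```csharp
using System;
using System.Collections.Generic;

public static class ExpositionDemo
{
    // Collects the "# TYPE <metric name> <metric type>" comment lines
    // from a Prometheus text exposition and maps name -> type.
    public static Dictionary<string, string> ReadMetricTypes(string exposition)
    {
        var types = new Dictionary<string, string>();
        foreach (var rawLine in exposition.Split('\n'))
        {
            var line = rawLine.Trim();
            if (!line.StartsWith("# TYPE ", StringComparison.Ordinal))
            {
                continue; // HELP lines and sample lines are skipped here
            }

            var parts = line.Split(' ', StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length >= 4)
            {
                types[parts[2]] = parts[3];
            }
        }
        return types;
    }

    public static void Main()
    {
        const string sample =
            "# HELP process_num_threads Total number of threads\n" +
            "# TYPE process_num_threads gauge\n" +
            "process_num_threads 29\n" +
            "# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.\n" +
            "# TYPE process_cpu_seconds_total counter\n" +
            "process_cpu_seconds_total 1.171875\n";

        foreach (var pair in ReadMetricTypes(sample))
        {
            Console.WriteLine($"{pair.Key} is a {pair.Value}");
        }
    }
}
```

A gauge can go up and down (thread count), while a counter only increases (CPU seconds), which is why Prometheus needs the type to interpret each series.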
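An HTTP application should of course also have HTTP monitoring and request counting. Adding the UseHttpMetrics middleware is all it takes to monitor and count HTTP requests. Note that UseHttpMetrics is best placed between UseRouting and UseEndpoints.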
public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    app.UseMetricServer();
    app.UseRouting();
    // Placed after UseRouting and before UseEndpoints.
    app.UseHttpMetrics();
    app.UseEndpoints(endpoints => { endpoints.MapControllers(); });
}
Start the project and visit http://localhost:5000/metrics again.
# HELP http_requests_in_progress The number of requests currently in progress in the ASP.NET Core pipeline. One series without controller/action label values counts all in-progress requests, with separate series existing for each controller-action pair.
# TYPE http_requests_in_progress gauge
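The HTTP metric is now present. Send a few requests to the service to see the effect: it records the total time spent, the total number of requests, and how long the requests took.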
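The data above alone may still not be enough to track down some of the stranger problems. In that case we can collect runtime data as well, and doing so is just as simple. Add the prometheus-net.DotNetRuntime package, which exposes the following metrics: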
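- Garbage collection frequency and duration
- Heap size used by the service
- Bytes allocated on the object heaps
- JIT compilations and the JIT CPU consumption ratio
- Thread pool size, scheduling delay, and the reasons for growing or shrinking
- Lock contention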
We only need to start the collector in the Main method of Program.
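// Requires the prometheus-net.DotNetRuntime package (using Prometheus.DotNetRuntime;).
public static void Main(string[] args)
{
    // Start the runtime collectors before the host starts.
    DotNetRuntimeStatsBuilder.Default().StartCollecting();
    CreateHostBuilder(args).Build().Run();
}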
Start the project and visit http://localhost:5000/metrics again to test it.
# HELP dotnet_collection_count_total GC collection count
# TYPE dotnet_collection_count_total counter
dotnet_collection_count_total{generation="1"} 0
dotnet_collection_count_total{generation="0"} 0
dotnet_collection_count_total{generation="2"} 0
# HELP process_private_memory_bytes Process private memory size
# TYPE process_private_memory_bytes gauge
process_private_memory_bytes 75141120
# HELP dotnet_gc_pause_ratio The percentage of time the process spent paused for garbage collection
# TYPE dotnet_gc_pause_ratio gauge
dotnet_gc_pause_ratio 0
# HELP http_requests_received_total Provides the count of HTTP requests that have been processed by the ASP.NET Core pipeline.
# TYPE http_requests_received_total counter
# HELP dotnet_gc_collection_seconds The amount of time spent running garbage collections
# TYPE dotnet_gc_collection_seconds histogram
dotnet_gc_collection_seconds_sum 0
dotnet_gc_collection_seconds_count 0
dotnet_gc_collection_seconds_bucket{le="0.001"} 0
dotnet_gc_collection_seconds_bucket{le="0.01"} 0
dotnet_gc_collection_seconds_bucket{le="0.05"} 0
dotnet_gc_collection_seconds_bucket{le="0.1"} 0
dotnet_gc_collection_seconds_bucket{le="0.5"} 0
dotnet_gc_collection_seconds_bucket{le="1"} 0
dotnet_gc_collection_seconds_bucket{le="10"} 0
dotnet_gc_collection_seconds_bucket{le="+Inf"} 0
# HELP dotnet_total_memory_bytes Total known allocated memory
# TYPE dotnet_total_memory_bytes gauge
dotnet_total_memory_bytes 4925936
# HELP dotnet_threadpool_num_threads The number of active threads in the thread pool
# TYPE dotnet_threadpool_num_threads gauge
dotnet_threadpool_num_threads 0
# HELP dotnet_threadpool_scheduling_delay_seconds A breakdown of the latency experienced between an item being scheduled for execution on the thread pool and it starting execution.
# TYPE dotnet_threadpool_scheduling_delay_seconds histogram
dotnet_threadpool_scheduling_delay_seconds_sum 0.015556
dotnet_threadpool_scheduling_delay_seconds_count 10
dotnet_threadpool_scheduling_delay_seconds_bucket{le="0.001"} 0
dotnet_threadpool_scheduling_delay_seconds_bucket{le="0.01"} 10
dotnet_threadpool_scheduling_delay_seconds_bucket{le="0.05"} 10
dotnet_threadpool_scheduling_delay_seconds_bucket{le="0.1"} 10
dotnet_threadpool_scheduling_delay_seconds_bucket{le="0.5"} 10
dotnet_threadpool_scheduling_delay_seconds_bucket{le="1"} 10
dotnet_threadpool_scheduling_delay_seconds_bucket{le="10"} 10
dotnet_threadpool_scheduling_delay_seconds_bucket{le="+Inf"} 10
# HELP process_working_set_bytes Process working set
# TYPE process_working_set_bytes gauge
process_working_set_bytes 50892800
# HELP process_num_threads Total number of threads
# TYPE process_num_threads gauge
process_num_threads 32
# HELP dotnet_jit_method_seconds_total Total number of seconds spent in the JIT compiler
# TYPE dotnet_jit_method_seconds_total counter
dotnet_jit_method_seconds_total 0
dotnet_jit_method_seconds_total{dynamic="false"} 0.44558800000000004
dotnet_jit_method_seconds_total{dynamic="true"} 0.004122000000000001
# HELP dotnet_gc_pinned_objects The number of pinned objects
# TYPE dotnet_gc_pinned_objects gauge
dotnet_gc_pinned_objects 0
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1592449942.6063592
# HELP dotnet_gc_heap_size_bytes The current size of all heaps (only updated after a garbage collection)
# TYPE dotnet_gc_heap_size_bytes gauge
# HELP http_request_duration_seconds The duration of HTTP requests processed by an ASP.NET Core application.
# TYPE http_request_duration_seconds histogram
# HELP dotnet_contention_seconds_total The total amount of time spent contending locks
# TYPE dotnet_contention_seconds_total counter
dotnet_contention_seconds_total 0
# HELP dotnet_gc_pause_seconds The amount of time execution was paused for garbage collection
# TYPE dotnet_gc_pause_seconds histogram
dotnet_gc_pause_seconds_sum 0
dotnet_gc_pause_seconds_count 0
dotnet_gc_pause_seconds_bucket{le="0.001"} 0
dotnet_gc_pause_seconds_bucket{le="0.01"} 0
dotnet_gc_pause_seconds_bucket{le="0.05"} 0
dotnet_gc_pause_seconds_bucket{le="0.1"} 0
dotnet_gc_pause_seconds_bucket{le="0.5"} 0
dotnet_gc_pause_seconds_bucket{le="1"} 0
dotnet_gc_pause_seconds_bucket{le="10"} 0
dotnet_gc_pause_seconds_bucket{le="+Inf"} 0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 2225201872896
# HELP dotnet_gc_finalization_queue_length The number of objects waiting to be finalized
# TYPE dotnet_gc_finalization_queue_length gauge
dotnet_gc_finalization_queue_length 0
# HELP dotnet_threadpool_io_num_threads The number of active threads in the IO thread pool
# TYPE dotnet_threadpool_io_num_threads gauge
dotnet_threadpool_io_num_threads 3
# HELP process_open_handles Number of open handles
# TYPE process_open_handles gauge
process_open_handles 436
# HELP dotnet_gc_collection_reasons_total A tally of all the reasons that lead to garbage collections being run
# TYPE dotnet_gc_collection_reasons_total counter
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.890625
# HELP http_requests_in_progress The number of requests currently in progress in the ASP.NET Core pipeline. One series without controller/action label values counts all in-progress requests, with separate series existing for each controller-action pair.
# TYPE http_requests_in_progress gauge
# HELP dotnet_threadpool_adjustments_total The total number of changes made to the size of the thread pool, labeled by the reason for change
# TYPE dotnet_threadpool_adjustments_total counter
# HELP dotnet_jit_cpu_ratio The amount of total CPU time consumed spent JIT'ing
# TYPE dotnet_jit_cpu_ratio gauge
dotnet_jit_cpu_ratio 0.5728901224489797
# HELP process_cpu_count The number of processor cores available to this process.
# TYPE process_cpu_count gauge
process_cpu_count 8
# HELP dotnet_build_info Build information about prometheus-net.DotNetRuntime and the environment
# TYPE dotnet_build_info gauge
dotnet_build_info{version="3.3.1.0",target_framework=".NETCoreApp,Version=v5.0",runtime_version=".NET Core 5.0.0-preview.2.20160.6",os_version="Microsoft Windows 10.0.18363",process_architecture="X64"} 1
# HELP dotnet_jit_method_total Total number of methods compiled by the JIT compiler
# TYPE dotnet_jit_method_total counter
dotnet_jit_method_total{dynamic="false"} 830
dotnet_jit_method_total{dynamic="true"} 30
# HELP dotnet_gc_cpu_ratio The percentage of process CPU time spent running garbage collections
# TYPE dotnet_gc_cpu_ratio gauge
dotnet_gc_cpu_ratio 0
# HELP dotnet_threadpool_scheduled_total The total number of items the thread pool has been instructed to execute
# TYPE dotnet_threadpool_scheduled_total counter
dotnet_threadpool_scheduled_total 16
# HELP dotnet_gc_allocated_bytes_total The total number of bytes allocated on the small and large object heaps (updated every 100KB of allocations)
# TYPE dotnet_gc_allocated_bytes_total counter
dotnet_gc_allocated_bytes_total{gc_heap="soh"} 3008088
dotnet_gc_allocated_bytes_total{gc_heap="loh"} 805392
# HELP dotnet_contention_total The number of locks contended
# TYPE dotnet_contention_total counter
dotnet_contention_total 0
That is a lot of information. When we do not need that many metrics, we can choose which collectors to enable.
public static void Main(string[] args)
{
    DotNetRuntimeStatsBuilder
        .Customize()
        .WithContentionStats()
        .WithJitStats()
        .WithThreadPoolSchedulingStats()
        .WithThreadPoolStats()
        .WithGcStats()
        .StartCollecting();
    CreateHostBuilder(args).Build().Run();
}
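Monitoring JIT, GC, and threads has a small performance cost. We can control the sampling rate through the sampleRate parameter, which takes a SampleEvery enum value.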
public static void Main(string[] args)
{
    DotNetRuntimeStatsBuilder
        .Customize()
        // Sample one out of every 5 events
        .WithContentionStats(sampleRate: SampleEvery.FiveEvents)
        // Sample one out of every 10 events
        .WithJitStats(sampleRate: SampleEvery.TenEvents)
        // Sample one out of every 100 events
        .WithThreadPoolSchedulingStats(sampleRate: SampleEvery.HundredEvents)
        .WithThreadPoolStats()
        .WithGcStats()
        .StartCollecting();
    CreateHostBuilder(args).Build().Run();
}
With these metrics exposed, Prometheus needs to collect them from our API. Just edit the prometheus.yml file and restart Prometheus.
scrape_configs:
  - job_name: mydemo
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - localhost:5000
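Start the API project and Prometheus, select dotnet_collection_count_total in the Prometheus UI, and click Execute. You can see that the API's metrics are being reported normally.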
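The same check can be done without the UI by calling the Prometheus HTTP API. The following is a minimal sketch, assuming Prometheus is reachable at http://localhost:9090/ as set up earlier and that the API project is already being scraped; the class name PrometheusQueryDemo is made up for illustration.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class PrometheusQueryDemo
{
    public static async Task Main()
    {
        // Assumption: Prometheus runs locally on its default port.
        using var http = new HttpClient { BaseAddress = new Uri("http://localhost:9090/") };

        // Instant query for the same metric selected in the UI.
        var query = Uri.EscapeDataString("dotnet_collection_count_total");
        var json = await http.GetStringAsync($"api/v1/query?query={query}");

        using var doc = JsonDocument.Parse(json);
        var result = doc.RootElement.GetProperty("data").GetProperty("result");

        foreach (var series in result.EnumerateArray())
        {
            // "metric" holds the label set of the series.
            var labels = series.GetProperty("metric");
            // "value" is [ <unix timestamp>, "<sample value as a string>" ].
            var value = series.GetProperty("value")[1].GetString();
            Console.WriteLine($"{labels} => {value}");
        }
    }
}
```

If the result array is empty, the target is probably not being scraped yet; the Targets page in the Prometheus UI shows the state of each scrape job.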
Grafana
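Now that Prometheus has data, all that is missing is a good-looking UI to display our metrics. Grafana is an open-source application written in Go for visualizing metric data, and it is currently one of the most popular tools for displaying time-series data. Download the exe installer and run it; if http://localhost:3000/ opens afterwards, the installation succeeded.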
First add a data source: choose Prometheus as the data source type and configure it.
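As an alternative to clicking through the UI, Grafana can also load data sources from a provisioning file at startup. The fragment below is a sketch under assumptions: the data source name is arbitrary, the URL assumes Prometheus runs at http://localhost:9090 as above, and the file would go in the provisioning/datasources directory of your Grafana installation, whose exact location depends on how Grafana was installed.

```yaml
# Hypothetical file: provisioning/datasources/prometheus.yml
apiVersion: 1

datasources:
  - name: Prometheus          # display name, chosen for illustration
    type: prometheus
    access: proxy             # Grafana's backend queries Prometheus
    url: http://localhost:9090
    isDefault: true
```

Grafana reads provisioning files when it starts, so a restart is needed after adding or changing this file.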
Adding a Dashboard
Paste the dashboard JSON into Import via panel json, click Load, choose the data source, and click Import. The dashboard then appears.
You can also add many ready-made dashboards from here: copy a dashboard's ID and import it.