Prometheus installation path:
/usr/local/devops/prometheus
- Add a user group
groupadd prometheus
- Add a user
useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus
- Create the service
vim /etc/systemd/system/prometheus.service
[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/devops/prometheus/prometheus --config.file=/usr/local/devops/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus
Restart=on-failure
[Install]
WantedBy=multi-user.target
- Download and unpack node_exporter, then create a service for it
vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/devops/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
By default, Node Exporter serves its metrics at http://IP:9100/metrics. Add the following under scrape_configs in prometheus.yml:
  - job_name: 'node1-metrics'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: node1
Restart the Prometheus service.
Prometheus collects data by pulling, so an exporter has to expose its data in the format Prometheus expects to scrape. You can inspect that data with curl ip:port/metrics, for example:
curl 127.0.0.1:9100/metrics
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.55175897068e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.16535296e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes -1
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
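Take the first two lines as an example: a line that starts with # is a comment (annotation), process_start_time_seconds is the metric name, and what follows the name is its value.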
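To make the line structure concrete, here is a minimal Python sketch that classifies lines of the text format. parse_line is a hypothetical helper written for this post, not part of any Prometheus library, and it assumes samples without labels or timestamps, like the ones in the curl output above.

```python
def parse_line(line):
    """Classify one line of the Prometheus text format (labels are not handled)."""
    line = line.strip()
    if not line:
        return None
    if line.startswith("#"):
        # "# HELP <name> <text>" or "# TYPE <name> <type>"
        parts = line[1:].split(None, 2)
        if len(parts) == 3 and parts[0] in ("HELP", "TYPE"):
            return (parts[0].lower(), parts[1], parts[2])
        return ("comment", None, line[1:].strip())
    # "<metric name> <value>"
    name, value = line.split(None, 1)
    return ("sample", name, float(value))


for raw in [
    "# TYPE process_start_time_seconds gauge",
    "process_start_time_seconds 1.55175897068e+09",
]:
    print(parse_line(raw))
```

The value is a Unix timestamp in seconds, which is why it is so large.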
The Prometheus client libraries provide four basic metric types:
- Counter
- Gauge
- Histogram
- Summary
When you request an exporter's /metrics endpoint, you get output like the following, where HELP describes what the metric measures and TYPE declares its metric type.
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 255.477922
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 312.0
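Going the other way, an exporter only has to print these three kinds of lines. The following Python sketch renders one label-free metric; render_metric is a hypothetical helper for illustration, not a client library function.

```python
def render_metric(name, metric_type, help_text, value):
    """Render one label-free metric as HELP, TYPE and sample lines."""
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
        f"{name} {value}",
    ]
    return "\n".join(lines) + "\n"


print(
    render_metric("process_open_fds", "gauge", "Number of open file descriptors.", 312.0),
    end="",
)
```

Run as is, it reproduces the process_open_fds lines shown above.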
Counter
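A Counter works like a tally counter. It suits things such as CPU time, the total number of API requests, or the number of exceptions raised. What these metrics have in common is that they only go up and never down (apart from resetting to zero when the process restarts). So to work out CPU usage, we apply the rate() function, which computes the per-second average rate of increase of the Counter, for each time series, over a past time window.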
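The idea behind rate() can be sketched in a few lines of Python. This is a simplification under stated assumptions, not how Prometheus implements it: the real rate() also extrapolates to the edges of the window. average_rate and the sample values are made up for illustration.

```python
def average_rate(samples):
    """Per-second average increase of a counter.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. the process restarted),
        # so the new value is counted as an increase from zero.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical scrapes of process_cpu_seconds_total taken 30 s apart.
cpu_samples = [(0, 255.0), (30, 256.5), (60, 258.0)]
print(average_rate(cpu_samples))
```

Here the process used 3 CPU seconds in 60 seconds of wall time, so the rate is 0.05, i.e. about 5% of one core.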
Gauge
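A Gauge is a value that can go up as well as down. (The literal Chinese translation of "gauge" reads too much like that of "counter", so I personally prefer to think of it as a dashboard dial.) That makes Gauge a good fit for metrics such as current memory usage, current CPU usage, current temperature, or current speed.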
Histogram
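A Histogram is fairly straightforward. It is mostly used to record how values are distributed: it counts how many observations fall within each configured range (bucket), and it also provides the sum of all observed values.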
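Here is a minimal Python sketch of that bookkeeping, assuming the cumulative "less than or equal" buckets that Prometheus exposes. The Histogram class is written for this post and is not the client library's implementation; the bucket bounds are the ones used in the PHP middleware later on, and the observed durations are invented.

```python
import bisect


class Histogram:
    """Counts observations per bucket and keeps a running sum and count."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is the +Inf bucket
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        # Index of the first upper bound that is >= value ("le" semantics).
        idx = bisect.bisect_left(self.bounds, value)
        self.counts[idx] += 1
        self.sum += value
        self.count += 1

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs, as exposed in *_bucket lines."""
        out, running = [], 0
        for bound, n in zip(self.bounds + [float("inf")], self.counts):
            running += n
            out.append((bound, running))
        return out


h = Histogram([0.005, 0.05, 0.1, 0.5, 1.5, 10])
for seconds in (0.02, 0.03, 0.01, 0.034):  # hypothetical request durations
    h.observe(seconds)
print(h.cumulative(), h.count, h.sum)
```

Because the buckets are cumulative, every bucket from 0.05 upwards reports all four observations.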
Summary
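A Summary is similar to a Histogram. It tracks the total number of observations and the sum of all observed values, and it can also report quantiles computed over a sliding time window.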
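A rough Python sketch of the Summary idea follows. It keeps the raw observations of the window in memory and uses a nearest-rank quantile, which is an assumption made for brevity: real client libraries use streaming estimators instead. The Summary class and its values are hypothetical.

```python
import math
from collections import deque


class Summary:
    """Total count and sum, plus quantiles over the last window_seconds."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.recent = deque()  # (timestamp, value) pairs still inside the window
        self.count = 0
        self.sum = 0.0

    def observe(self, value, now):
        self.count += 1
        self.sum += value
        self.recent.append((now, value))
        self._expire(now)

    def _expire(self, now):
        # Drop observations that have fallen out of the sliding window.
        while self.recent and self.recent[0][0] <= now - self.window:
            self.recent.popleft()

    def quantile(self, q, now):
        self._expire(now)
        values = sorted(v for _, v in self.recent)
        if not values:
            return None
        rank = max(1, math.ceil(q * len(values)))  # nearest-rank method
        return values[rank - 1]


s = Summary(window_seconds=60)
s.observe(0.1, now=0)
s.observe(0.2, now=30)
s.observe(0.4, now=70)
print(s.count, s.sum, s.quantile(0.5, now=70))
```

Note that count and sum cover everything ever observed, while the quantile only sees the observations still inside the window.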
Writing a Prometheus exporter as a PSR-15 middleware in PHP
Install the Prometheus PHP client
composer require jimdo/prometheus_client_php
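use Prometheus\CollectorRegistry;
use Prometheus\RenderTextFormat;
use Prometheus\Storage\Redis;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;
// Assumption: TextResponse is the Diactoros PSR-7 text response; adjust to your framework.
use Zend\Diactoros\Response\TextResponse;

class Prometheus implements MiddlewareInterface
{
    /**
     * @param ServerRequestInterface $request
     * @param RequestHandlerInterface $handler
     * @return ResponseInterface
     * @throws \Prometheus\Exception\MetricsRegistrationException
     */
    public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
    {
        $start = microtime(true);
        $uri = $request->getUri()->getPath();
        // Metrics live in Redis so that they survive across PHP requests.
        $registry = new CollectorRegistry(new Redis());

        // Export: serve everything collected so far in the Prometheus text format.
        if ($uri === '/metrics') {
            $render = new RenderTextFormat();
            $metrics = $render->render($registry->getMetricFamilySamples());
            $response = new TextResponse($metrics, 200);
            // PSR-7 responses are immutable, so return the instance withHeader() creates.
            return $response->withHeader('Content-Type', RenderTextFormat::MIME_TYPE);
        }

        $response = $handler->handle($request);
        $duration = microtime(true) - $start;
        $statusCode = $response->getStatusCode();
        // The 'context' attribute and getRoutes() come from the author's framework.
        $context = $request->getAttribute('context');
        $routes = $context->getRoutes();
        $method = $request->getMethod();
        $labels = ['status_code', 'method', 'route'];

        // Register each metric once, outside the loop: registering the same
        // metric twice on one registry throws MetricsRegistrationException.
        $counter = $registry->registerCounter(
            'knight',
            'knight_request_total',
            'Total number of HTTP requests',
            $labels
        );
        $histogram = $registry->registerHistogram(
            'knight',
            'knight_request_duration_seconds',
            'duration histogram of http responses',
            $labels,
            [0.005, 0.05, 0.1, 0.5, 1.5, 10]
        );

        foreach ($routes as $route) {
            // Values must follow the order of $labels: status_code, method, route.
            $labelValues = [$statusCode, $method, $route];
            $counter->inc($labelValues);
            $histogram->observe($duration, $labelValues);
        }

        return $response;
    }
}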
curl https://blog.sangsay.com/api/metrics
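Response, as captured by the author. Note that the status_code and method values appear swapped here (status_code="GET", method="200"): label values must be passed to inc() and observe() in the same order as the label names.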
# HELP knight_knight_request_duration_seconds duration histogram of http responses
# TYPE knight_knight_request_duration_seconds histogram
knight_knight_request_duration_seconds_bucket{status_code="GET",method="200",route="/posts",le="0.005"} 0
knight_knight_request_duration_seconds_bucket{status_code="GET",method="200",route="/posts",le="0.05"} 4
knight_knight_request_duration_seconds_bucket{status_code="GET",method="200",route="/posts",le="0.1"} 4
knight_knight_request_duration_seconds_bucket{status_code="GET",method="200",route="/posts",le="0.5"} 4
knight_knight_request_duration_seconds_bucket{status_code="GET",method="200",route="/posts",le="1.5"} 4
knight_knight_request_duration_seconds_bucket{status_code="GET",method="200",route="/posts",le="10"} 4
knight_knight_request_duration_seconds_bucket{status_code="GET",method="200",route="/posts",le="+Inf"} 4
knight_knight_request_duration_seconds_count{status_code="GET",method="200",route="/posts"} 4
knight_knight_request_duration_seconds_sum{status_code="GET",method="200",route="/posts"} 0.0942027568817144
# HELP knight_knight_request_total Total number of HTTP requests
# TYPE knight_knight_request_total counter
knight_knight_request_total{status_code="GET",method="200",route="/posts"} 4
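The _sum, _count and _bucket series in that response are enough for some quick arithmetic. Here is a small Python sketch using the numbers from the output above; mean_duration and fraction_within are hypothetical helper names.

```python
def mean_duration(total_seconds, count):
    """Average request duration from the *_sum and *_count series."""
    return total_seconds / count if count else None


def fraction_within(bucket_count, total_count):
    """Share of requests that finished within a bucket's upper bound."""
    return bucket_count / total_count if total_count else None


mean = mean_duration(0.0942027568817144, 4)  # *_sum and *_count from the response
print(f"mean duration: {mean * 1000:.1f} ms")
print(f"within 50 ms:  {fraction_within(4, 4):.0%}")  # le="0.05" bucket vs. *_count
```

So the four /posts requests averaged roughly 23.6 ms, and all of them completed within 50 ms.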