Installing kube-prometheus

Posted by BUG弄潮儿 on 2024-10-19

Download kube-prometheus

wget https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.14.0.tar.gz

Install

tar -zxvf v0.14.0.tar.gz
cd kube-prometheus-0.14.0
kubectl apply --server-side -f manifests/setup
kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring
kubectl apply -f manifests/
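
The steps above can be wrapped in a small script. This is a minimal sketch, assuming the tarball sits in the current directory and that the GitHub tag archive unpacks to kube-prometheus-<tag without the leading "v">. The variable names and the install_kube_prometheus function are hypothetical, not part of kube-prometheus.

```shell
#!/bin/sh
# Sketch: derive the archive and directory names from the release tag.
TAG=v0.14.0
ARCHIVE="$TAG.tar.gz"
# Assumption: the tag archive unpacks to kube-prometheus-<tag without "v">.
SRC_DIR="kube-prometheus-${TAG#v}"

# install_kube_prometheus is a hypothetical wrapper around the commands above.
# Each step runs only if the previous one succeeded.
install_kube_prometheus() {
  tar -zxf "$ARCHIVE" &&
  cd "$SRC_DIR" &&
  kubectl apply --server-side -f manifests/setup &&
  kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring &&
  kubectl apply -f manifests/
}

echo "$SRC_DIR"   # prints kube-prometheus-0.14.0
```

Call install_kube_prometheus to run the whole sequence. If in doubt about the directory name, tar -tzf v0.14.0.tar.gz | head -1 shows the top-level directory without extracting anything.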

Check the installation

kubectl get pod,svc -n monitoring -o wide
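
Instead of re-running the listing until every pod shows Running, kubectl wait can block until the pods report Ready. A minimal sketch: the helper name wait_monitoring_ready and the 300s default timeout are assumptions, not something the original post uses.

```shell
#!/bin/sh
# wait_monitoring_ready is a hypothetical helper.
# Optional argument: timeout (defaults to 300s, an arbitrary choice).
wait_monitoring_ready() {
  timeout=${1:-300s}
  kubectl wait --for=condition=Ready pods --all -n monitoring --timeout="$timeout"
}
```

Running wait_monitoring_ready 600s would allow more time for slow image pulls. A non-zero exit status means at least one pod did not become Ready in time.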

Uninstall

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

Reference

https://www.cnblogs.com/liugp/p/16444580.html

Access the services via NodePort

  • prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.54.1
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9090
    nodePort: 30080 # added
    targetPort: web
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
  selector:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
  sessionAffinity: ClientIP
  • grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 11.2.0
  name: grafana
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: http
    port: 3000
    nodePort: 30081 # added
    targetPort: http
  selector:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
  • alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.27.0
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9093
    nodePort: 30082 # added
    targetPort: web
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
  selector:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
  sessionAffinity: ClientIP
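
Once the three files are edited, they still have to be applied. The sketch below assumes the edited files live under manifests/ with the names listed above and that it is run from the directory containing manifests/. Both function names are hypothetical, and the node address is a placeholder. The ports 30080 to 30082 fall inside the default NodePort range of 30000 to 32767.

```shell
#!/bin/sh
# apply_nodeport_services is a hypothetical wrapper; run it from the
# directory that contains manifests/.
apply_nodeport_services() {
  kubectl apply \
    -f manifests/prometheus-service.yaml \
    -f manifests/grafana-service.yaml \
    -f manifests/alertmanager-service.yaml
}

# print_urls lists the resulting endpoints for one node address.
print_urls() {
  printf 'Prometheus:   http://%s:30080\n' "$1"
  printf 'Grafana:      http://%s:30081\n' "$1"
  printf 'Alertmanager: http://%s:30082\n' "$1"
}

# 192.0.2.10 is a documentation placeholder; substitute a real node IP.
print_urls 192.0.2.10
```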

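After changing the Service type from ClusterIP to NodePort, the services may still be unreachable at nodeIP:port.

The fix is to delete the NetworkPolicy objects in the monitoring namespace, which removes the ingress restrictions they place on the pods: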

kubectl delete networkpolicy --all -n monitoring
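
Deleting every policy is a blunt fix, so it may be worth saving them first. A minimal sketch, in which the function name and the backup file name are assumptions:

```shell
#!/bin/sh
# backup_and_delete_netpols is a hypothetical helper.
# Optional argument: backup file (defaults to networkpolicy-backup.yaml).
backup_and_delete_netpols() {
  out=${1:-networkpolicy-backup.yaml}
  # Save the current policies, then delete them only if the export succeeded.
  kubectl get networkpolicy -n monitoring -o yaml > "$out" &&
  kubectl delete networkpolicy --all -n monitoring
}
```

The release manifests also contain the NetworkPolicy definitions, so running kubectl apply -f manifests/ again later will probably recreate them.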

Replace images with mirrors reachable from mainland China

cd kube-prometheus-0.14.0/manifests

grep -riE 'ghcr.io/|registry.k8s.io/|quay.io|k8s.gcr|grafana/' *

Replace grafana/grafana:11.2.0 with swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/grafana/grafana:11.2.0

Replace registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.12.0 with swr.cn-north-4.myhuaweicloud.com/ddn-k8s/registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.12.0

Replace ghcr.io/jimmidyson/configmap-reload:v0.13.1 with swr.cn-east-3.myhuaweicloud.com/kubesre/ghcr.io/jimmidyson/configmap-reload:v0.13.1

Replace registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0 with swr.cn-north-4.myhuaweicloud.com/ddn-k8s/registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0
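
The four substitutions above can be scripted with sed. This is a sketch only: rewrite_images is a hypothetical helper, and it assumes each reference appears in the manifests as "image: <reference>". Anchoring on "image: " matters because the mirror paths contain the original references, so an unanchored pattern would prefix them again on a second run.

```shell
#!/bin/sh
# rewrite_images is a hypothetical helper; pass the manifests directory as $1.
rewrite_images() {
  dir=$1
  north=swr.cn-north-4.myhuaweicloud.com/ddn-k8s
  east=swr.cn-east-3.myhuaweicloud.com/kubesre
  for f in "$dir"/*.yaml; do
    [ -f "$f" ] || continue
    # Write to a temporary file and move it back, which avoids the
    # GNU/BSD differences of sed -i.
    sed \
      -e "s|image: grafana/grafana:11\.2\.0|image: $north/docker.io/grafana/grafana:11.2.0|" \
      -e "s|image: registry\.k8s\.io/prometheus-adapter/prometheus-adapter:v0\.12\.0|image: $north/registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.12.0|" \
      -e "s|image: ghcr\.io/jimmidyson/configmap-reload:v0\.13\.1|image: $east/ghcr.io/jimmidyson/configmap-reload:v0.13.1|" \
      -e "s|image: registry\.k8s\.io/kube-state-metrics/kube-state-metrics:v2\.13\.0|image: $north/registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0|" \
      "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
}
```

From inside the manifests directory, rewrite_images . applies the changes. Running the grep above again shows which references remain untouched.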

List the custom resource definitions (CRDs)

kubectl get customresourcedefinitions.apiextensions.k8s.io | grep monitoring.coreos.com 

Define alerting rules

https://developer.aliyun.com/article/1046908

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: custom-rule
  namespace: monitoring
spec:
  groups:
  - name: disk
    rules:
    - alert: diskFree
      annotations:
        summary: "{{ $labels.job }} instance {{ $labels.instance }}: disk usage is above 80%"
        description: "{{ $labels.instance }} {{ $labels.mountpoint }} disk usage is above 80% (current value: {{ $value }}%), please handle it promptly"
      expr: |
        (1-(node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint!="/boot"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint!="/boot"}) )*100 > 80
      for: 1m
      labels:
        level: disaster
        severity: warning
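
To make the threshold concrete, the arithmetic in expr can be reproduced outside Prometheus. The sketch below computes (1 - free/size) * 100 for a pair of byte counts. disk_used_pct is a hypothetical helper and the numbers are made up for illustration.

```shell
#!/bin/sh
# disk_used_pct <free_bytes> <size_bytes>
# Mirrors the rule's expression: (1 - free / size) * 100.
disk_used_pct() {
  awk -v free="$1" -v size="$2" 'BEGIN { printf "%.1f\n", (1 - free / size) * 100 }'
}

disk_used_pct 15 100   # prints 85.0
disk_used_pct 50 200   # prints 75.0
```

A filesystem with 15 of 100 units free is 85% used and matches the expression, while 50 of 200 free is 75% used and does not. The alert fires only after the condition has held for the 1m given in "for".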

https://gitee.com/crazywjj/K8S/blob/main/promethues/prometheus告警規則.md

References

https://cloud.tencent.cn/developer/article/2327634
https://docker.aityp.com/manage/add
https://github.com/prometheus-operator/kube-prometheus
https://www.cnblogs.com/huangjiabobk/p/18126130
