Prometheus Operator (Part 2): Monitoring Kubernetes Components

Published 2020-12-10


By default, Prometheus Operator already monitors most of the cluster, but it cannot monitor kube-controller-manager and kube-scheduler. In this post we add monitoring for those two components, and put Prometheus, Grafana, and Alertmanager behind Traefik so they can be reached through Ingress.

Organize the Manifests

Here we sort the operator manifests into per-component directories:

wget -P /root/ http://down.i4t.com/abcdocker-prometheus-operator.yaml.zip
cd /root/
unzip abcdocker-prometheus-operator.yaml.zip
mkdir kube-prom
cp -a kube-prometheus-master/manifests/* kube-prom/
cd kube-prom/
mkdir -p node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter operator
mv *-serviceMonitor* serviceMonitor/
mv setup operator/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
mv 0prometheus-operator-* operator/
mv 00namespace-namespace.yaml operator/
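A quick sanity check of the resulting layout (the directory names come from the mkdir above):

ls kube-prom/
# adapter  alertmanager  grafana  kube-state-metrics  node-exporter  operator  prometheus  serviceMonitor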
 
 
## The installation order changes accordingly (skip this if it is already installed)
[root@k8s-01 kube-prom]# kubectl apply -f operator/
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created
 
Once the operator Pod is running, apply the rest:
[root@k8s-01 kube-prom]# kubectl -n monitoring get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-69bd579bf9-7kpd7   1/1     Running   0          7s
 
# Remaining steps
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f node-exporter/
kubectl apply -f kube-state-metrics/
kubectl apply -f grafana/
kubectl apply -f prometheus/
kubectl apply -f serviceMonitor/
 
When everything is applied, check that all the resources look healthy:
[root@k8s-01 kube-prom]# kubectl get -n monitoring all
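If you prefer to block until everything is up, a small sketch using kubectl wait:

kubectl -n monitoring wait --for=condition=Ready pod --all --timeout=300s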

Configure Ingress

Traefik needs to be installed first; NodePort access performs poorly, so Traefik is recommended.

See Kubernetes Traefik Ingress for reference.

Environment preparation

First, make sure every Service created by Prometheus Operator is of type ClusterIP; if nothing was changed, ClusterIP is already the default.
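If a Service was switched to NodePort earlier, it can be patched back; a sketch using grafana as the example:

kubectl -n monitoring patch svc grafana -p '{"spec":{"type":"ClusterIP"}}'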

[root@k8s-01 ingress]# kubectl get pod,svc -n monitoring
NAME                                       READY   STATUS    RESTARTS   AGE
pod/alertmanager-main-0                    2/2     Running   0          88s
pod/alertmanager-main-1                    2/2     Running   0          77s
pod/alertmanager-main-2                    2/2     Running   0          69s
pod/grafana-558647b59-mj85j                1/1     Running   0          96s
pod/kube-state-metrics-5bfc7db74d-kpgh2    4/4     Running   0          96s
pod/node-exporter-5kz8x                    2/2     Running   0          94s
pod/node-exporter-jnmr7                    2/2     Running   0          94s
pod/node-exporter-pztln                    2/2     Running   0          93s
pod/node-exporter-ts455                    2/2     Running   0          94s
pod/prometheus-adapter-57c497c557-6tscz    1/1     Running   0          91s
pod/prometheus-k8s-0                       3/3     Running   1          78s
pod/prometheus-k8s-1                       3/3     Running   1          78s
pod/prometheus-operator-69bd579bf9-rrf96   1/1     Running   1          98s
 
NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/alertmanager-main       ClusterIP   10.254.201.109   <none>        9093/TCP            99s
service/alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   89s
service/grafana                 ClusterIP   10.254.19.174    <none>        3000/TCP            97s
service/kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   96s
service/node-exporter           ClusterIP   None             <none>        9100/TCP            95s
service/prometheus-adapter      ClusterIP   10.254.197.151   <none>        443/TCP             93s
service/prometheus-k8s          ClusterIP   10.254.120.188   <none>        9090/TCP            89s
service/prometheus-operated     ClusterIP   None             <none>        9090/TCP            78s
service/prometheus-operator     ClusterIP   None             <none>        8080/TCP            99s

Next we create Ingress resources for the Prometheus UI, Grafana, and Alertmanager.

(These can be split into separate files instead of one if you prefer.)

vim ingress.yaml
 
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ing
  namespace: monitoring
spec:
  rules:
  - host: prometheus.i4t.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana-ing
  namespace: monitoring
spec:
  rules:
  - host: grafana.i4t.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alertmanager-ing
  namespace: monitoring
spec:
  rules:
  - host: alertmanager.i4t.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093
 
 
## host is the domain name; serviceName and servicePort must match the corresponding Service's name and port
[root@k8s-01 ingress]# kubectl apply -f ingress.yaml
ingress.extensions/prometheus-ing created
ingress.extensions/grafana-ing created
ingress.extensions/alertmanager-ing created
 
 
[root@k8s-01 ingress]# kubectl get ingress -n monitoring
NAME               HOSTS                  ADDRESS   PORTS   AGE
alertmanager-ing   alertmanager.i4t.com             80      13s
grafana-ing        grafana.i4t.com                  80      13s
prometheus-ing     prometheus.i4t.com               80      13s
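Before touching DNS, you can test the routes by setting the Host header by hand (192.168.0.10 is an assumption; use any node where traefik listens on port 80):

curl -I -H 'Host: prometheus.i4t.com' http://192.168.0.10/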

We can also see the routes in the Traefik UI.


Next, set up name resolution (I demonstrate here by editing the hosts file).

#mac
➜  ~ sudo vim /etc/hosts
Password:
 
#windows
C:\Windows\System32\drivers\etc
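The entries themselves look like this (192.168.0.10 is an assumption; point the names at a node where traefik is exposed):

192.168.0.10 prometheus.i4t.com
192.168.0.10 grafana.i4t.com
192.168.0.10 alertmanager.i4t.com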



Monitoring Kubernetes Components

Here we can see that Prometheus Operator is not picking up kube-controller-manager and kube-scheduler. My cluster was installed from binaries, so no metrics were collected for these components.


This is because a ServiceMonitor selects Services by label, and we can see that the relevant ServiceMonitors are scoped to the kube-system namespace:

[root@k8s-01 manifests]#  grep -2 selector prometheus-serviceMonitorKube*
prometheus-serviceMonitorKubeControllerManager.yaml-    matchNames:
prometheus-serviceMonitorKubeControllerManager.yaml-    - kube-system
prometheus-serviceMonitorKubeControllerManager.yaml:  selector:
prometheus-serviceMonitorKubeControllerManager.yaml-    matchLabels:
prometheus-serviceMonitorKubeControllerManager.yaml-      k8s-app: kube-controller-manager
--
prometheus-serviceMonitorKubelet.yaml-    matchNames:
prometheus-serviceMonitorKubelet.yaml-    - kube-system
prometheus-serviceMonitorKubelet.yaml:  selector:
prometheus-serviceMonitorKubelet.yaml-    matchLabels:
prometheus-serviceMonitorKubelet.yaml-      k8s-app: kubelet
--
prometheus-serviceMonitorKubeScheduler.yaml-    matchNames:
prometheus-serviceMonitorKubeScheduler.yaml-    - kube-system
prometheus-serviceMonitorKubeScheduler.yaml:  selector:
prometheus-serviceMonitorKubeScheduler.yaml-    matchLabels:
prometheus-serviceMonitorKubeScheduler.yaml-      k8s-app: kube-scheduler
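The deployed objects can also be inspected directly (resource names may differ slightly between kube-prometheus versions):

kubectl -n monitoring get servicemonitors
kubectl -n monitoring get servicemonitor kube-scheduler -o yaml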

But by default kube-system has no Service carrying a matching label:

[root@k8s-01 manifests]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d8h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   38m
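You can confirm this with a label-filtered query (both commands return no Services):

kubectl get svc -n kube-system -l k8s-app=kube-controller-manager
kubectl get svc -n kube-system -l k8s-app=kube-scheduler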

There are, however, Endpoints objects (in my binary installation at least):

[root@k8s-01 manifests]# kubectl get ep -n kube-system
NAME                      ENDPOINTS                                                               AGE
kube-controller-manager   <none>                                                                  31d
kube-dns                  172.30.248.2:53,172.30.72.4:53,172.30.248.2:53 + 3 more...              31d
kube-scheduler            <none>                                                                  31d
kubelet                   192.168.0.10:10255,192.168.0.11:10255,192.168.0.12:10255 + 9 more...    2d8h
kubernetes-dashboard      172.30.232.2:8443                                                       31d
traefik-ingress-service   172.30.232.5:80,172.30.232.5:8080                                       39m

Solution

We create a Service for each of the two control-plane components and give it the label k8s-app: kube-controller-manager / k8s-app: kube-scheduler, so the ServiceMonitors can select it.

First, the Services to bind against:

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager    # the label the ServiceMonitor matches on
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None                       # headless: no virtual IP, only the Endpoints
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler             # the label the ServiceMonitor matches on
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP

Then manually create the matching Endpoints objects; each Endpoints name and port must line up with its Service:

apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
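Assuming the Services are saved as kube-svc.yaml and the Endpoints as kube-ep.yaml (both filenames are mine), apply them:

kubectl apply -f kube-svc.yaml
kubectl apply -f kube-ep.yaml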

Checking the Services, they are now bound to our Endpoints:

[root@k8s-01 test]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-controller-manager   ClusterIP   None             <none>        10252/TCP                     64s
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kube-scheduler            ClusterIP   None             <none>        10251/TCP                     64s
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d9h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   126m
[root@k8s-01 test]# kubectl describe svc -n kube-system kube-scheduler
Name:              kube-scheduler
Namespace:         kube-system
Labels:            k8s-app=kube-scheduler
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kube-scheduler"},"name":"kube-scheduler","namespace"...
Selector:          component=kube-scheduler
Type:              ClusterIP
IP:                None
Port:              http-metrics  10251/TCP
TargetPort:        10251/TCP
Endpoints:         192.168.0.10:10251,192.168.0.11:10251,192.168.0.12:10251
Session Affinity:  None
Events:            <none>

I have three masters here, so kube-scheduler and kube-controller-manager each expose just three targets.
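To confirm the endpoints are actually scrapeable, curl one of the metrics ports directly (192.168.0.10 comes from the Endpoints above; the component must be listening on that address):

curl -s http://192.168.0.10:10251/metrics | head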


For kubeadm clusters the approach below can be used as a reference; I don't have such an environment, so it is not demonstrated here.

apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.16.0.14
    targetRef:
      kind: Node
      name: k8s-n2
  - ip: 172.16.0.18
    targetRef:
      kind: Node
      name: k8s-n3
  - ip: 172.16.0.2
    targetRef:
      kind: Node
      name: k8s-m1
  - ip: 172.16.0.20
    targetRef:
      kind: Node
      name: k8s-n4
  - ip: 172.16.0.21
    targetRef:
      kind: Node
      name: k8s-n5
  ports:
  - name: http-metrics
    port: 10255
    protocol: TCP
  - name: cadvisor
    port: 4194
    protocol: TCP
  - name: https-metrics
    port: 10250
    protocol: TCP

If, after adding the monitoring, Prometheus reports ip:10251 Connection refused:

  • Binary installation

Modify the scheduler's configuration: add --bind-address=0.0.0.0 to its startup file (see the sketch after this list).

  • kubeadm installation

Add the same flag to the component's static Pod manifest. I am not very familiar with kubeadm, so I won't go into more detail here.
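A sketch for the binary case, assuming kube-scheduler runs as a systemd unit (the unit path is an assumption; adjust to your installation):

# Add --bind-address=0.0.0.0 to the ExecStart line in
# /etc/systemd/system/kube-scheduler.service, then:
systemctl daemon-reload
systemctl restart kube-scheduler
# Verify the metrics port now listens on all interfaces:
ss -tlnp | grep 10251

The same change applies to kube-controller-manager on port 10252.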
