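CoreDNS v1.11.3 Kubernetes Add-on: Deployment and Troubleshooting

Posted by Yin Zhengjie (尹正杰) on 2024-12-04

Original URL: https://www.cnblogs.com/yinzhengjie/p/18585364

Author: Yin Zhengjie (尹正杰)

Copyright notice: This is an original work. Reproduction is not permitted; violators will be held legally liable.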

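I. Deploying the CoreDNS add-on
- 1. Approach to deploying the CoreDNS add-on
- 2. Writing the resource manifest
- 3. Verifying that DNS works
II. Troubleshooting the CoreDNS add-on deployment
- 1. Error message
- 2. Root cause analysis
- 3. Solution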

I. Deploying the CoreDNS add-on

1. Approach to deploying the CoreDNS add-on

References:
	 https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/coredns/coredns.yaml.base
	 https://raw.githubusercontent.com/kubernetes/kubernetes/refs/heads/master/cluster/addons/dns/coredns/coredns.yaml.base
	 https://github.com/coredns/coredns 
	 
	1. Download the CoreDNS resource manifest
[root@node-exporter41 ~]# wget https://raw.githubusercontent.com/kubernetes/kubernetes/refs/heads/master/cluster/addons/dns/coredns/coredns.yaml.base


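	2. Fill in the placeholders in coredns.yaml.base
Example command:
sed -e "s/__DNS__SERVER__/10.200.0.254/g" -e "s/__DNS__DOMAIN__/yinzhengjie.com/g" -e "s/__DNS__MEMORY__LIMIT__/300Mi/g" coredns.yaml.base > coredns.yaml

__DNS__DOMAIN__:
	The cluster DNS domain that CoreDNS serves, "yinzhengjie.com" in this example. It should match the clusterDomain configured for kubelet.

__DNS__MEMORY__LIMIT__:
	The memory limit for the CoreDNS container, 300Mi in this example.

__DNS__SERVER__:
	The ClusterIP of the CoreDNS Service, 10.200.0.254 in this example. It must fall within the cluster's Service CIDR and should match the clusterDNS configured for kubelet.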
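
To see what the sed substitution does without touching the real manifest, it can be tried on a cut-down template first. The sketch below is only an illustration: the three-line template and the render_coredns helper are made up for this example, while the placeholder names and the values are the ones used in this article's manifest.

```shell
# Hypothetical helper: render a CoreDNS template by replacing the three placeholders.
# Arguments: <template file> <service IP> <cluster domain> <memory limit>
# Assumption: none of the values contains "/", which is used as the sed delimiter.
render_coredns() {
  sed -e "s/__DNS__SERVER__/$2/g" \
      -e "s/__DNS__DOMAIN__/$3/g" \
      -e "s/__DNS__MEMORY__LIMIT__/$4/g" "$1"
}

# Cut-down stand-in for coredns.yaml.base (not the real file).
tmpl=$(mktemp)
cat > "$tmpl" <<'EOF'
kubernetes __DNS__DOMAIN__ in-addr.arpa ip6.arpa
memory: __DNS__MEMORY__LIMIT__
clusterIP: __DNS__SERVER__
EOF

rendered=$(render_coredns "$tmpl" 10.200.0.254 yinzhengjie.com 300Mi)
echo "$rendered"

# A leftover placeholder means one of the substitutions was missed.
if echo "$rendered" | grep -q '__DNS__'; then
  echo "unreplaced placeholders remain" >&2
fi
rm -f "$tmpl"
```

Running the same grep against the real coredns.yaml after the sed step is a quick way to catch a mistyped placeholder name.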

2. Writing the resource manifest

[root@node-exporter41 ~]# cat coredns.yaml
# __MACHINE_GENERATED_WARNING__

apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
  labels:
      kubernetes.io/cluster-service: "true"
      addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: Reconcile
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: EnsureExists
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
  labels:
      addonmanager.kubernetes.io/mode: EnsureExists
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes yinzhengjie.com in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  # replicas: the upstream template leaves this unset:
  # 1. So that Addon Manager does not reconcile the replicas parameter.
  # 2. The default is 1.
  # 3. It is tuned in real time if DNS horizontal auto-scaling is turned on.
  # This example sets it explicitly to 2.
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                  - key: k8s-app
                    operator: In
                    values: ["kube-dns"]
              topologyKey: kubernetes.io/hostname
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: coredns
        image: registry.k8s.io/coredns/coredns:v1.11.3
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          readOnlyRootFilesystem: true
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.200.0.254
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
[root@node-exporter41 ~]# 

Apply the manifest (for example, kubectl apply -f coredns.yaml), then check that the CoreDNS Pods are running:

[root@node-exporter41 ~]# kubectl get pods -o wide -n kube-system 
NAME                      READY   STATUS    RESTARTS   AGE   IP               NODE              NOMINATED NODE   READINESS GATES
coredns-64cf9f859-ccbrm   1/1     Running   0          7s    10.100.246.203   node-exporter43   <none>           <none>
coredns-64cf9f859-dcdwx   1/1     Running   0          7s    10.100.59.149    node-exporter41   <none>           <none>
[root@node-exporter41 ~]#

3. Verifying that DNS works

[root@node-exporter41 ~]# kubectl get svc,pods -n kube-system 
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE
service/coredns   ClusterIP   10.200.0.254   <none>        53/UDP,53/TCP,9153/TCP   14h

NAME                           READY   STATUS    RESTARTS   AGE
pod/coredns-859664f9d8-2fl7l   1/1     Running   0          89s
pod/coredns-859664f9d8-stdbs   1/1     Running   0          89s
[root@node-exporter41 ~]# 
[root@node-exporter41 ~]# kubectl get svc -A
NAMESPACE          NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
calico-apiserver   calico-api                        ClusterIP   10.200.93.100    <none>        443/TCP                  16h
calico-system      calico-kube-controllers-metrics   ClusterIP   None             <none>        9094/TCP                 15h
calico-system      calico-typha                      ClusterIP   10.200.250.163   <none>        5473/TCP                 16h
default            kubernetes                        ClusterIP   10.200.0.1       <none>        443/TCP                  17h
kube-system        coredns                           ClusterIP   10.200.0.254     <none>        53/UDP,53/TCP,9153/TCP   14h
[root@node-exporter41 ~]# 
[root@node-exporter41 ~]# dig @10.200.0.254 calico-api.calico-apiserver.svc.yinzhengjie.com +short
10.200.93.100
[root@node-exporter41 ~]# 
[root@node-exporter41 ~]# dig @10.200.0.254 calico-typha.calico-system.svc.yinzhengjie.com +short
10.200.250.163
[root@node-exporter41 ~]#
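
The names queried above follow the pattern service.namespace.svc.cluster-domain. As a small sketch, the name can be assembled with a helper instead of being typed by hand; svc_fqdn is a hypothetical function written for this example, and yinzhengjie.com is the cluster domain from the manifest above.

```shell
# Hypothetical helper: build the DNS name of a Service.
# Arguments: <service> <namespace> <cluster domain>
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "$3"
}

# The two Services checked in this article.
svc_fqdn calico-api calico-apiserver yinzhengjie.com
svc_fqdn calico-typha calico-system yinzhengjie.com
```

On a cluster node the result can be passed straight to dig, for example dig @10.200.0.254 "$(svc_fqdn kubernetes default yinzhengjie.com)" +short, which should return the ClusterIP of the kubernetes Service (10.200.0.1 in the listing above) if CoreDNS is healthy.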

II. Troubleshooting the CoreDNS add-on deployment

1. Error message

[FATAL] plugin/loop: Loop (127.0.0.1:36030 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 8244365230594049349.2552766472385065880."

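2. Root cause analysis

CoreDNS has detected a forwarding loop. The Deployment uses dnsPolicy: Default, so the Pod inherits the resolver file that kubelet takes from the node (by default /etc/resolv.conf), and the Corefile forwards unresolved queries to the nameservers listed there. If that file points at a loopback address, typically a local stub resolver such as 127.0.0.53 from systemd-resolved, CoreDNS forwards those queries back to itself and the loop plugin aborts the process.

References: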
  https://coredns.io/plugins/loop#troubleshooting
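
A quick way to check a node for this condition is to look for loopback nameservers in the resolver file that kubelet hands to Pods. The sketch below is a minimal illustration, not part of CoreDNS or kubelet: loopback_nameservers is a hypothetical helper, and the two sample files are made up so the example runs anywhere.

```shell
# Hypothetical helper: print loopback nameservers found in a resolv.conf-style file.
# Returns 0 (and prints them) if any are found, 1 otherwise.
loopback_nameservers() {
  awk '$1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { print $2; found = 1 }
       END { exit (found ? 0 : 1) }' "$1"
}

# Two made-up sample files for demonstration.
stub=$(mktemp)
good=$(mktemp)
printf 'nameserver 127.0.0.53\noptions edns0\n' > "$stub"
printf 'nameserver 223.5.5.5\n' > "$good"

if loopback_nameservers "$stub"; then
  echo "stub file: loop risk"
fi
if ! loopback_nameservers "$good"; then
  echo "good file: no loopback nameserver"
fi
rm -f "$stub" "$good"
```

On a real node the function would be pointed at /etc/resolv.conf, or at whichever file kubelet's resolvConf setting names.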

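3. Solution

	Editing the node's /etc/resolv.conf directly does not help, because the change gets overwritten. Instead, define a separate resolver file and point kubelet at it.

	1. Create the resolver file on all nodes
echo "nameserver 223.5.5.5" > /etc/kubernetes/resolv.conf

	2. Update the kubelet config file on all nodes
# vim /etc/kubernetes/kubelet-conf.yml 
...
resolvConf: /etc/kubernetes/resolv.conf

	3. Restart kubelet on all nodes
systemctl daemon-reload
systemctl restart kubelet

	4. If the existing CoreDNS Pods keep crashing, recreate them so that they pick up the new resolver file, for example:
kubectl -n kube-system rollout restart deployment coredns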
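
Since a typo in the kubelet config silently leaves the old resolver file in use, it can be worth confirming the setting on each node. The following is a rough sketch under stated assumptions: kubelet_resolv_conf is a hypothetical helper, it only handles a plain unquoted top-level resolvConf line, and the sample file merely stands in for /etc/kubernetes/kubelet-conf.yml.

```shell
# Hypothetical helper: print the resolvConf value from a kubelet config file.
# Assumption: the value is a plain, unquoted, top-level "resolvConf: <path>" line.
kubelet_resolv_conf() {
  awk -F': *' '$1 == "resolvConf" { sub(/[ \t]+$/, "", $2); print $2 }' "$1"
}

# Made-up sample standing in for /etc/kubernetes/kubelet-conf.yml.
conf=$(mktemp)
cat > "$conf" <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
resolvConf: /etc/kubernetes/resolv.conf
EOF

kubelet_resolv_conf "$conf"
rm -f "$conf"
```

On a node, running kubelet_resolv_conf /etc/kubernetes/kubelet-conf.yml should print /etc/kubernetes/resolv.conf once step 2 has been applied; empty output suggests the setting is missing.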
