K8s Cloud Native: High-Availability Cluster Deployment V1.28.2

Posted by 李天翔 on 2024-08-08

1. Environment Preparation

Cluster role  IP           Hostname        Components installed
master        10.1.16.160  hqiotmaster07l  apiserver, controller-manager, scheduler, kubelet, etcd, docker, kube-proxy, keepalived, nginx, calico
master        10.1.16.161  hqiotmaster08l  apiserver, controller-manager, scheduler, kubelet, etcd, docker, kube-proxy, keepalived, nginx, calico
master        10.1.16.162  hqiotmaster09l  apiserver, controller-manager, scheduler, kubelet, etcd, docker, kube-proxy, keepalived, nginx, calico
worker        10.1.16.163  hqiotnode12l    kubelet, kube-proxy, docker, calico, coredns, ingress-nginx
worker        10.1.16.164  hqiotnode13l    kubelet, kube-proxy, docker, calico, coredns, ingress-nginx
worker        10.1.16.165  hqiotnode14l    kubelet, kube-proxy, docker, calico, coredns, ingress-nginx
vip           10.1.16.202  -               nginx, keepalived

1.1 Server environment initialization

# Required on both control-plane and worker nodes
# 1. Set the hostname (use the matching name on each host)
hostnamectl set-hostname master && bash

# 2. Add hosts entries (append, so the existing localhost entries are preserved)
cat << EOF >> /etc/hosts
10.1.16.160 hqiotmaster07l
10.1.16.161 hqiotmaster08l
10.1.16.162 hqiotmaster09l
10.1.16.163 hqiotnode12l
10.1.16.164 hqiotnode13l
10.1.16.165 hqiotnode14l
EOF

# 3. Disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# 4. Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config

# 5. Disable swap
swapoff -a   # temporary

# Permanent: edit /etc/fstab and comment out the swap entry
vi /etc/fstab
# e.g. comment out this line: /mnt/swap swap swap defaults 0 0
free -m      # confirm that swap now shows 0
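If you prefer not to edit /etc/fstab by hand, the swap entry can also be commented out non-interactively. A minimal sketch (assumes the swap line is not already commented out):

sed -i '/^[^#].*[[:space:]]swap[[:space:]]/s/^/#/' /etc/fstab
swapoff -a && free -m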


# 6. Configure time synchronization on every machine
yum install chrony -y
systemctl start chronyd && systemctl enable chronyd
chronyc sources

# 7. Create the /etc/modules-load.d/containerd.conf configuration file:
cat << EOF > /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
Run the following commands to make the configuration take effect:
modprobe overlay
modprobe br_netfilter

# 8. Pass bridged IPv4 traffic to the iptables chains
cat << EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
user.max_user_namespaces=28633
EOF

# 9. Prerequisites for enabling IPVS for kube-proxy (do not enable IPVS mode if you plan to use Istio)

Next, make sure the ipset package is installed on every node; it is also best to install the ipvsadm management tool, which makes it easier to inspect the IPVS proxy rules.

yum install -y ipset ipvsadm


Since IPVS has long been merged into the mainline kernel, enabling IPVS for kube-proxy only requires loading the following kernel modules:
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4

Run the following script on every node:

cat << EOF > /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

Make it executable and load the modules:

chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4

The script above creates /etc/sysconfig/modules/ipvs.modules, which ensures the required modules are loaded automatically after a node reboot.

If you see the error modprobe: FATAL: Module nf_conntrack_ipv4 not found.

this is because you are on a newer kernel (for example 5.x, while most tutorials assume a 3.x kernel). Newer kernels have replaced nf_conntrack_ipv4 with nf_conntrack, so the correct configuration is as follows.

Run this script on every node instead:

cat <<EOF> /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF

Make it executable and load the modules:

chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack

# 10. Apply the sysctl settings
sysctl --system

2. Base Software Packages

yum install -y gcc gcc-c++ make

yum install wget net-tools vim* nc telnet-server telnet curl openssl-devel libnl3-devel net-snmp-devel zlib zlib-devel pcre-devel openssl openssl-devel

# Increase the shell history size and the SSH idle timeout
vi /etc/profile
HISTSIZE=3000
TMOUT=3600
   
Save and exit, then run:
source /etc/profile

3. Docker Installation

Install the yum utilities:
yum install -y yum-utils

Add the Docker repository:
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

Remove any existing Docker installation:
yum remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-logrotate docker-logrotate docker-selinux docker-engine-selinux docker-engine

yum list installed | grep docker
yum remove -y docker-ce.x86_64

rm -rf /var/lib/docker
rm -rf /etc/docker/

List the available versions:
yum list docker-ce --showduplicates | sort -r
Install the latest version:
yum -y install docker-ce

Install a specific docker-ce version:
yum -y install docker-ce-23.0.3-1.el7

Enable Docker at boot and start it:
systemctl enable docker && systemctl start docker

Upload daemon.json to /etc/docker (a sketch of its contents is shown below), then reload and restart:
systemctl daemon-reload
systemctl restart docker.service
docker info
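A minimal sketch of what /etc/docker/daemon.json might contain; the mirror URL is a placeholder and the insecure registry matches the Harbor address used later in this guide, so adjust both to your environment:

{
  "registry-mirrors": ["https://<your-id>.mirror.aliyuncs.com"],
  "insecure-registries": ["10.1.1.167"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "100m", "max-file": "3" }
}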

Useful Docker commands:
systemctl stop docker
systemctl start docker
systemctl enable docker
systemctl status docker
systemctl restart docker
docker info
docker --version
containerd --version

4. containerd Installation

Download the containerd binary package:
(it can be downloaded on a machine with internet access and then uploaded to the servers)

wget https://github.com/containerd/containerd/releases/download/v1.7.14/cri-containerd-cni-1.7.14-linux-amd64.tar.gz

The tarball is already laid out following the official binary deployment structure; it contains the systemd unit files, containerd itself, and the CNI plugins.
Extract it to the system root directory:
tar -zvxf cri-containerd-cni-1.7.14-linux-amd64.tar.gz -C /

Note: in testing, the runc bundled in cri-containerd-cni-1.7.14-linux-amd64.tar.gz has dynamic-linking problems on CentOS 7,
so download runc separately from its GitHub releases and replace the one installed above:
wget https://github.com/opencontainers/runc/releases/download/v1.1.10/runc.amd64

install -m 755 runc.amd64 /usr/sbin/runc
runc -v

runc version 1.1.10
commit: v1.1.10-0-g18a0cb0f
spec: 1.0.2-dev
go: go1.20.10
libseccomp: 2.5.4

Next, generate the containerd configuration file:
rm -rf /etc/containerd
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml

According to the Container runtimes documentation, on Linux distributions that use systemd as the init system, using systemd as the container cgroup driver keeps the node more stable under resource pressure, so configure containerd on every node to use the systemd cgroup driver.
Modify the /etc/containerd/config.toml file generated above:

sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

# Use the Aliyun registry for the sandbox (pause) image; the default registry.k8s.io is not reachable
sed -i "s#registry.k8s.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri"]
  ...
  # sandbox_image = "k8s.gcr.io/pause:3.6"
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"

# Configure a Harbor private registry
vi /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".registry.configs]
  [plugins."io.containerd.grpc.v1.cri".registry.configs."10.1.1.167".tls]
    insecure_skip_verify = true
  [plugins."io.containerd.grpc.v1.cri".registry.configs."10.1.1.167".auth]
    username = "admin"
    password = "Harbor12345"
    
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."k8s.gcr.io"]
    endpoint = ["https://registry.aliyuncs.com/google_containers"]

  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."10.1.1.167"]
    endpoint = ["https://10.1.1.167"]

# Enable containerd at boot and start it
systemctl daemon-reload
systemctl enable --now containerd && systemctl restart containerd

# Test with crictl; make sure version information is printed without any errors:
crictl version

Version:  0.1.0
RuntimeName:  containerd
RuntimeVersion:  v1.7.14
RuntimeApiVersion:  v1

5. Installing and Configuring Kubernetes

5.1 Kubernetes high-availability design

To explain the high-availability configuration of a Kubernetes cluster clearly, we use the following design.

In this design, keepalived + nginx provide high availability for the kube-apiserver component.

With the old approach, where one master node acts as the primary and the other two masters simply join it, the cluster is not truly highly available: as soon as the primary master goes down, the whole cluster becomes unusable.

[Figure: high-availability architecture diagrams]

5.2 Achieving apiserver high availability with keepalived + nginx

Install and configure Nginx on the three master nodes

yum -y install gcc zlib zlib-devel pcre-devel openssl openssl-devel

tar -zvxf nginx-1.27.0.tar.gz

cd nginx-1.27.0

Configure the build with the required modules:
./configure --prefix=/usr/local/nginx --with-stream --with-http_stub_status_module --with-http_ssl_module

make && make install

ln -s /usr/local/nginx/sbin/nginx /usr/sbin/

nginx -v

cd /usr/local/nginx/sbin/
# start the service
./nginx
# stop the service
./nginx -s stop
# check port 80
netstat -ntulp |grep 80

Create a systemd unit so nginx can be managed as a service:
vi /usr/lib/systemd/system/nginx.service

[Unit]
Description=nginx - high performance web server
After=network.target remote-fs.target nss-lookup.target

[Service]
Type=forking
ExecStart=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
ExecReload=/usr/local/nginx/sbin/nginx -s reload
ExecStop=/usr/local/nginx/sbin/nginx -s stop

[Install]
WantedBy=multi-user.target

Upload nginx.service to /usr/lib/systemd/system, then:
systemctl daemon-reload
systemctl start nginx.service && systemctl enable nginx.service
systemctl status nginx.service

Modify the nginx configuration file (/usr/local/nginx/conf/nginx.conf):

#user  nobody;
worker_processes  1;

#error_log  logs/error.log;
error_log  /var/log/nginx/error.log;
#error_log  logs/error.log  info;

pid        /var/log/nginx/nginx.pid;


events {
    worker_connections  1024;
}

stream { 
 
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent'; 
 
    access_log /var/log/nginx/k8s-access.log main; 
 
    upstream k8s-apiserver { 
            server 10.1.16.160:6443 weight=5 max_fails=3 fail_timeout=30s;
            server 10.1.16.161:6443 weight=5 max_fails=3 fail_timeout=30s;
            server 10.1.16.162:6443 weight=5 max_fails=3 fail_timeout=30s;
 
    }
    server { 
        listen 16443; # nginx runs on the master nodes alongside the apiserver, so this port must not be 6443 or it would conflict
        proxy_pass k8s-apiserver; 
    }

}

http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 4096;

    #gzip  on;

    server {
        listen       8080;
        server_name  localhost;

        location / {
            root   html;
            index  index.html index.htm;
        }

        error_page  404              /404.html;

        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}

Restart nginx.service:
systemctl restart nginx.service
Install and configure Keepalived on the three master nodes

yum install -y curl gcc openssl-devel libnl3-devel net-snmp-devel

yum install -y keepalived

cd /etc/keepalived/

mv keepalived.conf keepalived.conf.bak

vi /etc/keepalived/keepalived.conf

# master node 1 configuration
! Configuration File for keepalived

global_defs {
   router_id NGINX_MASTER
}

vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5 
    weight -5
    fall 2
    rise 1
}

vrrp_instance VI_1 {
    state MASTER
    interface ens192         # network interface name
    mcast_src_ip 10.1.16.160 # this server's IP
    virtual_router_id 51     # VRRP router ID, unique per VRRP instance
    priority 100             # priority
    nopreempt
    advert_int 2             # VRRP advertisement interval, default 1s
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        10.1.16.202/24   # virtual IP (VIP)
    }
    track_script {
        chk_apiserver    # health-check script
    }
}

# master node 2 configuration
! Configuration File for keepalived
global_defs {
   router_id LVS_DEVEL
   script_user root
   enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5 
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens192
    mcast_src_ip 10.1.16.161
    virtual_router_id 51
    priority 99
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        10.1.16.202/24
    }
    track_script {
        chk_apiserver
    }
}

# master node 3 configuration
! Configuration File for keepalived
global_defs {
   router_id LVS_DEVEL
   script_user root
   enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5 
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens192
    mcast_src_ip 10.1.16.162
    virtual_router_id 51
    priority 98
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        10.1.16.202/24
    }
    track_script {
        chk_apiserver
    }
}

# Health-check script
vi /etc/keepalived/check_apiserver.sh

#!/bin/bash
err=0
for k in $(seq 1 3)
do
    check_code=$(pgrep nginx)   # check the local load balancer process (this guide uses nginx, not haproxy)
    if [[ $check_code == "" ]]; then
        err=$(expr $err + 1)
        sleep 1
        continue
    else
        err=0
        break
    fi
done
if [[ $err != "0" ]]; then
    echo "systemctl stop keepalived"
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi

Set permissions:
chmod 755 /etc/keepalived/check_apiserver.sh
chmod 644 /etc/keepalived/keepalived.conf

Start keepalived:
systemctl daemon-reload
systemctl start keepalived && systemctl enable keepalived
systemctl restart keepalived
systemctl status keepalived


# Check the VIP on the master node
[root@master nginx]# ip addr
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:9d:e5:7a brd ff:ff:ff:ff:ff:ff
    altname enp11s0
    inet 10.1.16.160/24 brd 10.1.16.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
    inet 10.1.16.202/24 scope global secondary ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe9d:e57a/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

Test: stop nginx on master and the IP 10.1.16.202 fails over to the master2 server; after restarting nginx and keepalived on master, the IP moves back to master.
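A simple way to watch the failover while you stop nginx (a sketch; interface name and VIP as configured above):

# On the current VIP holder: watch whether the VIP is still bound to ens192
watch -n1 "ip -4 addr show ens192 | grep 10.1.16.202"

# Once the Kubernetes control plane is up (section 5.4), the apiserver should keep
# answering through the VIP on port 16443 during a failover:
curl -k https://10.1.16.202:16443/healthz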

5.3 Deploying Kubernetes with kubeadm

# Install kubeadm and kubelet on every node; create the yum repository:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
        http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum makecache fast

# If these components were installed before, it is best to remove them completely first

# Reset the kubernetes services and the network; delete the network configuration and links
kubeadm reset
systemctl stop kubelet
systemctl stop docker
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig docker0 down
ip link delete cni0
systemctl start docker
systemctl start kubelet

# Remove kubernetes-related packages and files
yum -y remove kubelet kubeadm kubectl
rm -rvf $HOME/.kube
rm -rvf ~/.kube/
rm -rvf /etc/kubernetes/
rm -rvf /etc/systemd/system/kubelet.service.d
rm -rvf /etc/systemd/system/kubelet.service
rm -rvf /usr/bin/kube*
rm -rvf /etc/cni
rm -rvf /opt/cni
rm -rvf /var/lib/etcd
rm -rvf /var/etcd

# List the available kubelet/kubeadm/kubectl versions
yum list kubelet kubeadm kubectl  --showduplicates | sort -r

# Install the Kubernetes packages (required on both masters and workers)
yum install -y  kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2

systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet

Common kubelet service commands:
systemctl enable kubelet
systemctl restart kubelet
systemctl stop kubelet
systemctl start kubelet
systemctl status kubelet
kubelet --version

Note: what each package does
kubeadm: the tool used to initialize the k8s cluster
kubelet: installed on every node in the cluster; it starts the Pods. In a kubeadm-installed cluster, both the control-plane and worker components run as Pods, and starting any Pod requires the kubelet
kubectl: used to deploy and manage applications, inspect resources, and create, delete, and update components
The installed versions can be confirmed as shown below.
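A quick check that all three packages are at the expected version (a sketch):

kubeadm version -o short      # expect v1.28.2
kubelet --version             # expect Kubernetes v1.28.2
kubectl version --client      # expect v1.28.2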

5.4 kubeadm initialization

Running kubeadm config print init-defaults --component-configs KubeletConfiguration prints the default configuration used for cluster initialization.

The defaults show that imageRepository can be customized to control where the k8s images needed for initialization are pulled from.
Based on the defaults, build the kubeadm.yaml configuration file used to initialize this cluster with kubeadm (the defaults can be dumped to a file for comparison, as shown below).
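A sketch of dumping the defaults for reference and comparing them with the customized file:

kubeadm config print init-defaults --component-configs KubeletConfiguration > kubeadm-defaults.yaml
diff kubeadm-defaults.yaml kubeadm.yaml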

# Create kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
# localAPIEndpoint:
#   advertiseAddress: 10.1.16.160
#   bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
controlPlaneEndpoint: 10.1.16.202:16443  # the control plane endpoint is the virtual IP
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 20.244.0.0/16  # pod network CIDR
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs

Here imageRepository is set to the Aliyun registry so image pulls do not fail because gcr.io is blocked, and criSocket points at containerd as the container runtime. The kubelet cgroupDriver is set to systemd and the kube-proxy mode to ipvs.
Before initializing the cluster, kubeadm config images pull --config kubeadm.yaml can be used to pre-pull the required container images on each node.

# Pre-pull the container images required by k8s
kubeadm config images pull --config kubeadm.yaml

[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.9-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.10.1

# If the images cannot be downloaded, export them on another machine and import them offline
ctr -n k8s.io image export kube-proxy.tar registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2
ctr -n k8s.io image import kube-proxy.tar

# Initialize the cluster with kubeadm
kubeadm init --config kubeadm.yaml

# Initialization output
[root@HQIOTMASTER10L yum.repos.d]# kubeadm init --config kubeadm.yaml
[init] Using Kubernetes version: v1.28.2
[preflight] Running pre-flight checks
        [WARNING FileExisting-tc]: tc not found in system path
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [hqiotmaster10l kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.1.16.169 10.1.16.201]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [hqiotmaster10l localhost] and IPs [10.1.16.160 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [hqiotmaster10l localhost] and IPs [10.1.16.160 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
W0715 16:18:15.468503   67623 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
W0715 16:18:15.544132   67623 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
W0715 16:18:15.617290   67623 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
W0715 16:18:15.825899   67623 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 31.523308 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node hqiotmaster10l as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node hqiotmaster10l as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
W0715 16:18:51.448813   67623 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 10.1.16.202:16443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:0cc00fbdbfaa12d6d784b2f20c36619c6121a1dbd715f380fae53f8406ab6e4c \
        --control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.1.16.202:16443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:0cc00fbdbfaa12d6d784b2f20c36619c6121a1dbd715f380fae53f8406ab6e4c
        
        
The output above records the complete initialization log; from it you can see the key steps a manual Kubernetes installation would require.
# The key items are:
    • [certs] generates the various certificates
    • [kubeconfig] generates the kubeconfig files
    • [kubelet-start] generates the kubelet configuration file "/var/lib/kubelet/config.yaml"
    • [control-plane] creates the apiserver, controller-manager and scheduler static pods from the yaml files in /etc/kubernetes/manifests
    • [bootstrap-token] generates the token that kubeadm join will use later to add nodes to the cluster
    • [addons] installs the essential add-ons: CoreDNS and kube-proxy
The generated files and tokens can be inspected as shown below.
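A quick look at what kubeadm generated on the control-plane node (a sketch):

ls /etc/kubernetes/manifests/   # static pod manifests: kube-apiserver, kube-controller-manager, kube-scheduler, etcd
ls /etc/kubernetes/pki/         # cluster certificates
kubeadm token list              # bootstrap tokens used by kubeadm join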


# Configure kubectl access to the cluster:
rm -rvf $HOME/.kube
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config


# Check the cluster status and confirm all components are healthy
kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                           STATUS             MESSAGE                         ERROR
scheduler                      Healthy            ok                              
controller-manager             Healthy            ok                              
etcd-0                         Healthy            {"health":"true","reason":""}   

# Verify kubectl
[root@k8s-master-0 ~]# kubectl get nodes
NAME             STATUS     ROLES           AGE     VERSION
hqiotmaster07l   NotReady   control-plane   2m12s   v1.28.2
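Because kubeadm.yaml sets the kube-proxy mode to ipvs, it is worth confirming that the mode really took effect once kube-proxy is running (a sketch; ipvsadm was installed in section 1.1):

ipvsadm -Ln | head                   # IPVS virtual servers should be listed for the ClusterIPs
curl -s 127.0.0.1:10249/proxyMode    # kube-proxy metrics endpoint; should print "ipvs"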

5.5 Scaling the cluster: adding the other masters

# 1. Pre-pull the images on the other masters
# Copy kubeadm.yaml to master2 and master3 and pull the required images in advance
kubeadm config images pull --config=kubeadm.yaml

# 2. Copy the certificates from the first master to the other master nodes
mkdir -p /etc/kubernetes/pki/etcd/

scp /etc/kubernetes/pki/ca.* master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.* master3:/etc/kubernetes/pki/

scp /etc/kubernetes/pki/sa.* master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.* master3:/etc/kubernetes/pki/

scp /etc/kubernetes/pki/front-proxy-ca.* master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.* master3:/etc/kubernetes/pki/

scp /etc/kubernetes/pki/etcd/ca.* master2:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.* master3:/etc/kubernetes/pki/etcd/

# 3. Generate a join token on the first master
[root@master etcd]# kubeadm token create --print-join-command
kubeadm join 10.1.16.202:16443 --token warf9k.w5m9ami6z4f73v1h --discovery-token-ca-cert-hash sha256:fa99f534d4940bcabff7a155582757af6a27c98360380f01b4ef8413dfa39918

# 4. Join master2 and master3 to the cluster as control-plane nodes
kubeadm join 10.1.16.202:16443 --token warf9k.w5m9ami6z4f73v1h --discovery-token-ca-cert-hash sha256:fa99f534d4940bcabff7a155582757af6a27c98360380f01b4ef8413dfa39918 --control-plane

Expected result: Run 'kubectl get nodes' to see this node join the cluster.

# 5. Configure kubectl access on master2/master3
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config


# 6. Check
[root@master k8s]# kubectl get nodes
NAME      STATUS     ROLES           AGE   VERSION
master    NotReady   control-plane   97m   v1.28.2
master2   NotReady   control-plane   85m   v1.28.2
master3   NotReady   control-plane   84m   v1.28.2

5.6 Adding worker nodes to the cluster

# 1. Join node1 to the cluster as a worker node

[root@node1 containerd]# kubeadm join 10.1.16.202:16443 --token warf9k.w5m9ami6z4f73v1h --discovery-token-ca-cert-hash sha256:fa99f534d4940bcabff7a155582757af6a27c98360380f01b4ef8413dfa39918

Success message: Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

# Check from any master node
[root@master k8s]# kubectl get nodes
NAME      STATUS     ROLES           AGE    VERSION
master    NotReady   control-plane   109m   v1.28.2
master2   NotReady   control-plane   97m    v1.28.2
master3   NotReady   control-plane   96m    v1.28.2
node1     NotReady   <none>          67s    v1.28.2

# 2. Set the ROLES label on the worker node
[root@master k8s]# kubectl label node node1 node-role.kubernetes.io/worker=worker
node/node1 labeled
[root@master k8s]# kubectl get nodes
NAME      STATUS     ROLES           AGE     VERSION
master    NotReady   control-plane   110m    v1.28.2
master2   NotReady   control-plane   98m     v1.28.2
master3   NotReady   control-plane   97m     v1.28.2
node1     NotReady   worker          2m48s   v1.28.2

6. Installing the Helm 3 Package Manager on master01

# Check the latest version at https://github.com/helm/helm/releases

mkdir -p /usr/local/helm
wget https://get.helm.sh/helm-v3.15.3-linux-amd64.tar.gz
tar -zvxf helm-v3.15.3-linux-amd64.tar.gz
mv linux-amd64/helm  /usr/local/bin/
Run helm list and confirm there is no error output.
helm version

7. Installing the Calico Network Plugin

Calico is used as the Pod network component of the cluster; below, helm is used to install it.
Download the tigera-operator helm chart (use the version matching the images below):
wget https://github.com/projectcalico/calico/releases/download/v3.27.2/tigera-operator-v3.27.2.tgz

# Inspect the chart's configurable values:
helm show values tigera-operator-v3.27.2.tgz

Create values.yaml as follows:
# The values above can be customized, for example pulling the calico images from a private registry.
# This is just a local test of the new k8s version, so only the following lines are used
apiServer:
  enabled: false


# Pull the images first
crictl pull quay.io/tigera/operator:v1.32.5
crictl pull docker.io/calico/cni:v3.27.2
crictl pull docker.io/calico/csi:v3.27.2
crictl pull docker.io/calico/kube-controllers:v3.27.2
crictl pull docker.io/calico/node-driver-registrar:v3.27.2
crictl pull docker.io/calico/node:v3.27.2
crictl pull docker.io/calico/pod2daemon-flexvol:v3.27.2
crictl pull docker.io/calico/typha:v3.27.2


# If they cannot be downloaded, export them elsewhere and import them offline
ctr -n k8s.io image import operator.tar
ctr -n k8s.io image import cni.tar
ctr -n k8s.io image import csi.tar
ctr -n k8s.io image import kube-controllers.tar
ctr -n k8s.io image import node-driver-registrar.tar
ctr -n k8s.io image import node.tar
ctr -n k8s.io image import pod2daemon-flexvol.tar
ctr -n k8s.io image import typha.tar
ctr -n k8s.io image import busyboxplus.tar

# Install calico with helm:
helm install calico tigera-operator-v3.27.2.tgz -n kube-system  --create-namespace -f values.yaml

NAME: calico
LAST DEPLOYED: Fri Nov 10 09:19:36 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None

# Check the calico pods; if they never reach the Running state, try rebooting the node server.
[root@HQIOTMASTER07L ~]# kubectl get pod -n calico-system
NAME                                       READY   STATUS    RESTARTS       AGE
calico-kube-controllers-6c8966c899-k4b7t   1/1     Running   1 (7d7h ago)   7d7h
calico-node-bksbh                          1/1     Running   1 (7d7h ago)   7d7h
calico-node-kjqsq                          1/1     Running   0              7d7h
calico-node-lwhk9                          1/1     Running   1 (7d7h ago)   7d7h
calico-node-wdmws                          1/1     Running   0              7d7h
calico-node-xqkkq                          1/1     Running   1 (7d7h ago)   7d7h
calico-node-z56lx                          1/1     Running   0              7d7h
calico-typha-78f6f6c7dd-b8hfm              1/1     Running   1 (7d7h ago)   7d7h
calico-typha-78f6f6c7dd-kwjq2              1/1     Running   1 (7d7h ago)   7d7h
calico-typha-78f6f6c7dd-r2cjp              1/1     Running   1 (7d7h ago)   7d7h
csi-node-driver-452cl                      2/2     Running   0              7d7h
csi-node-driver-48bbw                      2/2     Running   2 (7d7h ago)   7d7h
csi-node-driver-52zbp                      2/2     Running   2 (7d7h ago)   7d7h
csi-node-driver-bnmzf                      2/2     Running   2 (7d7h ago)   7d7h
csi-node-driver-w2tfr                      2/2     Running   2 (7d7h ago)   7d7h
csi-node-driver-zw62c                      2/2     Running   2 (7d7h ago)   7d7h

# Verify that cluster DNS works (on master01)
First run:
kubectl run curl --image=radial/busyboxplus:curl -it

If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$
[ root@curl:/ ]$ exit

# Re-enter the same container to keep running commands
kubectl exec -it curl -- /bin/sh
Once inside, run nslookup kubernetes.default and confirm it resolves correctly:
[root@hqiotmaster07l yum.repos.d]# kubectl exec -it curl -- /bin/sh
/bin/sh: shopt: not found
[ root@curl:/ ]$ nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

8. Installing the ingress-nginx Reverse Proxy

Add the ingress-nginx helm repository:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm repo list
Search for ingress-nginx:
helm search repo ingress-nginx
helm search repo ingress-nginx/ingress-nginx -l

Install ingress-nginx (run this after all worker nodes have joined the cluster):
helm pull ingress-nginx/ingress-nginx --version 4.8.3
tar -zvxf ingress-nginx-4.8.3.tgz
cd ingress-nginx

# Modify values.yaml
image:
    ## Keep false as default for now!
    chroot: false
    # registry: registry.k8s.io
    repository: 10.1.1.167/registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller
    ## for backwards compatibility consider setting the full image url via the repository value below
    ## use *either* current default registry/image or repository format or installing chart by providing the values.yaml will fail
    ## repository:
    tag: "v1.3.0"
    # digest: sha256:d1707ca76d3b044ab8a28277a2466a02100ee9f58a86af1535a3edf9323ea1b5
    # digestChroot: sha256:0fcb91216a22aae43b374fc2e6a03b8afe9e8c78cbf07a09d75636dc4ea3c191
    pullPolicy: IfNotPresent

  dnsPolicy: ClusterFirstWithHostNet

  hostNetwork: true

  kind: DaemonSet

  nodeSelector:
    kubernetes.io/os: linux
    ingress: "true"

  ipFamilies:
      - IPv4
    ports:
      http: 80
      https: 443
    targetPorts:
      http: http
      https: https
    type: ClusterIP

image:
        # registry: registry.k8s.io
        repository: 10.1.1.167/registry.cn-hangzhou.aliyuncs.com/google_containers/kube-webhook-certgen
        ## for backwards compatibility consider setting the full image url via the repository value below
        ## use *either* current default registry/image or repository format or installing chart by providing the values.yaml will fail
        ## repository:
        tag: v1.1.1
        # digest: sha256:64d8c73dca984af206adf9d6d7e46aa550362b1d7a01f3a0a91b20cc67868660
        pullPolicy: IfNotPresent

# Label every node that should run the ingress controller
kubectl label node haiotnode01l ingress=true
kubectl get node -L ingress

# Create the namespace
kubectl create ns ingress-nginx

# Install with helm
helm install ingress-nginx -n ingress-nginx .

# Installation result
NAME: ingress-nginx
LAST DEPLOYED: Thu Nov  9 17:30:42 2023
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace ingress-nginx get services -o wide -w ingress-nginx-controller'

An example Ingress that makes use of the controller:
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: example
    namespace: foo
  spec:
    ingressClassName: nginx
    rules:
      - host: www.example.com
        http:
          paths:
            - pathType: Prefix
              backend:
                service:
                  name: exampleService
                  port:
                    number: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
      - hosts:
        - www.example.com
        secretName: example-tls

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls

Delete ingress-nginx:
helm delete ingress-nginx -n ingress-nginx

Query ingress-nginx resources:
kubectl get all -n ingress-nginx

[root@HQIOTMASTER07L ~]# kubectl get pod -n ingress-nginx
NAME                             READY   STATUS    RESTARTS   AGE
ingress-nginx-controller-fljbs   1/1     Running   0          7d6h
ingress-nginx-controller-lhn9m   1/1     Running   0          7d6h
ingress-nginx-controller-w76v2   1/1     Running   0          7d6h

9. Configuring etcd for High Availability

# Modify the etcd.yaml configuration file on master, master2 and master3
vi /etc/kubernetes/manifests/etcd.yaml

Change
- --initial-cluster=hqiotmaster10l=https://10.1.16.160:2380
to
- --initial-cluster=hqiotmaster10l=https://10.1.16.160:2380,hqiotmaster11l=https://10.1.16.161:2380,hqiotmaster12l=https://10.1.16.162:2380
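After saving the manifest, the kubelet recreates the etcd static pod automatically. A quick check that all three etcd pods come back up (a sketch):

kubectl -n kube-system get pods -l component=etcd -o wide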

9.1 Verify that the etcd cluster is configured correctly

# etcdctl can be downloaded from https://github.com/etcd-io/etcd/releases
cd etcd-v3.5.9-linux-amd64
cp etcd* /usr/local/bin

[root@HQIOTMASTER07L ~]# etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt member list
42cd16c4205e7bee, started, hqiotmaster07l, https://10.1.16.160:2380, https://10.1.16.160:2379, false
bb9be9499c3a8464, started, hqiotmaster09l, https://10.1.16.162:2380, https://10.1.16.162:2379, false
c8761c7050ca479a, started, hqiotmaster08l, https://10.1.16.161:2380, https://10.1.16.161:2379, false

[root@HQIOTMASTER07L ~]# etcdctl -w table --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints=https://10.1.16.160:2379,https://10.1.16.161:2379,https://10.1.16.162:2379 endpoint status --cluster
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.1.16.160:2379 | 42cd16c4205e7bee |   3.5.9 |   15 MB |     false |      false |        11 |    2905632 |            2905632 |        |
| https://10.1.16.162:2379 | bb9be9499c3a8464 |   3.5.9 |   15 MB |     false |      false |        11 |    2905632 |            2905632 |        |
| https://10.1.16.161:2379 | c8761c7050ca479a |   3.5.9 |   16 MB |      true |      false |        11 |    2905632 |            2905632 |        |
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

10. Simulating a Control-Plane Node Failure and Fast Recovery

Scenario: the cluster has 3 control-plane nodes and 3 worker nodes. One control-plane node (master) fails and is shut down; it cannot be repaired in place, so we remove it with kubectl delete nodes master. Later the machine is repaired and racked again, and we want to add it back to the cluster, again as a control-plane node. How do we do that?
Approach: https://www.cnblogs.com/yangmeichong/p/16464574.html
# The commands are the same regardless of version

[root@master ~]# etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt member list

[root@master ~]# ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove a2f7e7fa1563203c

10.1 Removing the failed etcd member

cd /root/etcd-v3.4.13-linux-amd64
cp etcd* /usr/local/bin

# List the etcd members
[root@master etcd-v3.4.13-linux-amd64]# ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key  /etc/kubernetes/pki/etcd/server.key member list
9754d4208fa9e54b, started, master, https://192.168.1.10:2380, https://192.168.1.10:2379, false
b3688cea7fb0bfd6, started, pod1, https://192.168.1.11:2380, https://192.168.1.11:2379, false

# Find the member ID for pod1 and remove it
[root@master etcd-v3.4.13-linux-amd64]# ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove b3688cea7fb0bfd6
Member b3688cea7fb0bfd6 removed from cluster cbd4e4d0a63d294d

# Check
[root@master etcd-v3.4.13-linux-amd64]# ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key  /etc/kubernetes/pki/etcd/server.key member list
9754d4208fa9e54b, started, master, https://192.168.1.10:2380, https://192.168.1.10:2379, false

10.2 Rejoining the node to the cluster as a control-plane node

# 1. On the master, generate the join command:
[root@master etcd-v3.4.13-linux-amd64]# kubeadm token create --print-join-command
kubeadm join 192.168.1.20:16443 --token 2q0q3r.kmd36rm0vuuc1kcv     --discovery-token-ca-cert-hash sha256:6e220a97f3d79d0b53b5ac18979dcfacdfb5da5ce0629017b745a8a4df162d27

# 2. On the master, delete the node:
[root@master etcd-v3.4.13-linux-amd64]# kubectl delete nodes pod1
node "pod1" deleted

# 3. On pod1 (the node whose etcd member was removed), reset it:
[root@pod1 ~]# kubeadm reset

# 4. Copy the Kubernetes certificates from the master to pod1
[root@master pki]# scp ca.crt ca.key sa.key sa.pub front-proxy-ca.crt front-proxy-ca.key pod1:/etc/kubernetes/pki/
ca.crt                                                                                                                                                                                                                                                             100% 1066   498.4KB/s   00:00    
ca.key                                                                                                                                                                                                                                                             100% 1679     1.5MB/s   00:00    
sa.key                                                                                                                                                                                                                                                             100% 1675     1.6MB/s   00:00    
sa.pub                                                                                                                                                                                                                                                             100%  451   553.5KB/s   00:00    
front-proxy-ca.crt                                                                                                                                                                                                                                                 100% 1078     1.1MB/s   00:00    
front-proxy-ca.key

[root@pod1 ~]# cd /etc/kubernetes/pki/

[root@master etcd]# scp ca.crt ca.key pod1:/etc/kubernetes/pki/etcd/
ca.crt                                                                                                                                                                                                                                                             100% 1058   921.3KB/s   00:00    
ca.key  

# On pod1, run the following command to join the cluster as a control-plane node:
[root@pod1 pki]#kubeadm join 192.168.1.20:16443 --token 2q0q3r.kmd36rm0vuuc1kcv --discovery-token-ca-cert-hash sha256:6e220a97f3d79d0b53b5ac18979dcfacdfb5da5ce0629017b745a8a4df162d27 --control-plane

[root@master etcd]# kubectl get nodes
NAME     STATUS   ROLES                  AGE     VERSION
master   Ready    control-plane,master   4d2h    v1.20.7
pod1     Ready    control-plane,master   54s     v1.20.7
pod2     Ready    <none>                 3d14h   v1.20.7

11. Extending Certificate Validity

11.1 Check certificate expiration dates

[root@HQIOTMASTER07L ~]# for item in `find /etc/kubernetes/pki -maxdepth 2 -name "*.crt"`;do openssl x509 -in $item -text -noout| grep Not;echo ======================$item===================;done
            Not Before: Jul 25 07:23:27 2024 GMT
            Not After : Jul 23 07:28:27 2034 GMT
======================/etc/kubernetes/pki/ca.crt===================
            Not Before: Jul 30 03:24:26 2024 GMT
            Not After : Jul 28 03:24:26 2034 GMT
======================/etc/kubernetes/pki/apiserver.crt===================
            Not Before: Jul 30 03:24:26 2024 GMT
            Not After : Jul 28 03:24:26 2034 GMT
======================/etc/kubernetes/pki/apiserver-kubelet-client.crt===================
            Not Before: Jul 25 07:23:27 2024 GMT
            Not After : Jul 23 07:28:27 2034 GMT
======================/etc/kubernetes/pki/front-proxy-ca.crt===================
            Not Before: Jul 30 03:24:26 2024 GMT
            Not After : Jul 28 03:24:26 2034 GMT
======================/etc/kubernetes/pki/front-proxy-client.crt===================
            Not Before: Jul 25 07:23:28 2024 GMT
            Not After : Jul 23 07:28:28 2034 GMT
======================/etc/kubernetes/pki/etcd/ca.crt===================
            Not Before: Jul 30 03:24:26 2024 GMT
            Not After : Jul 28 03:24:26 2034 GMT
======================/etc/kubernetes/pki/etcd/server.crt===================
            Not Before: Jul 30 03:24:26 2024 GMT
            Not After : Jul 28 03:24:26 2034 GMT
======================/etc/kubernetes/pki/etcd/peer.crt===================
            Not Before: Jul 30 03:24:26 2024 GMT
            Not After : Jul 28 03:24:26 2034 GMT
======================/etc/kubernetes/pki/etcd/healthcheck-client.crt===================
            Not Before: Jul 30 03:24:26 2024 GMT
            Not After : Jul 28 03:24:26 2034 GMT
======================/etc/kubernetes/pki/apiserver-etcd-client.crt===================

11.2 Certificate renewal script

# Script reproduced from https://github.com/yuyicai/update-kube-cert
#!/usr/bin/env bash

set -o errexit
set -o pipefail
# set -o xtrace

# set output color
NC='\033[0m'
RED='\033[31m'
GREEN='\033[32m'
YELLOW='\033[33m'
BLUE='\033[34m'
# set default cri
CRI="docker"

log::err() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][${RED}ERROR${NC}] %b\n" "$@"
}

log::info() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][INFO] %b\n" "$@"
}

log::warning() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][${YELLOW}WARNING${NC}] \033[0m%b\n" "$@"
}

check_file() {
  if [[ ! -r ${1} ]]; then
    log::err "can not find ${1}"
    exit 1
  fi
}

# get x509v3 subject alternative name from the old certificate
cert::get_subject_alt_name() {
  local cert=${1}.crt
  local alt_name

  check_file "${cert}"
  alt_name=$(openssl x509 -text -noout -in "${cert}" | grep -A1 'Alternative' | tail -n1 | sed 's/[[:space:]]*Address//g')
  printf "%s\n" "${alt_name}"
}

# get subject from the old certificate
cert::get_subj() {
  local cert=${1}.crt
  local subj

  check_file "${cert}"
  subj=$(openssl x509 -text -noout -in "${cert}" | grep "Subject:" | sed 's/Subject:/\//g;s/\,/\//;s/[[:space:]]//g')
  printf "%s\n" "${subj}"
}

cert::backup_file() {
  local file=${1}
  if [[ ! -e ${file}.old-$(date +%Y%m%d) ]]; then
    cp -rp "${file}" "${file}.old-$(date +%Y%m%d)"
    log::info "backup ${file} to ${file}.old-$(date +%Y%m%d)"
  else
    log::warning "does not backup, ${file}.old-$(date +%Y%m%d) already exists"
  fi
}

# check certificate expiration
cert::check_cert_expiration() {
  local cert=${1}.crt
  local cert_expires

  cert_expires=$(openssl x509 -text -noout -in "${cert}" | awk -F ": " '/Not After/{print$2}')
  printf "%s\n" "${cert_expires}"
}

# check kubeconfig expiration
cert::check_kubeconfig_expiration() {
  local config=${1}.conf
  local cert
  local cert_expires

  cert=$(grep "client-certificate-data" "${config}" | awk '{print$2}' | base64 -d)
  cert_expires=$(openssl x509 -text -noout -in <(printf "%s" "${cert}") | awk -F ": " '/Not After/{print$2}')
  printf "%s\n" "${cert_expires}"
}

# check etcd certificates expiration
cert::check_etcd_certs_expiration() {
  local cert
  local certs

  certs=(
    "${ETCD_CERT_CA}"
    "${ETCD_CERT_SERVER}"
    "${ETCD_CERT_PEER}"
    "${ETCD_CERT_HEALTHCHECK_CLIENT}"
    "${ETCD_CERT_APISERVER_ETCD_CLIENT}"
  )

  for cert in "${certs[@]}"; do
    if [[ ! -r ${cert} ]]; then
      printf "%-50s%-30s\n" "${cert}.crt" "$(cert::check_cert_expiration "${cert}")"
    fi
  done
}

# check master certificates expiration
cert::check_master_certs_expiration() {
  local certs
  local kubeconfs
  local cert
  local conf

  certs=(
    "${CERT_CA}"
    "${CERT_APISERVER}"
    "${CERT_APISERVER_KUBELET_CLIENT}"
    "${FRONT_PROXY_CA}"
    "${FRONT_PROXY_CLIENT}"
  )

  # add support for super_admin.conf, which was added after k8s v1.30.
  if [ -f "${CONF_SUPER_ADMIN}.conf" ]; then
    kubeconfs=(
      "${CONF_CONTROLLER_MANAGER}"
      "${CONF_SCHEDULER}"
      "${CONF_ADMIN}"
      "${CONF_SUPER_ADMIN}"
    )
  else 
    kubeconfs=(
      "${CONF_CONTROLLER_MANAGER}"
      "${CONF_SCHEDULER}"
      "${CONF_ADMIN}"
    )
  fi

  printf "%-50s%-30s\n" "CERTIFICATE" "EXPIRES"

  for conf in "${kubeconfs[@]}"; do
    if [[ ! -r ${conf} ]]; then
      printf "%-50s%-30s\n" "${conf}.config" "$(cert::check_kubeconfig_expiration "${conf}")"
    fi
  done

  for cert in "${certs[@]}"; do
    if [[ ! -r ${cert} ]]; then
      printf "%-50s%-30s\n" "${cert}.crt" "$(cert::check_cert_expiration "${cert}")"
    fi
  done
}

# check all certificates expiration
cert::check_all_expiration() {
  cert::check_master_certs_expiration
  cert::check_etcd_certs_expiration
}

# generate certificate with client, server or peer type
# Args:
#   $1 (the name of certificate)
#   $2 (the type of certificate, must be one of client, server, peer)
#   $3 (the subject of certificates)
#   $4 (the validity of certificates) (days)
#   $5 (the name of ca)
#   $6 (the x509v3 subject alternative name of certificate when the type of certificate is server or peer)
cert::gen_cert() {
  local cert_name=${1}
  local cert_type=${2}
  local subj=${3}
  local cert_days=${4}
  local ca_name=${5}
  local alt_name=${6}
  local ca_cert=${ca_name}.crt
  local ca_key=${ca_name}.key
  local cert=${cert_name}.crt
  local key=${cert_name}.key
  local csr=${cert_name}.csr
  local common_csr_conf='distinguished_name = dn\n[dn]\n[v3_ext]\nkeyUsage = critical, digitalSignature, keyEncipherment\n'

  for file in "${ca_cert}" "${ca_key}" "${cert}" "${key}"; do
    check_file "${file}"
  done

  case "${cert_type}" in
  client)
    csr_conf=$(printf "%bextendedKeyUsage = clientAuth\n" "${common_csr_conf}")
    ;;
  server)
    csr_conf=$(printf "%bextendedKeyUsage = serverAuth\nsubjectAltName = %b\n" "${common_csr_conf}" "${alt_name}")
    ;;
  peer)
    csr_conf=$(printf "%bextendedKeyUsage = serverAuth, clientAuth\nsubjectAltName = %b\n" "${common_csr_conf}" "${alt_name}")
    ;;
  *)
    log::err "unknow, unsupported certs type: ${YELLOW}${cert_type}${NC}, supported type: client, server, peer"
    exit 1
    ;;
  esac

  # gen csr
  openssl req -new -key "${key}" -subj "${subj}" -reqexts v3_ext \
    -config <(printf "%b" "${csr_conf}") \
    -out "${csr}" >/dev/null 2>&1
  # gen cert
  openssl x509 -in "${csr}" -req -CA "${ca_cert}" -CAkey "${ca_key}" -CAcreateserial -extensions v3_ext \
    -extfile <(printf "%b" "${csr_conf}") \
    -days "${cert_days}" -out "${cert}" >/dev/null 2>&1

  rm -f "${csr}"
}

cert::update_kubeconf() {
  local cert_name=${1}
  local kubeconf_file=${cert_name}.conf
  local cert=${cert_name}.crt
  local key=${cert_name}.key
  local subj
  local cert_base64

  check_file "${kubeconf_file}"
  # get the key from the old kubeconf
  grep "client-key-data" "${kubeconf_file}" | awk '{print$2}' | base64 -d >"${key}"
  # get the old certificate from the old kubeconf
  grep "client-certificate-data" "${kubeconf_file}" | awk '{print$2}' | base64 -d >"${cert}"
  # get subject from the old certificate
  subj=$(cert::get_subj "${cert_name}")
  cert::gen_cert "${cert_name}" "client" "${subj}" "${CERT_DAYS}" "${CERT_CA}"
  # get certificate base64 code
  cert_base64=$(base64 -w 0 "${cert}")

  # set certificate base64 code to kubeconf
  sed -i 's/client-certificate-data:.*/client-certificate-data: '"${cert_base64}"'/g' "${kubeconf_file}"

  rm -f "${cert}"
  rm -f "${key}"
}

cert::update_etcd_cert() {
  local subj
  local subject_alt_name
  local cert

  # generate etcd server,peer certificate
  # /etc/kubernetes/pki/etcd/server
  # /etc/kubernetes/pki/etcd/peer
  for cert in ${ETCD_CERT_SERVER} ${ETCD_CERT_PEER}; do
    subj=$(cert::get_subj "${cert}")
    subject_alt_name=$(cert::get_subject_alt_name "${cert}")
    cert::gen_cert "${cert}" "peer" "${subj}" "${CERT_DAYS}" "${ETCD_CERT_CA}" "${subject_alt_name}"
    log::info "${GREEN}updated ${BLUE}${cert}.conf${NC}"
  done

  # generate etcd healthcheck-client,apiserver-etcd-client certificate
  # /etc/kubernetes/pki/etcd/healthcheck-client
  # /etc/kubernetes/pki/apiserver-etcd-client
  for cert in ${ETCD_CERT_HEALTHCHECK_CLIENT} ${ETCD_CERT_APISERVER_ETCD_CLIENT}; do
    subj=$(cert::get_subj "${cert}")
    cert::gen_cert "${cert}" "client" "${subj}" "${CERT_DAYS}" "${ETCD_CERT_CA}"
    log::info "${GREEN}updated ${BLUE}${cert}.conf${NC}"
  done

  # restart etcd
  case $CRI in
    "docker")
      docker ps | awk '/k8s_etcd/{print$1}' | xargs -r -I '{}' docker restart {} >/dev/null 2>&1 || true
      ;;
    "containerd")
      crictl ps | awk '/etcd-/{print$(NF-1)}' | xargs -r -I '{}' crictl stopp {} >/dev/null 2>&1 || true
      ;;
  esac
  log::info "restarted etcd with ${CRI}"
}

cert::update_master_cert() {
  local subj
  local subject_alt_name
  local conf

  # generate apiserver server certificate
  # /etc/kubernetes/pki/apiserver
  subj=$(cert::get_subj "${CERT_APISERVER}")
  subject_alt_name=$(cert::get_subject_alt_name "${CERT_APISERVER}")
  cert::gen_cert "${CERT_APISERVER}" "server" "${subj}" "${CERT_DAYS}" "${CERT_CA}" "${subject_alt_name}"
  log::info "${GREEN}updated ${BLUE}${CERT_APISERVER}.crt${NC}"

  # generate apiserver-kubelet-client certificate
  # /etc/kubernetes/pki/apiserver-kubelet-client
  subj=$(cert::get_subj "${CERT_APISERVER_KUBELET_CLIENT}")
  cert::gen_cert "${CERT_APISERVER_KUBELET_CLIENT}" "client" "${subj}" "${CERT_DAYS}" "${CERT_CA}"
  log::info "${GREEN}updated ${BLUE}${CERT_APISERVER_KUBELET_CLIENT}.crt${NC}"

  # generate kubeconf for controller-manager,scheduler and kubelet
  # /etc/kubernetes/controller-manager,scheduler,admin,kubelet.conf,super_admin(added after k8s v1.30.)

  if [ -f "${CONF_SUPER_ADMIN}.conf" ]; then
    conf_list="${CONF_CONTROLLER_MANAGER} ${CONF_SCHEDULER} ${CONF_ADMIN} ${CONF_KUBELET} ${CONF_SUPER_ADMIN}"
  else 
    conf_list="${CONF_CONTROLLER_MANAGER} ${CONF_SCHEDULER} ${CONF_ADMIN} ${CONF_KUBELET}"
  fi
  
  for conf in ${conf_list}; do
    if [[ ${conf##*/} == "kubelet" ]]; then
      # https://github.com/kubernetes/kubeadm/issues/1753
      set +e
      grep kubelet-client-current.pem /etc/kubernetes/kubelet.conf >/dev/null 2>&1
      kubelet_cert_auto_update=$?
      set -e
      if [[ "$kubelet_cert_auto_update" == "0" ]]; then
        log::info "does not need to update kubelet.conf"
        continue
      fi
    fi

    # update kubeconf
    cert::update_kubeconf "${conf}"
    log::info "${GREEN}updated ${BLUE}${conf}.conf${NC}"

    # copy admin.conf to ${HOME}/.kube/config
    if [[ ${conf##*/} == "admin" ]]; then
      mkdir -p "${HOME}/.kube"
      local config=${HOME}/.kube/config
      local config_backup
      config_backup=${HOME}/.kube/config.old-$(date +%Y%m%d)
      if [[ -f ${config} ]] && [[ ! -f ${config_backup} ]]; then
        cp -fp "${config}" "${config_backup}"
        log::info "backup ${config} to ${config_backup}"
      fi
      cp -fp "${conf}.conf" "${HOME}/.kube/config"
      log::info "copy the admin.conf to ${HOME}/.kube/config"
    fi
  done

  # generate front-proxy-client certificate
  # /etc/kubernetes/pki/front-proxy-client
  subj=$(cert::get_subj "${FRONT_PROXY_CLIENT}")
  cert::gen_cert "${FRONT_PROXY_CLIENT}" "client" "${subj}" "${CERT_DAYS}" "${FRONT_PROXY_CA}"
  log::info "${GREEN}updated ${BLUE}${FRONT_PROXY_CLIENT}.crt${NC}"

  # restart apiserver, controller-manager, scheduler and kubelet
  for item in "apiserver" "controller-manager" "scheduler"; do
    case $CRI in
      "docker")
        docker ps | awk '/k8s_kube-'${item}'/{print$1}' | xargs -r -I '{}' docker restart {} >/dev/null 2>&1 || true
        ;;
      "containerd")
        crictl ps | awk '/kube-'${item}'-/{print $(NF-1)}' | xargs -r -I '{}' crictl stopp {} >/dev/null 2>&1 || true
        ;;
    esac
    log::info "restarted ${item} with ${CRI}"
  done
  systemctl restart kubelet || true
  log::info "restarted kubelet"
}

main() {
  local node_type=$1

  # read the options
  ARGS=`getopt -o c: --long cri: -- "$@"`
  eval set -- "$ARGS"
  # extract options and their arguments into variables.
  while true
  do
    case "$1" in
      -c|--cri)
        case "$2" in
          "docker"|"containerd")
            CRI=$2
            shift 2
            ;;
          *)
            echo 'Unsupported cri. Valid options are "docker", "containerd".'
            exit 1
            ;;
        esac
        ;;
      --)
        shift
        break
        ;;
      *)
        echo "Invalid arguments."
        exit 1
        ;;
    esac
  done

  CERT_DAYS=3650

  KUBE_PATH=/etc/kubernetes
  PKI_PATH=${KUBE_PATH}/pki

  # master certificates path
  # apiserver
  CERT_CA=${PKI_PATH}/ca
  CERT_APISERVER=${PKI_PATH}/apiserver
  CERT_APISERVER_KUBELET_CLIENT=${PKI_PATH}/apiserver-kubelet-client
  CONF_CONTROLLER_MANAGER=${KUBE_PATH}/controller-manager
  CONF_SCHEDULER=${KUBE_PATH}/scheduler
  CONF_ADMIN=${KUBE_PATH}/admin
  CONF_SUPER_ADMIN=${KUBE_PATH}/super-admin
  CONF_KUBELET=${KUBE_PATH}/kubelet
  # front-proxy
  FRONT_PROXY_CA=${PKI_PATH}/front-proxy-ca
  FRONT_PROXY_CLIENT=${PKI_PATH}/front-proxy-client

  # etcd certificates path
  ETCD_CERT_CA=${PKI_PATH}/etcd/ca
  ETCD_CERT_SERVER=${PKI_PATH}/etcd/server
  ETCD_CERT_PEER=${PKI_PATH}/etcd/peer
  ETCD_CERT_HEALTHCHECK_CLIENT=${PKI_PATH}/etcd/healthcheck-client
  ETCD_CERT_APISERVER_ETCD_CLIENT=${PKI_PATH}/apiserver-etcd-client

  case ${node_type} in
  # etcd)
  # # update etcd certificates
  #   cert::update_etcd_cert
  # ;;
  master)
    # check certificates expiration
    cert::check_master_certs_expiration
    # backup $KUBE_PATH to $KUBE_PATH.old-$(date +%Y%m%d)
    cert::backup_file "${KUBE_PATH}"
    # update master certificates and kubeconf
    log::info "${GREEN}updating...${NC}"
    cert::update_master_cert
    log::info "${GREEN}done!!!${NC}"
    # check certificates expiration after certificates updated
    cert::check_master_certs_expiration
    ;;
  all)
    # check certificates expiration
    cert::check_all_expiration
    # backup $KUBE_PATH to $KUBE_PATH.old-$(date +%Y%m%d)
    cert::backup_file "${KUBE_PATH}"
    # update etcd certificates
    log::info "${GREEN}updating...${NC}"
    cert::update_etcd_cert
    # update master certificates and kubeconf
    cert::update_master_cert
    log::info "${GREEN}done!!!${NC}"
    # check certificates expiration after certificates updated
    cert::check_all_expiration
    ;;
  check)
    # check certificates expiration
    cert::check_all_expiration
    ;;
  *)
    log::err "unknown, unsupported cert type: ${node_type}, supported type: \"all\", \"master\""
    printf "Documentation: https://github.com/yuyicai/update-kube-cert
  example:
    '\033[32m./update-kubeadm-cert.sh all\033[0m' update all etcd certificates, master certificates and kubeconf
      /etc/kubernetes
      ├── admin.conf
      ├── super-admin.conf
      ├── controller-manager.conf
      ├── scheduler.conf
      ├── kubelet.conf
      └── pki
          ├── apiserver.crt
          ├── apiserver-etcd-client.crt
          ├── apiserver-kubelet-client.crt
          ├── front-proxy-client.crt
          └── etcd
              ├── healthcheck-client.crt
              ├── peer.crt
              └── server.crt

    '\033[32m./update-kubeadm-cert.sh master\033[0m' update only master certificates and kubeconf
      /etc/kubernetes
      ├── admin.conf
      ├── super-admin.conf
      ├── controller-manager.conf
      ├── scheduler.conf
      ├── kubelet.conf
      └── pki
          ├── apiserver.crt
          ├── apiserver-kubelet-client.crt
          └── front-proxy-client.crt
"
    exit 1
    ;;
  esac
}

main "$@"

11.3 Running the script

The following steps are performed on every master node.
cd /etc/yum.repos.d/

Upload the file
update-kubeadm-cert.sh
chmod 755 update-kubeadm-cert.sh

Renew the certificates:
./update-kubeadm-cert.sh all --cri containerd
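Afterwards, the script's check mode can confirm the new expiration dates, and kubectl should still reach the cluster (a sketch):

./update-kubeadm-cert.sh check
kubectl get nodes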
