Author: 尹珉
Introduction to Rook
Rook is an open-source cloud-native storage orchestrator that provides a platform, framework, and support for a variety of storage solutions to integrate natively with cloud-native environments.
Rook turns distributed storage systems into self-managing, self-scaling, self-healing storage services. It automates the storage administrator's tasks of deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management.
In short, Rook is a set of Kubernetes Operators that fully control the deployment, management, and automatic recovery of multiple storage solutions (such as Ceph, EdgeFS, MinIO, and Cassandra).
To date, the most stable storage backend supported by Rook is still Ceph. This article describes how to use Rook to create and maintain a Ceph cluster and use it as persistent storage for Kubernetes.
Environment Preparation
The K8s environment can be deployed by installing KubeSphere; I am using the high-availability setup.
For installing KubeSphere on a public cloud, see the documentation: Multi-node Installation.
⚠️ Note: kube-node5, kube-node6, and kube-node7 each have two data disks.
$ kubectl get node
NAME           STATUS   ROLES    AGE    VERSION
kube-master1   Ready    master   118d   v1.17.9
kube-master2   Ready    master   118d   v1.17.9
kube-master3   Ready    master   118d   v1.17.9
kube-node1     Ready    worker   118d   v1.17.9
kube-node2     Ready    worker   118d   v1.17.9
kube-node3     Ready    worker   111d   v1.17.9
kube-node4     Ready    worker   111d   v1.17.9
kube-node5     Ready    worker   11d    v1.17.9
kube-node6     Ready    worker   11d    v1.17.9
kube-node7     Ready    worker   11d    v1.17.9
Before installing, make sure lvm2 is installed on all worker nodes; otherwise the deployment will fail with errors.
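For example, on CentOS/RHEL nodes (use the apt equivalent on Debian/Ubuntu):
$ sudo yum install -y lvm2
# Debian/Ubuntu: sudo apt-get install -y lvm2
# Verify the installation
$ lvm version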
Deploying the Rook and Ceph Cluster
1. Clone the Rook repository
$ git clone -b release-1.4 https://github.com/rook/rook.git
2. Change into the examples directory
$ cd /root/ceph/rook/cluster/examples/kubernetes/ceph
3. Deploy Rook and create the CRD resources
$ kubectl create -f common.yaml -f operator.yaml
# Notes:
# 1. common.yaml mainly contains the RBAC rules and CRD definitions
# 2. operator.yaml is the Deployment of rook-ceph-operator
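Before creating the cluster it is worth confirming that the operator and discover pods are up; a quick check, assuming the default app labels set by the example manifests:
$ kubectl -n rook-ceph get pod -l app=rook-ceph-operator
$ kubectl -n rook-ceph get pod -l app=rook-discover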
4. Create the Ceph cluster
$ kubectl create -f cluster.yaml
# Important note:
# This walkthrough does not customize anything. By default the Ceph cluster dynamically discovers new, unformatted, unused disks on the nodes and automatically initializes them as OSDs (at least 3 nodes are required, each with at least one free disk); see the excerpt below.
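For reference, this auto-discovery behaviour is driven by the storage section of cluster.yaml; an abridged excerpt of the defaults follows (check the file in your own checkout for the full version):
storage:
  useAllNodes: true     # run OSDs on every node that has eligible disks
  useAllDevices: true   # consume every unformatted, unused device that is found
  # Set both flags to false and list nodes/devices explicitly
  # if you want to restrict where OSDs are created.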
5. Check the pod status
$ kubectl get pod -n rook-ceph -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
csi-cephfsplugin-5fw92 3/3 Running 6 12d 192.168.0.31 kube-node7 <none> <none>
csi-cephfsplugin-78plf 3/3 Running 0 12d 192.168.0.134 kube-node1 <none> <none>
csi-cephfsplugin-bkdl8 3/3 Running 3 12d 192.168.0.195 kube-node5 <none> <none>
csi-cephfsplugin-provisioner-77f457bcb9-6w4cv 6/6 Running 0 12d 10.233.77.95 kube-node4 <none> <none>
csi-cephfsplugin-provisioner-77f457bcb9-q7vxh 6/6 Running 0 12d 10.233.76.156 kube-node3 <none> <none>
csi-cephfsplugin-rqb4d 3/3 Running 0 12d 192.168.0.183 kube-node4 <none> <none>
csi-cephfsplugin-vmrfj 3/3 Running 0 12d 192.168.0.91 kube-node3 <none> <none>
csi-cephfsplugin-wglsw 3/3 Running 3 12d 192.168.0.116 kube-node6 <none> <none>
csi-rbdplugin-4m8hv 3/3 Running 0 12d 192.168.0.91 kube-node3 <none> <none>
csi-rbdplugin-7wt45 3/3 Running 3 12d 192.168.0.195 kube-node5 <none> <none>
csi-rbdplugin-bn5pn 3/3 Running 3 12d 192.168.0.116 kube-node6 <none> <none>
csi-rbdplugin-hwl4b 3/3 Running 6 12d 192.168.0.31 kube-node7 <none> <none>
csi-rbdplugin-provisioner-7897f5855-7m95p 6/6 Running 0 12d 10.233.77.94 kube-node4 <none> <none>
csi-rbdplugin-provisioner-7897f5855-btwt5 6/6 Running 0 12d 10.233.76.155 kube-node3 <none> <none>
csi-rbdplugin-qvksp 3/3 Running 0 12d 192.168.0.183 kube-node4 <none> <none>
csi-rbdplugin-rr296 3/3 Running 0 12d 192.168.0.134 kube-node1 <none> <none>
rook-ceph-crashcollector-kube-node1-64cf6f49fb-bx8lz 1/1 Running 0 12d 10.233.101.46 kube-node1 <none> <none>
rook-ceph-crashcollector-kube-node3-575b75dc64-gxwtp 1/1 Running 0 12d 10.233.76.149 kube-node3 <none> <none>
rook-ceph-crashcollector-kube-node4-78549d6d7f-9zz5q 1/1 Running 0 8d 10.233.77.226 kube-node4 <none> <none>
rook-ceph-crashcollector-kube-node5-5db8557476-b8zp6 1/1 Running 1 11d 10.233.81.239 kube-node5 <none> <none>
rook-ceph-crashcollector-kube-node6-78b7946769-8qh45 1/1 Running 0 8d 10.233.66.252 kube-node6 <none> <none>
rook-ceph-crashcollector-kube-node7-78c97898fd-k85l4 1/1 Running 1 8d 10.233.111.33 kube-node7 <none> <none>
rook-ceph-mds-myfs-a-86bdb684b6-4pbj7 1/1 Running 0 8d 10.233.77.225 kube-node4 <none> <none>
rook-ceph-mds-myfs-b-6697d66b7d-jgnkw 1/1 Running 0 8d 10.233.66.250 kube-node6 <none> <none>
rook-ceph-mgr-a-658db99d5b-jbrzh 1/1 Running 0 12d 10.233.76.162 kube-node3 <none> <none>
rook-ceph-mon-a-5cbf5947d8-vvfgf 1/1 Running 1 12d 10.233.101.44 kube-node1 <none> <none>
rook-ceph-mon-b-6495c96d9d-b82st 1/1 Running 0 12d 10.233.76.144 kube-node3 <none> <none>
rook-ceph-mon-d-dc4c6f4f9-rdfpg 1/1 Running 1 12d 10.233.66.219 kube-node6 <none> <none>
rook-ceph-operator-56fc54bb77-9rswg 1/1 Running 0 12d 10.233.76.138 kube-node3 <none> <none>
rook-ceph-osd-0-777979f6b4-jxtg9 1/1 Running 1 11d 10.233.81.237 kube-node5 <none> <none>
rook-ceph-osd-10-589487764d-8bmpd 1/1 Running 0 8d 10.233.111.59 kube-node7 <none> <none>
rook-ceph-osd-11-5b7dd4c7bc-m4nqz 1/1 Running 0 8d 10.233.111.60 kube-node7 <none> <none>
rook-ceph-osd-2-54cbf4d9d8-qn4z7 1/1 Running 1 10d 10.233.66.222 kube-node6 <none> <none>
rook-ceph-osd-6-c94cd566-ndgzd 1/1 Running 1 10d 10.233.81.238 kube-node5 <none> <none>
rook-ceph-osd-7-d8cdc94fd-v2lm8 1/1 Running 0 9d 10.233.66.223 kube-node6 <none> <none>
rook-ceph-osd-prepare-kube-node1-4bdch 0/1 Completed 0 66m 10.233.101.91 kube-node1 <none> <none>
rook-ceph-osd-prepare-kube-node3-bg4wk 0/1 Completed 0 66m 10.233.76.252 kube-node3 <none> <none>
rook-ceph-osd-prepare-kube-node4-r9dk4 0/1 Completed 0 66m 10.233.77.107 kube-node4 <none> <none>
rook-ceph-osd-prepare-kube-node5-rbvcn 0/1 Completed 0 66m 10.233.81.73 kube-node5 <none> <none>
rook-ceph-osd-prepare-kube-node5-rcngg 0/1 Completed 5 10d 10.233.81.98 kube-node5 <none> <none>
rook-ceph-osd-prepare-kube-node6-jc8cm 0/1 Completed 0 66m 10.233.66.109 kube-node6 <none> <none>
rook-ceph-osd-prepare-kube-node6-qsxrp 0/1 Completed 0 11d 10.233.66.109 kube-node6 <none> <none>
rook-ceph-osd-prepare-kube-node7-5c52p 0/1 Completed 5 8d 10.233.111.58 kube-node7 <none> <none>
rook-ceph-osd-prepare-kube-node7-h5d6c 0/1 Completed 0 66m 10.233.111.110 kube-node7 <none> <none>
rook-ceph-osd-prepare-kube-node7-tzvp5 0/1 Completed 0 11d 10.233.111.102 kube-node7 <none> <none>
rook-ceph-osd-prepare-kube-node7-wd6dt 0/1 Completed 7 8d 10.233.111.56 kube-node7 <none> <none>
rook-ceph-tools-64fc489556-5clvj 1/1 Running 0 12d 10.233.77.118 kube-node4 <none> <none>
rook-discover-6kbvg 1/1 Running 0 12d 10.233.101.42 kube-node1 <none> <none>
rook-discover-7dr44 1/1 Running 2 12d 10.233.66.220 kube-node6 <none> <none>
rook-discover-dqr82 1/1 Running 0 12d 10.233.77.74 kube-node4 <none> <none>
rook-discover-gqppp 1/1 Running 0 12d 10.233.76.139 kube-node3 <none> <none>
rook-discover-hdkxf 1/1 Running 1 12d 10.233.81.236 kube-node5 <none> <none>
rook-discover-pzhsw 1/1 Running 3 12d 10.233.111.36 kube-node7 <none> <none>
The above is the status of all component pods once deployment completes. The pods whose names start with rook-ceph-osd-prepare automatically detect disks newly attached to the cluster; as soon as a new disk is attached, OSD creation is triggered automatically.
6. Configure the Ceph dashboard
The Ceph Dashboard is a built-in web-based management and monitoring application that is part of the open-source Ceph distribution. Through the dashboard you can view the basic status information of the Ceph cluster.
The default deployment already installs ceph-dashboard, but its Service is a ClusterIP and cannot be reached from outside the cluster, so an external Service has to be created:
$ kubectl apply -f dashboard-external-http.yaml
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph # namespace:cluster
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph # namespace:cluster
spec:
  ports:
    - name: dashboard
      port: 7000
      protocol: TCP
      targetPort: 7000
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
Note: port 8443 is the HTTPS port and requires a certificate to be configured; this tutorial only demonstrates HTTP access, so only port 7000 is configured.
7. Check the Service status
$ kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-cephfsplugin-metrics ClusterIP 10.233.3.172 <none> 8080/TCP,8081/TCP 12d
csi-rbdplugin-metrics ClusterIP 10.233.43.23 <none> 8080/TCP,8081/TCP 12d
rook-ceph-mgr ClusterIP 10.233.63.85 <none> 9283/TCP 12d
rook-ceph-mgr-dashboard ClusterIP 10.233.20.159 <none> 7000/TCP 12d
rook-ceph-mgr-dashboard-external-https NodePort 10.233.56.73 <none> 7000:31357/TCP 12d
rook-ceph-mon-a ClusterIP 10.233.30.222 <none> 6789/TCP,3300/TCP 12d
rook-ceph-mon-b ClusterIP 10.233.55.25 <none> 6789/TCP,3300/TCP 12d
rook-ceph-mon-d ClusterIP 10.233.0.206 <none> 6789/TCP,3300/TCP 12d
8. Verify access to the dashboard
Open the KubeSphere console and enable external access for the service.
Access it in a browser at:
http://{master1-ip}:31357
To retrieve the login password (the default username is admin):
$ kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}"|base64 --decode && echo
Note: if the dashboard shows a HEALTH_WARN warning, check the logs for the specific cause; it is usually OSDs being down, an insufficient PG count, and so on.
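Once the toolbox from step 9 is running, the details behind a HEALTH_WARN can be inspected from inside it with the standard Ceph commands, for example:
$ ceph health detail   # lists each active warning and its cause
$ ceph crash ls        # lists recently crashed daemons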
9. Deploy the Rook toolbox
The Rook toolbox is a container that bundles the common tools used for debugging and testing Rook.
$ kubectl apply -f toolbox.yaml
Enter the toolbox and check the Ceph cluster status:
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- bash
$ ceph -s
  cluster:
    id:     1457045a-4926-411f-8be8-c7a958351a38
    health: HEALTH_WARN
            mon a is low on available space
            2 osds down
            Degraded data redundancy: 25/159 objects degraded (15.723%), 16 pgs degraded, 51 pgs undersized
            3 daemons have recently crashed

  services:
    mon: 3 daemons, quorum a,b,d (age 9d)
    mgr: a(active, since 4h)
    mds: myfs:1 {0=myfs-b=up:active} 1 up:standby-replay
    osd: 12 osds: 6 up (since 8d), 8 in (since 8d); 9 remapped pgs

  data:
    pools:   5 pools, 129 pgs
    objects: 53 objects, 37 MiB
    usage:   6.8 GiB used, 293 GiB / 300 GiB avail
    pgs:     25/159 objects degraded (15.723%)
             5/159 objects misplaced (3.145%)
             69 active+clean
             35 active+undersized
             16 active+undersized+degraded
             9  active+clean+remapped
Useful query commands inside the toolbox:
ceph status
ceph osd status
ceph df
rados df
Deploying a StorageClass
1. Introduction to RBD block storage
Ceph can provide object storage (RADOSGW), block storage (RBD), and file storage (CephFS) at the same time. RBD is short for RADOS Block Device; RBD block storage is the most stable and most widely used storage type. An RBD block device behaves like a disk and can be mounted. RBD devices support snapshots, multiple replicas, cloning, and consistency, and data is striped across multiple OSDs in the Ceph cluster.
2. Create the StorageClass
[root@kube-master1 rbd]# kubectl apply -f storageclass.yaml
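For reference, the example storageclass.yaml under csi/rbd in the Rook repository roughly amounts to a CephBlockPool plus a StorageClass along the following lines (abridged; the CSI secret parameters are trimmed here, so treat the file in your own checkout as authoritative):
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3                               # three replicas on different hosts
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com   # Ceph CSI RBD driver
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete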
3. Check the StorageClass status
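A standard kubectl query is enough here (output omitted):
$ kubectl get storageclass rook-ceph-block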
4. Create a PVC
$ kubectl apply -f pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: rook-ceph-block
5. Create a pod that uses the PVC
$ kubectl apply -f pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: csirbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: rbd-pvc
        readOnly: false
6. Check the pod, PVC, and PV status
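The usual queries apply; the PVC should be Bound and the pod Running:
$ kubectl get pod csirbd-demo-pod
$ kubectl get pvc rbd-pvc
$ kubectl get pv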
Summary
Anyone deploying Rook + Ceph for the first time has a lot to digest and quite a few pitfalls to run into. I hope the deployment notes above are helpful. A few common questions:
1. The Ceph cluster keeps reporting that there are no disks available for OSDs
Answer: I have hit this in a few situations. Check whether the attached data disks were used before; even if they have been formatted, old RAID metadata may still be present. You can clean the disks with the script below, then format and attach them again.
#!/usr/bin/env bash
DISK="/dev/vdc" # change this to the device you want to wipe
# Zap the disk to a fresh, usable state (zap-all is important, b/c MBR has to be clean)
# You will have to run this step for all disks.
sgdisk --zap-all $DISK
# Clean hdds with dd
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# Clean disks such as ssd with blkdiscard instead of dd
blkdiscard $DISK
# These steps only have to be run once on each node
# If rook sets up osds using ceph-volume, teardown leaves some devices mapped that lock the disks.
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
# ceph-volume setup can leave ceph-<UUID> directories in /dev and /dev/mapper (unnecessary clutter)
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*
# Inform the OS of partition table changes
partprobe $DISK
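After the cleanup, lsblk can confirm that the disk no longer carries filesystem signatures or leftover ceph LVM volumes (adjust the device name to your environment):
$ lsblk -f /dev/vdc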
2. Which storage types does Ceph support?
Answer: RBD block storage, CephFS file storage, S3-compatible object storage (RGW), and so on.
3. How should the various pitfalls that come up during deployment be troubleshot?
Answer: I strongly recommend working through the official Rook and Ceph documentation when troubleshooting.
4. Access to the dashboard fails
Answer: if KubeSphere or K8s runs on a public cloud, open the NodePort in the security group.