Fixing node-server OSD services that fail to come up after a Ceph cluster restart, as seen with ceph osd status and ceph -s
Preface
- In a lab environment running a Ceph cluster (together with a multi-node OpenStack deployment), the cluster's OSD services failed to come up after a reboot. The fix is documented below.
- For the ceph + openstack multi-node deployment itself, see my other blog post: https://blog.csdn.net/CN_TangZheng/article/details/104745364
1: The error
[root@ct ~]# ceph -s '//check the Ceph cluster status'
cluster:
id: 8c9d2d27-492b-48a4-beb6-7de453cf45d6
health: HEALTH_WARN '//health check reports WARN'
1 osds down
1 host (1 osds) down
Reduced data availability: 192 pgs inactive
Degraded data redundancy: 812/1218 objects degraded (66.667%), 116 pgs degraded, 192 pgs undersized
clock skew detected on mon.c1, mon.c2
services:
mon: 3 daemons, quorum ct,c1,c2
mgr: ct(active), standbys: c1, c2
osd: 3 osds: 1 up, 2 in '//two of the OSD daemons are down'
data:
pools: 3 pools, 192 pgs
objects: 406 objects, 1.8 GiB
usage: 2.8 GiB used, 1021 GiB / 1024 GiB avail
pgs: 100.000% pgs not active
812/1218 objects degraded (66.667%)
116 undersized+degraded+peered
76 undersized+peered
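- Besides the down OSDs, the health output also reports clock skew on mon.c1 and mon.c2. That usually just means the monitor clocks drifted apart during the reboot; a quick way to check (a sketch, assuming chrony is the time service on these hosts):
[root@ct ~]# ceph time-sync-status '//time offsets the monitors see for each other'
[root@ct ~]# chronyc sources -v '//whether chrony has a reachable NTP source'
[root@ct ~]# systemctl restart chronyd '//restart the time service if the offsets do not converge'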
[root@ct ~]# ceph osd status '//check the OSD status; the OSDs on the two compute nodes are not in a normal state'
+----+------+-------+-------+--------+---------+--------+---------+----------------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+------+-------+-------+--------+---------+--------+---------+----------------+
| 0 | ct | 2837M | 1021G | 0 | 0 | 0 | 0 | exists,up |
| 1 | | 0 | 0 | 0 | 0 | 0 | 0 | exists |
| 2 | | 0 | 0 | 0 | 0 | 0 | 0 | autoout,exists |
+----+------+-------+-------+--------+---------+--------+---------+----------------+
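- osd.1 and osd.2 show no host and no usage, so their daemons are not reporting in. Before restarting anything, it can help to confirm which hosts they belong to and whether the daemons are simply stopped; a sketch with standard Ceph/systemd tooling (the OSD id is taken from the table above, the host is an assumption based on this deployment):
[root@ct ~]# ceph osd tree '//shows which host each down OSD belongs to'
[root@c1 ~]# systemctl status ceph-osd@1 '//on the compute node, check the daemon behind osd.1'
[root@c1 ~]# journalctl -u ceph-osd@1 -n 50 '//recent log lines if the unit failed to start'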
1.1: The fix
- It turned out that neutron's Open vSwitch agents were down
[root@ct ~]# source keystonerc_admin
[root@ct ~(keystone_admin)]# openstack network agent list '//after some troubleshooting, the Open vSwitch and L3 agents turned out to be down'
+--------------------------------------+----------------------+------+-------------------+-------+-------+---------------------------+
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
+--------------------------------------+----------------------+------+-------------------+-------+-------+---------------------------+
| 12dd5b51-1344-4c29-8974-e5d8e0e65d2e | Open vSwitch agent | c1 | None | XXX | UP | neutron-openvswitch-agent |
| 20829a10-4a26-4317-8175-4534ac0b01e1 | Open vSwitch agent | c2 | None | XXX | UP | neutron-openvswitch-agent |
| 25c121ec-b761-4e7b-bfbf-9601993ebb54 | Metadata agent | ct | None | :-) | UP | neutron-metadata-agent |
| 47c878ee-93f0-4960-baa1-1cc92476ed2a | DHCP agent | ct | nova | :-) | UP | neutron-dhcp-agent |
| 57647383-7106-46b6-971f-2398457e5179 | Loadbalancerv2 agent | ct | None | :-) | UP | neutron-lbaasv2-agent |
| 92d49052-0b4f-467c-a92c-1743d891043f | Metering agent | ct | None | :-) | UP | neutron-metering-agent |
| c2f7791c-96ed-472b-abda-509a3ff125b5 | L3 agent | ct | nova | XXX | UP | neutron-l3-agent |
| e48269d8-e4f1-424b-bc3e-4c0d13757e8a | Open vSwitch agent | ct | None | :-) | UP | neutron-openvswitch-agent |
+--------------------------------------+----------------------+------+-------------------+-------+-------+---------------------------+
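- Before restarting anything, you can confirm on each node that the corresponding systemd unit is really dead (a sketch, using the same unit names as the restart commands below):
[root@ct ~]# systemctl status neutron-l3-agent '//L3 agent on the controller node'
[root@c1 ~]# systemctl status neutron-openvswitch-agent '//Open vSwitch agent on compute node c1'
[root@c2 ~]# systemctl status neutron-openvswitch-agent '//Open vSwitch agent on compute node c2'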
- Restart the L3 agent on the controller node
[root@ct ~(keystone_admin)]# systemctl start neutron-l3-agent
- Restart the Open vSwitch agent on the compute nodes
[root@c1 ceph]# systemctl restart neutron-openvswitch-agent
[root@c2 ceph]# systemctl restart neutron-openvswitch-agent
- After the restarts, run openstack network agent list again to confirm that every agent is up
- Then log in to the compute nodes and restart the OSD services:
[root@c1 ceph]# systemctl restart ceph-osd.target
[root@c2 ceph]# systemctl restart ceph-osd.target
[root@c1 ceph]# systemctl restart ceph-mgr.target
[root@c2 ceph]# systemctl restart ceph-mgr.target
'//After restarting the OSD services, check the cluster state again with ceph -s; if the mgr service on a compute node is not running, restart it as well'
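- To keep the same daemons from staying down after the next reboot, it is also worth checking that the units are enabled to start at boot (a sketch; whether this step is actually needed depends on how the cluster was deployed):
[root@c1 ~]# systemctl enable ceph-osd.target ceph-mgr.target '//start the Ceph daemons automatically at boot'
[root@c1 ~]# systemctl enable neutron-openvswitch-agent '//same for the neutron agent on the compute nodes'
[root@ct ~]# systemctl enable --now chronyd '//keep the monitor clocks in sync and avoid the clock-skew warning'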
1.2: Checking again: the problem is solved
[root@ct ~(keystone_admin)]# ceph -s
cluster:
id: 8c9d2d27-492b-48a4-beb6-7de453cf45d6
health: HEALTH_OK '//health check is OK'
services: '//the services below are all back to normal as well'
mon: 3 daemons, quorum ct,c1,c2
mgr: ct(active), standbys: c2, c1
osd: 3 osds: 3 up, 3 in
data:
pools: 3 pools, 192 pgs
objects: 406 objects, 1.8 GiB
usage: 8.3 GiB used, 3.0 TiB / 3.0 TiB avail
pgs: 192 active+clean
io:
client: 1.5 KiB/s rd, 1 op/s rd, 0 op/s wr
[root@ct ~(keystone_admin)]# ceph osd status '//the OSD status is fine as well'
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | ct | 2837M | 1021G | 0 | 0 | 0 | 0 | exists,up |
| 1 | c1 | 2837M | 1021G | 0 | 0 | 0 | 0 | exists,up |
| 2 | c2 | 2837M | 1021G | 0 | 0 | 0 | 0 | exists,up |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
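- If the cluster still shows HEALTH_WARN for a short while after the OSDs come back, that is normally just recovery and backfill finishing up; a couple of generic commands to watch it settle (a sketch, not specific to this deployment):
[root@ct ~]# ceph health detail '//lists exactly which health checks, if any, are still failing'
[root@ct ~]# ceph -w '//follow the cluster log until every PG is active+clean'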