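GlusterFS Distributed Storage Cluster on CentOS 7: Deployment Record

Posted by 散盡浮華 on 2018-04-08

Original URL: http://www.cnblogs.com/kevingrace/p/8743812.html

An earlier post gave a brief introduction to the GlusterFS distributed file system. This post records how the environment was deployed.

0）Environment preparation

GlusterFS needs at least two servers, ideally with identical hardware. Each server should have two disks: one for the operating system and one for GlusterFS.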

192.168.10.239    GlusterFS-master (master node)    Centos7.4
192.168.10.212    GlusterFS-slave  (slave node)     Centos7.4
192.168.10.213    Client           (client)
----------------------------------------------------------------------------------------

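GlusterFS communicates over the network, so firewall rules must be set up for the environment in advance and SELinux must be disabled.
Here the firewall and SELinux are turned off completely on all three servers above: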
[root@GlusterFS-master ~]# setenforce 0
[root@GlusterFS-master ~]# getenforce 
[root@GlusterFS-master ~]# cat /etc/sysconfig/selinux |grep "SELINUX=disabled"
SELINUX=disabled
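
The transcript above shows the end state but not how SELINUX=disabled was written; setenforce 0 by itself only lasts until the next reboot. A minimal sketch of making the setting persistent follows. The helper name disable_selinux_in is made up for illustration, and it runs against a scratch copy here; on CentOS 7 the real file would normally be /etc/selinux/config (to which /etc/sysconfig/selinux is a symlink).

```shell
# Hypothetical helper (not part of any package): set SELINUX=disabled
# in the given SELinux config file.
disable_selinux_in() {
    conf="$1"
    # Rewrite the SELINUX=... line only; SELINUXTYPE=... does not match this pattern.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}

# Demonstration on a scratch copy, so nothing on the system is changed.
demo_conf="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_conf"
disable_selinux_in "$demo_conf"
cat "$demo_conf"
```

On a real server the call would be disable_selinux_in /etc/selinux/config; pointing sed -i at the symlink instead may replace the symlink with a regular file. The change takes full effect after a reboot.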

[root@GlusterFS-master ~]# systemctl stop firewalld
[root@GlusterFS-master ~]# systemctl disable firewalld
[root@GlusterFS-master ~]# firewall-cmd --state
not running

------------------------------------------------------------------------------------------
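GlusterFS has no dedicated metadata server, so every server is configured the same way. Start by setting the host names (if you only ever use IP addresses, not host names, the hosts entries are unnecessary):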
[root@GlusterFS-master ~]# hostnamectl --static set-hostname GlusterFS-master
[root@GlusterFS-master ~]# cat /etc/hostname
GlusterFS-master
[root@GlusterFS-master ~]# vim /etc/hosts
.....
192.168.10.239  GlusterFS-master
192.168.10.212  GlusterFS-slave

[root@GlusterFS-slave ~]# hostnamectl --static set-hostname GlusterFS-slave
[root@GlusterFS-slave ~]# cat /etc/hostname 
GlusterFS-slave
[root@GlusterFS-slave ~]# vim /etc/hosts
......
192.168.10.239  GlusterFS-master
192.168.10.212  GlusterFS-slave
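
Editing /etc/hosts by hand on every node is easy to get wrong as the cluster grows. The sketch below shows one way to add the entries idempotently. The helper name add_host_entry is an assumption for illustration, and the demonstration writes to a scratch file rather than /etc/hosts.

```shell
# Hypothetical helper: append "IP  NAME" to a hosts file unless that exact
# line is already present.
add_host_entry() {
    file="$1"; ip="$2"; name="$3"
    line="$ip  $name"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file; pass /etc/hosts on a real node.
hosts_demo="$(mktemp)"
add_host_entry "$hosts_demo" 192.168.10.239 GlusterFS-master
add_host_entry "$hosts_demo" 192.168.10.212 GlusterFS-slave
add_host_entry "$hosts_demo" 192.168.10.212 GlusterFS-slave   # repeated call adds nothing
cat "$hosts_demo"
```

Because a repeated call is a no-op, the same script can be run on every node whenever the node list changes.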

------------------------------------------------------------------------------------------
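Clock synchronization
Consistent time inside the cluster matters: if the servers' clocks drift apart, communication between the nodes can run into trouble
and the cluster may eventually fail. Here the clocks are synchronized over the network so that both servers agree (time zone and time must both be correct and identical):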
[root@GlusterFS-master ~]# yum install -y ntpdate
[root@GlusterFS-master ~]# ntpdate ntp1.aliyun.com 
[root@GlusterFS-master ~]# date

[root@GlusterFS-slave ~]# yum install -y ntpdate
[root@GlusterFS-slave ~]# ntpdate ntp1.aliyun.com
[root@GlusterFS-slave ~]# date

1）Install dependencies (on both GlusterFS-master and GlusterFS-slave)

[root@GlusterFS-master ~]# yum install -y flex bison openssl openssl-devel acl libacl libacl-devel sqlite-devel libxml2-devel python-devel make cmake gcc gcc-c++ autoconf automake libtool unzip zip

2）Install userspace-rcu-master and glusterfs (on both GlusterFS-master and GlusterFS-slave)

1）Download glusterfs-3.6.9.tar.gz and userspace-rcu-master.zip
Baidu cloud drive download link: https://pan.baidu.com/s/1DyKxt0TnO3aNx59mVfJCZA
Extraction password: ywq8

Place both packages in /usr/local/src
[root@GlusterFS-master ~]# cd /usr/local/src/
[root@GlusterFS-master src]# ll
total 6444
-rw-r--r--. 1 root root 6106554 Feb 29  2016 glusterfs-3.6.9.tar.gz
-rw-r--r--. 1 root root  490091 Apr  8 09:58 userspace-rcu-master.zip

2）Install userspace-rcu-master
[root@GlusterFS-master src]# unzip /usr/local/src/userspace-rcu-master.zip -d /usr/local/
[root@GlusterFS-master src]# cd /usr/local/userspace-rcu-master/
[root@GlusterFS-master userspace-rcu-master]# ./bootstrap
[root@GlusterFS-master userspace-rcu-master]# ./configure
[root@GlusterFS-master userspace-rcu-master]# make && make install
[root@GlusterFS-master userspace-rcu-master]# ldconfig

3）Install glusterfs
[root@GlusterFS-master userspace-rcu-master]# tar -zxvf /usr/local/src/glusterfs-3.6.9.tar.gz -C /usr/local/
[root@GlusterFS-master userspace-rcu-master]# cd /usr/local/glusterfs-3.6.9/
[root@GlusterFS-master glusterfs-3.6.9]# ./configure --prefix=/usr/local/glusterfs
[root@GlusterFS-master glusterfs-3.6.9]# make && make install

Add environment variables
[root@GlusterFS-master glusterfs-3.6.9]# vim /etc/profile       //append the following at the very end of the file
......
export GLUSTERFS_HOME=/usr/local/glusterfs
export PATH=$PATH:$GLUSTERFS_HOME/sbin

[root@GlusterFS-master glusterfs-3.6.9]# source /etc/profile

4）Start glusterfs
[root@GlusterFS-master ~]# /usr/local/glusterfs/sbin/glusterd
[root@GlusterFS-master ~]# ps -ef|grep glusterd
root       852     1  0 10:14 ?        00:00:00 /usr/local/glusterfs/sbin/glusterd
root       984 26217  0 10:14 pts/1    00:00:00 grep --color=auto glusterd
[root@GlusterFS-master ~]# lsof -i:24007
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
glusterd 852 root    9u  IPv4 123605      0t0  TCP *:24007 (LISTEN)

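3）Create the GlusterFS distributed storage cluster (done here on GlusterFS-master; any node works)

1）Run the following command to add node 192.168.10.212 to the cluster (an IP address or the node's host name both work). Run it once for each node that should join: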
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.212
peer probe: success. 

2）Check the cluster status:
[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 1
Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

3）Check the volume information (no volume has been created yet, so nothing is listed):
[root@GlusterFS-master ~]# gluster volume info
No volumes present

4）Create the data storage directory (on both GlusterFS-master and GlusterFS-slave)
[root@GlusterFS-master ~]# mkdir -p /opt/gluster/data

5）Create the replicated volume models on the directory just created. replica 2 means two copies are kept, one per node in this two-node setup; the servers' storage directories follow.
[root@GlusterFS-master ~]# gluster volume create models replica 2 192.168.10.239:/opt/gluster/data 192.168.10.212:/opt/gluster/data force

6）Check the volume information again
[root@GlusterFS-master ~]# gluster volume info
 
Volume Name: models
Type: Replicate
Volume ID: f1945b0b-67d6-4202-9198-639244ab0a6a
Status: Created
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/opt/gluster/data
Brick2: 192.168.10.212:/opt/gluster/data

7）Start models
[root@GlusterFS-master ~]# gluster volume start models

8）gluster performance tuning
a）First enable quota on the volume
[root@GlusterFS-master ~]# gluster volume quota models enable

b）Limit the root directory of models to at most 5GB (5GB is only an example; size it to the actual disk)
[root@GlusterFS-master ~]# gluster volume quota models limit-usage / 5GB

c）Set the cache size (128MB is only an example; performance.cache-size is a read cache held in memory, so size it to the available RAM)
[root@GlusterFS-master ~]# gluster volume set models performance.cache-size 128MB

d）Enable flush-behind, so flushes are handled asynchronously in the background
[root@GlusterFS-master ~]# gluster volume set models performance.flush-behind on

e）Set the number of io threads to 32
[root@GlusterFS-master ~]# gluster volume set models performance.io-thread-count 32

f）Enable write-behind (writes go to a cache first and are written to disk afterwards)
[root@GlusterFS-master ~]# gluster volume set models performance.write-behind on

g）Check the volume information after tuning
[root@GlusterFS-master ~]# gluster volume info
 
Volume Name: models
Type: Replicate
Volume ID: f1945b0b-67d6-4202-9198-639244ab0a6a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/opt/gluster/data
Brick2: 192.168.10.212:/opt/gluster/data
Options Reconfigured:
performance.write-behind: on
performance.io-thread-count: 32
performance.flush-behind: on
performance.cache-size: 128MB
features.quota: on

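4）Deploy the client and mount the GlusterFS file system (its bricks, i.e. storage units) (on the Client machine)

Most of the work on the storage cluster is done at this point. What remains is to mount the volume on a directory; anything written through that mount point
is stored on the cluster. After mounting, write some files as a test.

Note:
Data written through the client's glusterfs mount is stored in the storage directories of the server nodes.
If a node fails or its storage directory is damaged while its replica node is healthy, no data under the client mount point is lost.
If the failed node has no replica, or all of its replicas have failed as well, the data it held is lost from the client mount point.
With a distributed volume (a hash volume, which has no replicas), losing one node means losing part of the data under the client mount point.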
 
1）Install gluster-client
[root@Client ~]# yum install -y glusterfs glusterfs-fuse
 
2）Create the mount point directory
[root@Client ~]# mkdir -p /opt/gfsmount
 
3）Mount GlusterFS
[root@Client ~]# mount -t glusterfs 192.168.10.239:models /opt/gfsmount/
 
4）Check the mount
[root@Client ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   38G  4.3G   33G  12% /
devtmpfs                 1.9G     0  1.9G   0% /dev
tmpfs                    1.9G     0  1.9G   0% /dev/shm
tmpfs                    1.9G  8.6M  1.9G   1% /run
tmpfs                    1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1               1014M  143M  872M  15% /boot
/dev/mapper/centos-home   19G   33M   19G   1% /home
tmpfs                    380M     0  380M   0% /run/user/0
overlay                   38G  4.3G   33G  12% /var/lib/docker/overlay2/9904ac8cbcba967de3262dc0d5e230c64ad3c1c53b588048e263767d36df8c1a/merged
shm                       64M     0   64M   0% /var/lib/docker/containers/222ec7f21b2495591613e0d1061e4405cd57f99ffaf41dbba1a98c350cd70f60/mounts/shm
192.168.10.239:models     38G  3.9G   34G  11% /opt/gfsmount
 
5）Test. Create two large files of 30M and 300M; both writes finish quickly.
[root@Client ~]# time dd if=/dev/zero of=/opt/gfsmount/kevin bs=30M count=1
1+0 records in
1+0 records out
31457280 bytes (31 MB) copied, 0.140109 s, 225 MB/s
 
real    0m0.152s
user    0m0.001s
sys 0m0.036s
 
[root@Client ~]# time dd if=/dev/zero of=/opt/gfsmount/grace bs=300M count=1
1+0 records in
1+0 records out
314572800 bytes (315 MB) copied, 1.07577 s, 292 MB/s
 
real    0m1.106s
user    0m0.001s
sys 0m0.351s
 
[root@Client ~]# cd /opt/gfsmount/
[root@Client gfsmount]# du -sh *
300M    grace
30M     kevin
[root@Client gfsmount]# mkdir test
[root@Client gfsmount]# ll
total 337924
-rw-r--r--. 1 root root 314572800 Apr  7 22:41 grace
-rw-r--r--. 1 root root  31457280 Apr  7 22:41 kevin
drwxr-xr-x. 2 root root      4096 Apr  7 22:43 test
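
The dd timings above may partly reflect caching on the client (write-behind is enabled on the volume) rather than the time needed to get the data onto the bricks. A hedged variant is to ask dd to flush before it reports, using GNU dd's conv=fsync. The sketch below writes to a scratch file so it can run anywhere; the variable name target is just for illustration, and on the client it would point at a file under /opt/gfsmount instead.

```shell
# Scratch file for the demonstration; on the client, use a path under
# /opt/gfsmount (for example /opt/gfsmount/kevin) instead.
target="$(mktemp)"

# conv=fsync makes dd flush the file data before it exits, so the reported
# time includes pushing the data out instead of only filling a cache.
dd if=/dev/zero of="$target" bs=1M count=30 conv=fsync

# Confirm the size in bytes (30 * 1024 * 1024).
wc -c < "$target"
```

Comparing the throughput with and without conv=fsync gives a rough idea of how much of the measured speed comes from caching.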
 
6）Check what is stored on the cluster (on GlusterFS-master and GlusterFS-slave)
[root@GlusterFS-master ~]# cd /opt/gluster/data/
[root@GlusterFS-master data]# ll
total 337920
-rw-r--r--. 2 root root 314572800 Apr  8 10:41 grace
-rw-r--r--. 2 root root  31457280 Apr  8 10:41 kevin
drwxr-xr-x. 2 root root         6 Apr  8 10:43 test
 
[root@GlusterFS-slave ~]# cd /opt/gluster/data/
[root@GlusterFS-slave data]# ll
total 337920
-rw-r--r--. 2 root root 314572800 Apr  7 22:41 grace
-rw-r--r--. 2 root root  31457280 Apr  7 22:41 kevin
drwxr-xr-x. 2 root root         6 Apr  7 22:43 test

Note: every gluster server node holds a copy, as expected from the step above that created the replicated volume models with replica 2 (two copies).

5）GlusterFS commands

1）List all volumes in GlusterFS
[root@GlusterFS-master ~]# gluster volume list
models
 
2）Start a volume, for example the volume named models
[root@GlusterFS-master ~]# gluster volume start models
 
3）Stop a volume, for example the volume named models
[root@GlusterFS-master ~]# gluster volume stop models
 
4）Delete a volume, for example the volume named models
[root@GlusterFS-master ~]# gluster volume delete models
 
5）Verify the GlusterFS cluster. The following three commands can be used
[root@GlusterFS-master ~]# gluster peer status 
Number of Peers: 1
 
Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)
 
[root@GlusterFS-master ~]# gluster pool list 
UUID                    Hostname        State
f8e69297-4690-488e-b765-c1c404810d6a    192.168.10.212  Connected
5dfd40e2-096b-40b5-bee3-003b57a39007    localhost       Connected
 
[root@GlusterFS-master ~]# gluster volume status
Status of volume: models
Gluster process                     Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/opt/gluster/data          49152   Y   1055
Brick 192.168.10.212:/opt/gluster/data          49152   Y   32586
NFS Server on localhost                 N/A N   N/A
Self-heal Daemon on localhost               N/A Y   1074
Quota Daemon on localhost               N/A Y   1108
NFS Server on 192.168.10.212                N/A N   N/A
Self-heal Daemon on 192.168.10.212          N/A Y   32605
Quota Daemon on 192.168.10.212              N/A Y   32614
  
Task Status of Volume models
------------------------------------------------------------------------------
There are no active volume tasks
 
 
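6）Remove nodes from the GlusterFS cluster; several can be removed at once. The example below removes glusterfs3 and glusterfs4.
For a replicated volume, the number of nodes removed must be a multiple of the replica count.
A node that still hosts bricks cannot be detached by default; force overrides this, but forcing removal is not recommended.
Remove the bricks on those nodes from their volumes first.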
[root@GlusterFS-master ~]# gluster peer detach glusterfs3 glusterfs4  force
Or do it in the interactive mode of the gluster command:
[root@GlusterFS-master ~]# gluster
gluster> peer detach glusterfs3 glusterfs4  force
 
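7）Expand a volume (with a replica count of 2, machines must be added in multiples of 2: 2, 4, 6, 8, ...).
Important: for a replicated or striped volume, the number of bricks added each time must be a multiple of the replica or stripe count.
For example, add nodes glusterfs3 and glusterfs4 and add their bricks to the volume named glusterfs_data.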
[root@GlusterFS-master ~]# gluster peer probe glusterfs3   
[root@GlusterFS-master ~]# gluster peer probe glusterfs4   
[root@GlusterFS-master ~]# gluster volume add-brick glusterfs_data glusterfs3:/opt/gluster/data glusterfs4:/opt/gluster/data force
 
8）Rebalance a volume (glusterfs_data is the volume name)
[root@GlusterFS-master ~]# gluster volume rebalance glusterfs_data start 
[root@GlusterFS-master ~]# gluster volume rebalance glusterfs_data status 
[root@GlusterFS-master ~]# gluster volume rebalance glusterfs_data stop 
 
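Rebalancing only applies to a distributed volume with at least two distribute subvolumes (for replica 2, at least two replica pairs, i.e. four bricks).
The models volume in the example above is a pure replicated volume (1 x 2), a single replica pair, so it cannot be rebalanced: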
[root@GlusterFS-master ~]# gluster volume list
models
[root@GlusterFS-master ~]# gluster volume rebalance models start
volume rebalance: models: failed: Volume models is not a distribute volume or contains only 1 brick.
Not performing rebalance
[root@GlusterFS-master ~]#
 
9）Shrink a volume (gluster first has to move the data elsewhere; gv0 is the volume name). Note that for a replicated or striped volume, the number of bricks removed each time must be a multiple of the replica or stripe count.
[root@GlusterFS-master ~]# gluster volume remove-brick gv0 glusterfs3:/data/brick1/gv0 glusterfs4:/data/brick1/gv0 start      //start the migration
[root@GlusterFS-master ~]# gluster volume remove-brick gv0 glusterfs3:/data/brick1/gv0 glusterfs4:/data/brick1/gv0 status     //check the migration status
[root@GlusterFS-master ~]# gluster volume remove-brick gv0 glusterfs3:/data/brick1/gv0 glusterfs4:/data/brick1/gv0 commit     //commit once the migration has finished
 
10）Migrate a volume

#To migrate the data of glusterfs3 to glusterfs5, first add glusterfs5 to the cluster
[root@GlusterFS-master ~]# gluster peer probe glusterfs5

#Start the migration
[root@GlusterFS-master ~]# gluster volume replace-brick gv0 glusterfs3:/data/brick1/gv0 glusterfs5:/data/brick1/gv0 start

#Check the migration status
[root@GlusterFS-master ~]# gluster volume replace-brick gv0 glusterfs3:/data/brick1/gv0 glusterfs5:/data/brick1/gv0 status

#Commit after the data migration has finished
[root@GlusterFS-master ~]# gluster volume replace-brick gv0 glusterfs3:/data/brick1/gv0 glusterfs5:/data/brick1/gv0 commit

#If the machine glusterfs3 has failed and can no longer run, force the commit
[root@GlusterFS-master ~]# gluster volume replace-brick gv0 glusterfs3:/data/brick1/gv0 glusterfs5:/data/brick1/gv0 commit force

#Trigger a full heal (sync) of the whole volume
[root@GlusterFS-master ~]# gluster volume heal gfs full
 
11）Access control. The following allows clients in the 192.168 network to access the glusterfs volume.
[root@GlusterFS-master ~]# gluster volume set gfs auth.allow 192.168.*

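6）Summary

With the steps above, the GlusterFS distributed storage cluster is up. A few points to keep in mind:
1）If a GlusterFS node is rebooted, then afterwards:
   a）the glusterFS service (glusterd) must be started
   b）the models volume must be started
   c）the directory /opt/gfsmount/ must be mounted again
   d）after mounting, re-enter /opt/gfsmount/ (cd into it again)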
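
Remounting /opt/gfsmount by hand after every client reboot can be avoided with an /etc/fstab entry. The sketch below only prints such a line; the helper name gluster_fstab_line is made up for illustration, and the server:/volume form with the _netdev option follows the commonly documented GlusterFS fstab layout rather than anything shown in the original post.

```shell
# Hypothetical helper: print an /etc/fstab line for a GlusterFS FUSE mount.
gluster_fstab_line() {
    server="$1"; volume="$2"; mountpoint="$3"
    # _netdev tells the system to wait for the network before mounting.
    printf '%s:/%s %s glusterfs defaults,_netdev 0 0\n' "$server" "$volume" "$mountpoint"
}

# Print the entry for the setup in this post; review it, then append it to
# /etc/fstab on the Client.
gluster_fstab_line 192.168.10.239 models /opt/gfsmount
```

After the line has been added to /etc/fstab, mount -a on the Client should bring the mount up without a reboot, which is a quick way to check the entry.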

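2）Note:
When two partitions are mounted on the same directory, the first is not overwritten but temporarily hidden. For example,
after "mount /dev/sda1 /opt/gfsmount/" followed by "mount /dev/sda2 /opt/gfsmount/",
the contents of /dev/sda1 are hidden; once the second partition is unmounted with "umount /dev/sda2",
"cd /opt/gfsmount/" shows the contents of the first partition again.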

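3）gluster ports
The glusterd process listens on port 24007.
Each brick process (glusterfsd) listens on its own port, allocated from 49152 upward (49152 in the volume status output above).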

4）The six volume types
---------------Replicated volume---------------
Syntax: gluster volume create NEW-VOLNAME [replica COUNT] [transport tcp | rdma | tcp, rdma] NEW-BRICK
Example:
[root@GlusterFS-master ~]# gluster volume create test-volume replica 2 transport tcp server1:/exp1/brick server2:/exp2/brick

---------------Striped volume---------------
Syntax: gluster volume create NEW-VOLNAME [stripe COUNT] [transport tcp | rdma | tcp, rdma] NEW-BRICK...
Example:
[root@GlusterFS-master ~]# gluster volume create test-volume stripe 2 transport tcp server1:/exp1/brick server2:/exp2/brick

---------------Distributed volume (hash volume)---------------
Syntax: gluster volume create NEW-VOLNAME [transport tcp | rdma | tcp, rdma] NEW-BRICK
Example 1:
[root@GlusterFS-master ~]# gluster volume create test-volume server1:/exp1/brick server2:/exp2/brick
Example 2:
[root@GlusterFS-master ~]# gluster volume create test-volume transport rdma server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick

---------------Distributed replicated volume---------------
Syntax: gluster volume create NEW-VOLNAME [replica COUNT] [transport tcp | rdma | tcp, rdma] NEW-BRICK...
Example:
[root@GlusterFS-master ~]# gluster volume create test-volume replica 2 transport tcp server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick

---------------Distributed striped volume---------------
Syntax: gluster volume create NEW-VOLNAME [stripe COUNT] [transport tcp | rdma | tcp, rdma] NEW-BRICK...
Example:
[root@GlusterFS-master ~]# gluster volume create test-volume stripe 2 transport tcp server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick

---------------Striped replicated volume---------------
Syntax: gluster volume create NEW-VOLNAME [stripe COUNT] [replica COUNT] [transport tcp | rdma | tcp, rdma] NEW-BRICK...
Example:
[root@GlusterFS-master ~]# gluster volume create test-volume stripe 2 replica 2 transport tcp server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick


5）Viewing volumes
[root@GlusterFS-master ~]# gluster volume list              //list all volumes in the cluster
[root@GlusterFS-master ~]# gluster volume info [all]        //show volume information
[root@GlusterFS-master ~]# gluster volume status [all]      //show volume status

6）Brick management
An example: replace 192.168.10.151:/mnt/brick0 with 192.168.10.152:/mnt/brick2

6.1）Start the replacement
[root@GlusterFS-slave ~]# gluster volume replace-brick test-volume 192.168.10.151:/mnt/brick0 192.168.10.152:/mnt/brick2 start
Error: volume replace-brick: failed: /data/share2 or a prefix of it is already part of a volume

This means /mnt/brick2 was used as a brick before. To fix it:
[root@GlusterFS-slave ~]# rm -rf /mnt/brick2/.glusterfs

[root@GlusterFS-slave ~]# setfattr -x trusted.glusterfs.volume-id /mnt/brick2
[root@GlusterFS-slave ~]# setfattr -x trusted.gfid  /mnt/brick2

//As above, run the replace-brick command with start; the data on the original brick then begins migrating to the replacement brick.

6.2）Check whether the replacement has finished
[root@GlusterFS-slave ~]# gluster volume replace-brick test-volume 192.168.10.151:/mnt/brick0 192.168.10.152:/mnt/brick2 status

6.3）While data is migrating, the abort command cancels the brick replacement.
[root@GlusterFS-slave ~]# gluster volume replace-brick test-volume 192.168.10.151:/mnt/brick0 192.168.10.152:/mnt/brick2 abort

6.4）After the migration has finished, run commit to complete the task and perform the replacement. volume info then shows the new brick.
[root@GlusterFS-slave ~]# gluster volume replace-brick test-volume 192.168.10.151:/mnt/brick0 192.168.10.152:/mnt/brick2 commit
Data added under /sf/data/vs/gfs/rep2 from now on is synchronized to 192.168.10.152:/mnt/brick0 and 192.168.10.152:/mnt/brick2, and no longer
to 192.168.10.151:/mnt/brick0.

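7）Shrinking a volume
Migrate the data to other available bricks first and remove the brick only after the migration has finished:
[root@GlusterFS-slave ~]# gluster volume remove-brick <VOLNAME> <BRICK> start

After start, use status to follow the progress of the removal:
[root@GlusterFS-slave ~]# gluster volume remove-brick <VOLNAME> <BRICK> status

Remove the brick directly, without migrating data:
[root@GlusterFS-slave ~]# gluster volume remove-brick <VOLNAME> <BRICK> commit

Note: for a replicated or striped volume, the number of bricks removed each time must be a multiple of the replica or stripe count.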
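
The multiple-of-replica rule for adding or removing bricks is easy to check before running add-brick or remove-brick. The following is a small sketch of that arithmetic only; the function name bricks_match_replica is hypothetical and it does not talk to gluster at all.

```shell
# Hypothetical helper: succeed only if BRICKS is a positive multiple of REPLICA.
bricks_match_replica() {
    replica="$1"; bricks="$2"
    [ "$replica" -gt 0 ] && [ "$bricks" -gt 0 ] && [ $((bricks % replica)) -eq 0 ]
}

# Try a few brick counts against replica 2.
for n in 1 2 3 4; do
    if bricks_match_replica 2 "$n"; then
        echo "$n brick(s): acceptable for replica 2"
    else
        echo "$n brick(s): not a multiple of replica 2"
    fi
done
```

A wrapper script could call this check with the number of bricks it is about to pass to gluster and stop early when the count does not fit.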

8）Migrating a volume
Use start to begin the migration:
[root@GlusterFS-slave ~]# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> start

While data is migrating, pause suspends the migration:
[root@GlusterFS-slave ~]# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> pause

While data is migrating, abort cancels the migration:
[root@GlusterFS-slave ~]# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> abort

While data is migrating, status shows the progress:
[root@GlusterFS-slave ~]# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> status

After the migration has finished, commit performs the brick replacement:
[root@GlusterFS-slave ~]# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> commit

7）Troubleshooting

1）Re-adding a brick that was used before fails with:
Error: volume replace-brick: failed: /data/share2 or a prefix of it is already part of a volume

This means /mnt/brick2 was used as a brick before. To fix it:
[root@GlusterFS-slave ~]# rm -rf /mnt/brick2/.glusterfs
[root@GlusterFS-slave ~]# setfattr -x trusted.glusterfs.volume-id /mnt/brick2  // remove the directory's extended attributes
[root@GlusterFS-slave ~]# setfattr -x trusted.gfid  /mnt/brick2

2）Operations on the glusterfs mount point fail with: 傳輸端點尚未連線 ("Transport endpoint is not connected")
volume info reports everything as normal.
[root@GlusterFS-slave ~]# ls
ls: 無法開啟目錄.: 傳輸端點尚未連線
[root@GlusterFS-slave ~]# df
檔案系統             1K-塊      已用      可用 已用% 掛載點
/dev/mapper/vg_vclassftydc-lv_root
                      13286512   3914004   8674552  32% /
tmpfs                  2013148         4   2013144   1% /dev/shm
/dev/sda1               487652     31759    426197   7% /boot
/dev/mapper/vg_vclassftydc-lv_var
                      20511356    496752  18949644   3% /var
/dev/mapper/ssd-ssd   20836352   9829584  11006768  48% /mnt/787fe74d9bef93c17a7aa195a05245b3
/dev/mapper/defaultlocal-defaultlocal
                      25567232  10059680  15507552  40% /mnt/ab796251092bf7c1d7657e728940927b
df: "/mnt/glusterfs-mnt": 傳輸端點尚未連線
[root@GlusterFS-slave ~]# cd ..
-bash: cd: ..: 傳輸端點尚未連線
[root@GlusterFS-slave ~]# 

Cause: the glusterfs mount had already been unmounted.
Some time after mounting glusterfs it was found unmounted; the mount log shows:
[2017-06-28 05:45:51.861186] I [dht-layout.c:726:dht_layout_dir_mismatch] 0-test-vol2-dht: / - disk layout missing
[2017-06-28 05:45:51.861332] I [dht-common.c:623:dht_revalidate_cbk] 0-test-vol2-dht: mismatching layouts for /
[2017-06-28 05:46:02.499449] I [dht-layout.c:726:dht_layout_dir_mismatch] 0-test-vol2-dht: / - disk layout missing
[2017-06-28 05:46:02.499520] I [dht-common.c:623:dht_revalidate_cbk] 0-test-vol2-dht: mismatching layouts for /
[2017-06-28 05:46:05.701259] I [fuse-bridge.c:4628:fuse_thread_proc] 0-fuse: unmounting /mnt/glusterfs-mnt/
[2017-06-28 05:46:05.701898] W [glusterfsd.c:1002:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f09e546090d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f09e5aff851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x40533d]))) 0-: received signum (15), shutting down
[2017-06-28 05:46:05.701911] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt/glusterfs-mnt/'.

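3）Restoring a node's configuration
Symptom: the configuration on one node is incorrect
Simulated by: deleting part of the configuration on server2 (located in /var/lib/glusterd/)
Fix: trigger a resync of the configuration with the gluster tool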
[root@GlusterFS-slave ~]# gluster volume sync server1 all

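4）Inconsistent data in a replicated volume
Symptom: the two copies of a two-replica volume differ
Simulated by: deleting the data on one brick
Fix: remount the volume

5）A replicated volume's directory was deleted
Fix: replace the brick first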
[root@GlusterFS-slave ~]# gluster volume replace-brick bbs_img 10.20.0.201:/brick1/share2 10.20.0.201:/brick1/share start
[root@GlusterFS-slave ~]# gluster volume replace-brick bbs_img 10.20.0.201:/brick1/share2 10.20.0.201:/brick1/share commit

Restoring the original brick afterwards reports an error
Fix:
[root@GlusterFS-slave ~]# rm -rf /data/share2/.glusterfs
[root@GlusterFS-slave ~]# setfattr -x  trusted.glusterfs.volume-id /data/share2
[root@GlusterFS-slave ~]# setfattr -x trusted.gfid /data/share2

6）Removing a brick fails as follows
[root@GlusterFS-slave ~]# gluster volume remove-brick test_vol01 192.168.10.186:/mnt/63ef41a63399e6640a3c4abefa725497 192.168.10.186:/mnt/ad242fbe177ba330a0ea75a9d23fc936 force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.
Cause: the nodes run different GlusterFS versions (mismatched op-version).

7）A gluster mount point is unusable and cannot be unmounted
[root@GlusterFS-slave ~]# umount /mnt/d1f561a32ac1bf17cf183f36baac34d4
umount: /mnt/d1f561a32ac1bf17cf183f36baac34d4: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
Cause: a glusterfs process is still using it. Use ps to find the process whose arguments contain the mount point and kill it; umount then works.

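===========================Adding nodes===========================

The example above used only two nodes; a production environment may well need more. The following briefly shows how to add nodes.
Because the volume created above is a replicated volume, the number of nodes added must be a multiple of the replica count, so at least two nodes are needed.

The two new nodes are:
192.168.10.204  GlusterFS-slave2
192.168.10.220  GlusterFS-slave3

1）First add the hosts entries to /etc/hosts on all four nodes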
192.168.10.239  GlusterFS-master
192.168.10.212  GlusterFS-slave
192.168.10.204  GlusterFS-slave2
192.168.10.220  GlusterFS-slave3

2）Disable the firewall and synchronize the system time

3）Following the steps above, install userspace-rcu-master and glusterfs on the two new nodes

4）Create the storage directory on the two new nodes
# mkdir -p /opt/gluster/data

5）Add the new nodes to the GlusterFS cluster (on GlusterFS-master)
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.204
[root@GlusterFS-master ~]# gluster peer probe 192.168.10.220
---------------------------------------------------------------------
If the following error appears:
peer probe: failed: Probe returned with unknown errno 107
Possible causes to check:
Is the firewall on the target server disabled? Is glusterd running there? Does ping reach it?
---------------------------------------------------------------------

[root@GlusterFS-master ~]# gluster peer status         //this command can be run on any node
Number of Peers: 3

Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)

Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)

[root@GlusterFS-master ~]# gluster volume info        //this command can be run on any node
 
Volume Name: models
Type: Replicate
Volume ID: f1945b0b-67d6-4202-9198-639244ab0a6a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/opt/gluster/data
Brick2: 192.168.10.212:/opt/gluster/data
Options Reconfigured:
performance.write-behind: on
performance.io-thread-count: 32
performance.flush-behind: on
performance.cache-size: 128MB
features.quota: on


6）Expand the volume (add the bricks of the two new nodes to the models volume above)

First unmount the existing glusterfs mount on the Client
[root@Client ~]# umount /opt/gfsmount

Then stop the models volume from GlusterFS-master
[root@GlusterFS-master ~]# gluster volume stop models
[root@GlusterFS-master ~]# gluster volume status models

Then expand the volume
[root@GlusterFS-master ~]# gluster volume add-brick models 192.168.10.204:/opt/gluster/data 192.168.10.220:/opt/gluster/data force

Check the volume information; the new nodes have been added
[root@GlusterFS-master ~]# gluster volume info
Volume Name: models
Type: Distributed-Replicate
Volume ID: f1945b0b-67d6-4202-9198-639244ab0a6a
Status: Stopped
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/opt/gluster/data
Brick2: 192.168.10.212:/opt/gluster/data
Brick3: 192.168.10.204:/opt/gluster/data
Brick4: 192.168.10.220:/opt/gluster/data
Options Reconfigured:
performance.write-behind: on
performance.io-thread-count: 32
performance.flush-behind: on
performance.cache-size: 128MB
features.quota: on

Then start the models volume again from GlusterFS-master
[root@GlusterFS-master ~]# gluster volume start models
volume start: models: success
[root@GlusterFS-master ~]# gluster volume status models
Status of volume: models
Gluster process           Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/opt/gluster/data      49152 Y 5405
Brick 192.168.10.212:/opt/gluster/data      49152 Y 2665
Brick 192.168.10.204:/opt/gluster/data      49152 Y 8788
Brick 192.168.10.220:/opt/gluster/data      49152 Y 12117
NFS Server on localhost         N/A N N/A
Self-heal Daemon on localhost       N/A Y 5426
Quota Daemon on localhost       N/A Y 5431
NFS Server on 192.168.10.212        N/A N N/A
Self-heal Daemon on 192.168.10.212      N/A Y 2684
Quota Daemon on 192.168.10.212        N/A Y 2691
NFS Server on 192.168.10.204        N/A N N/A
Self-heal Daemon on 192.168.10.204      N/A Y 8807
Quota Daemon on 192.168.10.204        N/A Y 8814
NFS Server on 192.168.10.220        N/A N N/A
Self-heal Daemon on 192.168.10.220      N/A Y 12136
Quota Daemon on 192.168.10.220        N/A Y 12143
 
Task Status of Volume models
------------------------------------------------------------------------------
There are no active volume tasks


Next, remount glusterfs on the Client
[root@Client ~]# mount -t glusterfs 192.168.10.239:models /opt/gfsmount/
[root@Client ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
.......
192.168.10.239:models     75G  6.0G   69G   8% /opt/gfsmount

Then rebalance. The volume now consists of two replica pairs (2 x 2), so it can be rebalanced.
[root@GlusterFS-master data]# gluster volume rebalance models start 
volume rebalance: models: success: Initiated rebalance on volume models.
Execute "gluster volume rebalance <volume-name> status" to check status.
ID: 2ff550ae-043d-457a-8af6-873d9f4ce7ee

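Finally, test from the Client:
Note:
1）The storage directories of the nodes newly added to models do not contain the data written earlier to the other nodes; they only receive newly written data.
2）Because the volume was created with two copies (replica 2), each new file is written to the storage directories of only two nodes, not all four.

Observations from the client test:
1）A file created in the glusterfs mount directory ends up on only two of the nodes. Small files appeared to land consistently on the same two nodes, while large files landed on varying pairs.
2）A directory created in the glusterfs mount directory is created on all nodes, but a file created inside it ends up on only two of them. Small files again appeared to land consistently on the same two nodes,
   while large files landed on varying pairs.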
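
Which two nodes form a pair follows from the order of the bricks: with replica N, each run of N consecutive bricks in the brick list forms one replica set, and a file is stored on exactly one set. The sketch below only illustrates that grouping; the function name show_replica_sets is hypothetical, and it does not reproduce how GlusterFS hashes file names to pick a set.

```shell
# Hypothetical helper: print how a brick list is grouped into replica sets.
# Usage: show_replica_sets REPLICA_COUNT BRICK...
show_replica_sets() {
    replica="$1"; shift
    i=0; set_no=1; group=""
    for brick in "$@"; do
        # Collect bricks until the current set holds REPLICA_COUNT of them.
        group="${group:+$group }$brick"
        i=$((i + 1))
        if [ "$i" -eq "$replica" ]; then
            echo "replica set $set_no: $group"
            set_no=$((set_no + 1)); i=0; group=""
        fi
    done
}

# Bricks in the order shown by "gluster volume info" for the expanded models volume.
show_replica_sets 2 \
    192.168.10.239:/opt/gluster/data 192.168.10.212:/opt/gluster/data \
    192.168.10.204:/opt/gluster/data 192.168.10.220:/opt/gluster/data
```

For the volume in this post this pairs 192.168.10.239 with 192.168.10.212 and 192.168.10.204 with 192.168.10.220, which matches the observation that every file shows up on exactly two nodes.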

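The rebalance can also be started again from GlusterFS-master. It updates the directory layout on all bricks and migrates existing files between the replica pairs to even out the distribution; each file still lives on one pair only. Repeating the tests afterwards gives the same results as described above.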
[root@GlusterFS-master data]# gluster volume rebalance models start
