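redis-trib.rb is the official management tool for Redis Cluster. It needs no separate download: by default it sits in the src directory of the source package. Because the tool is written in Ruby, however, its runtime dependencies have to be set up first.
Preparing the runtime environment for redis-trib.rb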
wget https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz
tar xvf ruby-2.5.1.tar.gz
cd ruby-2.5.1/
./configure --prefix=/usr/local/ruby
make
make install
cd /usr/local/ruby/
cp bin/ruby /usr/local/bin
cp bin/gem /usr/local/bin
Installing the redis gem dependency
wget http://rubygems.org/downloads/redis-3.3.0.gem
gem install -l redis-3.3.0.gem
Operations supported by redis-trib.rb
# redis-trib.rb help
Usage: redis-trib <command> <options> <arguments ...>
  create          host1:port1 ... hostN:portN
                  --replicas <arg>
  check           host:port
  info            host:port
  fix             host:port
                  --timeout <arg>
  reshard         host:port
                  --from <arg>
                  --to <arg>
                  --slots <arg>
                  --yes
                  --timeout <arg>
                  --pipeline <arg>
  rebalance       host:port
                  --weight <arg>
                  --auto-weights
                  --use-empty-masters
                  --timeout <arg>
                  --simulate
                  --pipeline <arg>
                  --threshold <arg>
  add-node        new_host:new_port existing_host:existing_port
                  --slave
                  --master-id <arg>
  del-node        host:port node_id
  set-timeout     host:port milliseconds
  call            host:port command arg arg .. arg
  import          host:port
                  --from <arg>
                  --copy
                  --replace
  help            (show this help)
For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.
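The supported operations are:
1. create: create a cluster
2. check: check a cluster
3. info: show cluster information
4. fix: repair a cluster
5. reshard: migrate slots online
6. rebalance: balance the number of slots across cluster nodes
7. add-node: add a new node
8. del-node: delete a node
9. set-timeout: set the node timeout
10. call: run a command on every node in the cluster
11. import: import data from an external Redis instance into the cluster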
Creating a cluster
redis-trib.rb create --replicas 1 127.0.0.1:6379 127.0.0.1:6380 127.0.0.1:6381 127.0.0.1:6382 127.0.0.1:6383 127.0.0.1:6384
The --replicas option specifies how many replicas each master in the cluster gets; it is set to 1 here.
>>> Creating cluster
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
127.0.0.1:6379
127.0.0.1:6380
127.0.0.1:6381
Adding replica 127.0.0.1:6383 to 127.0.0.1:6379
Adding replica 127.0.0.1:6384 to 127.0.0.1:6380
Adding replica 127.0.0.1:6382 to 127.0.0.1:6381
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: bc775f9c4dea40820b82c9451778b1fcd42f92bc 127.0.0.1:6379
   slots:0-5460 (5461 slots) master
M: 3b27d00d13706a032a92ff6b0a914af272dcaaf2 127.0.0.1:6380
   slots:5461-10922 (5462 slots) master
M: d874f003257f1fb036bbd856ca605172a1741232 127.0.0.1:6381
   slots:10923-16383 (5461 slots) master
S: 648eb314863b82aaa676380be7db2ec307f5547d 127.0.0.1:6382
   replicates bc775f9c4dea40820b82c9451778b1fcd42f92bc
S: 65a6efb441ac44c348f7da8c62e26b888cda7c48 127.0.0.1:6383
   replicates 3b27d00d13706a032a92ff6b0a914af272dcaaf2
S: 57bda956485109552547aef6c77fba43d2124abf 127.0.0.1:6384
   replicates d874f003257f1fb036bbd856ca605172a1741232
Can I set the above configuration? (type `yes` to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: bc775f9c4dea40820b82c9451778b1fcd42f92bc 127.0.0.1:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 648eb314863b82aaa676380be7db2ec307f5547d 127.0.0.1:6382
   slots: (0 slots) slave
   replicates bc775f9c4dea40820b82c9451778b1fcd42f92bc
M: 3b27d00d13706a032a92ff6b0a914af272dcaaf2 127.0.0.1:6380
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 57bda956485109552547aef6c77fba43d2124abf 127.0.0.1:6384
   slots: (0 slots) slave
   replicates d874f003257f1fb036bbd856ca605172a1741232
S: 65a6efb441ac44c348f7da8c62e26b888cda7c48 127.0.0.1:6383
   slots: (0 slots) slave
   replicates 3b27d00d13706a032a92ff6b0a914af272dcaaf2
M: d874f003257f1fb036bbd856ca605172a1741232 127.0.0.1:6381
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
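All 16384 slots have been assigned, so the cluster was created successfully. Note: the node addresses passed to redis-trib.rb must belong to nodes that hold no slots and no data; otherwise the tool refuses to create the cluster.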
>>> Creating cluster
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
[ERR] Node 127.0.0.1:6379 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.
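Masters and replicas are chosen, and slots are assigned, by the following algorithm:
1> Group the nodes by host, so that the masters end up spread over as many hosts as possible.
2> Loop over the host groups, popping one node from each group in turn and appending it to the interleaved array, until every node has been popped.
3> Store the first nodes of the interleaved array, as many as there are masters, in the masters array (6 nodes with one replica each gives 3 masters here).
4> Compute how many slots each master is responsible for: 16384 divided by the number of masters, rounded to an integer. Call this N.
5> Loop over the masters array, giving each master about N slots; the last master takes whatever remains. Range boundaries are rounded, which is why the three masters above received 5461, 5462 and 5461 slots.
6> Next, assign replicas to the masters. The algorithm tries to keep a master and its replicas on different hosts. If nodes are still left over once every master has the requested number of replicas, masters are found for those nodes as well. The algorithm makes two passes over the masters array.
7> The first pass finds the requested number of replicas for each master among the remaining nodes. Each replica is the first remaining node whose host differs from the master's; if no such node exists, the first node in the remaining list is taken.
8> The second pass assigns the nodes left over when the node count does not divide evenly by the replica count.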
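Steps 1> to 5> can be sketched in a few lines of Python. This is only an illustration, not the script's own code: the function names are made up, and the rounding of range boundaries from a floating-point cursor is an assumption chosen because it reproduces the slot ranges printed above. Replica assignment (steps 6> to 8>) is left out.

```python
from collections import OrderedDict

CLUSTER_HASH_SLOTS = 16384


def interleave_by_host(nodes):
    """Steps 1-2: group "host:port" strings by host, then pop one node per host in turn."""
    by_host = OrderedDict()
    for node in nodes:
        host = node.rsplit(":", 1)[0]
        by_host.setdefault(host, []).append(node)
    interleaved = []
    while any(by_host.values()):
        for group in by_host.values():
            if group:
                interleaved.append(group.pop(0))
    return interleaved


def allocate_slots(nodes, replicas):
    """Steps 3-5: pick the masters and give each one a contiguous slot range."""
    interleaved = interleave_by_host(nodes)
    masters = interleaved[: len(nodes) // (replicas + 1)]
    slots_per_node = CLUSTER_HASH_SLOTS / len(masters)
    plan = {}
    first, cursor = 0, 0.0
    for index, master in enumerate(masters):
        # Assumption: the upper bound is rounded from a floating-point cursor.
        last = round(cursor + slots_per_node - 1)
        if index == len(masters) - 1:
            last = CLUSTER_HASH_SLOTS - 1  # the last master takes the rest
        last = max(last, first)  # every master gets at least one slot
        plan[master] = (first, last)
        first = last + 1
        cursor += slots_per_node
    return plan


if __name__ == "__main__":
    nodes = ["127.0.0.1:%d" % port for port in range(6379, 6385)]
    for master, (first, last) in allocate_slots(nodes, replicas=1).items():
        print(master, "slots:%d-%d" % (first, last))
    # 127.0.0.1:6379 slots:0-5460
    # 127.0.0.1:6380 slots:5461-10922
    # 127.0.0.1:6381 slots:10923-16383
```

Because all six nodes here share one host, interleaving leaves their order unchanged; with several hosts the first masters would come from different machines.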
Checking cluster status
redis-trib.rb check 127.0.0.1:6379
Any node in the cluster can be specified.
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: bc775f9c4dea40820b82c9451778b1fcd42f92bc 127.0.0.1:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 648eb314863b82aaa676380be7db2ec307f5547d 127.0.0.1:6382
   slots: (0 slots) slave
   replicates bc775f9c4dea40820b82c9451778b1fcd42f92bc
M: 3b27d00d13706a032a92ff6b0a914af272dcaaf2 127.0.0.1:6380
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 57bda956485109552547aef6c77fba43d2124abf 127.0.0.1:6384
   slots: (0 slots) slave
   replicates d874f003257f1fb036bbd856ca605172a1741232
S: 65a6efb441ac44c348f7da8c62e26b888cda7c48 127.0.0.1:6383
   slots: (0 slots) slave
   replicates 3b27d00d13706a032a92ff6b0a914af272dcaaf2
M: d874f003257f1fb036bbd856ca605172a1741232 127.0.0.1:6381
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Viewing cluster information
redis-trib.rb info 127.0.0.1:6383
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
127.0.0.1:6380 (3b27d00d...) -> 0 keys | 5462 slots | 1 slaves.
127.0.0.1:6381 (d874f003...) -> 1 keys | 5461 slots | 1 slaves.
127.0.0.1:6379 (bc775f9c...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
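Repairing the cluster
The fix command can currently repair two kinds of problem:
1. A node has slots that are in the middle of a migration (in importing or migrating state).
2. A node has unassigned slots.
Other problems cannot be repaired with fix.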
[root@slowtech conf]# redis-trib.rb fix 127.0.0.1:6379
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: d826c5fd98efa8a17a880e9a90a25f06c88e6ae9 127.0.0.1:6379
   slots: (0 slots) slave
   replicates a8b3d0f9b12d63dab3b7337d602245d96dd55844
S: 55c05d5b0dfea0d52f88548717ddf24975268de6 127.0.0.1:6383
   slots: (0 slots) slave
   replicates a8b3d0f9b12d63dab3b7337d602245d96dd55844
M: f413fb7e6460308b17cdb71442798e1341b56cbc 127.0.0.1:6381
   slots:50-16383 (16334 slots) master
   2 additional replica(s)
S: beba753c5a63607fa66d9ec7427ed9a511ea136e 127.0.0.1:6382
   slots: (0 slots) slave
   replicates f413fb7e6460308b17cdb71442798e1341b56cbc
S: 83797d518e56c235272402611477f576973e9d34 127.0.0.1:6384
   slots: (0 slots) slave
   replicates f413fb7e6460308b17cdb71442798e1341b56cbc
M: a8b3d0f9b12d63dab3b7337d602245d96dd55844 127.0.0.1:6380
   slots:0-49 (50 slots) master
   2 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
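To make the "unassigned slots" case concrete, here is a small Python sketch of a coverage check. It is not how redis-trib.rb implements the check; the helper names are hypothetical, and the input is simply the slots: field of each master as printed in the output above.

```python
CLUSTER_HASH_SLOTS = 16384


def parse_slot_ranges(spec):
    """Expand a string such as "0-3224,5461-13958" into a set of slot numbers."""
    slots = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            slots.update(range(int(first), int(last) + 1))
        else:
            slots.add(int(part))
    return slots


def uncovered_slots(masters):
    """Return the sorted slots that no master in {address: slot spec} claims."""
    covered = set()
    for spec in masters.values():
        covered |= parse_slot_ranges(spec)
    return sorted(set(range(CLUSTER_HASH_SLOTS)) - covered)


if __name__ == "__main__":
    masters = {"127.0.0.1:6381": "50-16383", "127.0.0.1:6380": "0-49"}
    print(len(uncovered_slots(masters)))  # 0: all 16384 slots are covered
```

An empty result corresponds to the "[OK] All 16384 slots covered." line; a non-empty one is the situation fix is meant to repair.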
Migrating slots online
Interactive use
For example:
redis-trib.rb reshard 127.0.0.1:6379
Any node in the cluster can be specified.
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: bc775f9c4dea40820b82c9451778b1fcd42f92bc 127.0.0.1:6379
   slots:3225-5460 (2236 slots) master
   1 additional replica(s)
S: 648eb314863b82aaa676380be7db2ec307f5547d 127.0.0.1:6382
   slots: (0 slots) slave
   replicates bc775f9c4dea40820b82c9451778b1fcd42f92bc
M: 3b27d00d13706a032a92ff6b0a914af272dcaaf2 127.0.0.1:6380
   slots:0-3224,5461-13958 (11723 slots) master
   1 additional replica(s)
S: 57bda956485109552547aef6c77fba43d2124abf 127.0.0.1:6384
   slots: (0 slots) slave
   replicates d874f003257f1fb036bbd856ca605172a1741232
S: 65a6efb441ac44c348f7da8c62e26b888cda7c48 127.0.0.1:6383
   slots: (0 slots) slave
   replicates 3b27d00d13706a032a92ff6b0a914af272dcaaf2
M: d874f003257f1fb036bbd856ca605172a1741232 127.0.0.1:6381
   slots:13959-16383 (2425 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 200
What is the receiving node ID? 3b27d00d13706a032a92ff6b0a914af272dcaaf2
Please enter all the source node IDs.
  Type `all` to use all the nodes as source nodes for the hash slots.
  Type `done` once you entered all the source nodes IDs.
Source node #1:
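It first asks how many slots to move; I entered 200 here.
Next it asks which node should receive the slots. This must be a node ID.
It then asks which nodes the slots should be taken from.
If you enter all, the slots to move are shared among the remaining master nodes; here, 127.0.0.1:6379 and 127.0.0.1:6381 each give up roughly 100 slots.
Slots can also be taken from specific nodes. In that case, enter the node ID of each source node and finish with done, as shown below: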
Source node #1:bc775f9c4dea40820b82c9451778b1fcd42f92bc
Source node #2:done
Ready to move 200 slots.
  Source nodes:
    M: bc775f9c4dea40820b82c9451778b1fcd42f92bc 127.0.0.1:6379
       slots:3225-5460 (2236 slots) master
       1 additional replica(s)
  Destination node:
    M: 3b27d00d13706a032a92ff6b0a914af272dcaaf2 127.0.0.1:6380
       slots:0-3224,5461-13958 (11723 slots) master
       1 additional replica(s)
  Resharding plan:
    Moving slot 3225 from bc775f9c4dea40820b82c9451778b1fcd42f92bc
    Moving slot 3226 from bc775f9c4dea40820b82c9451778b1fcd42f92bc
    Moving slot 3227 from bc775f9c4dea40820b82c9451778b1fcd42f92bc
    ...
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 3225 from 127.0.0.1:6379 to 127.0.0.1:6380: .
Moving slot 3226 from 127.0.0.1:6379 to 127.0.0.1:6380:
Moving slot 3227 from 127.0.0.1:6379 to 127.0.0.1:6380: ..
Moving slot 3228 from 127.0.0.1:6379 to 127.0.0.1:6380: ...
Finally, it asks whether to proceed.
Command-line use
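How the all option might split a request between several sources can be sketched as follows. The text above does not spell out the rule, so this Python sketch assumes that each source gives up slots in proportion to the number it currently holds, with the largest source rounding up and the others rounding down. That rule and the function name are assumptions for illustration, not something confirmed here.

```python
import math


def compute_reshard_table(sources, numslots):
    """sources maps a node address to the slots it owns; returns [(node, slot), ...]."""
    # Largest source first, so that it absorbs the rounding remainder.
    ordered = sorted(sources.items(), key=lambda item: len(item[1]), reverse=True)
    total = sum(len(slots) for _, slots in ordered)
    moved = []
    for index, (node, slots) in enumerate(ordered):
        share = numslots / total * len(slots)
        count = math.ceil(share) if index == 0 else math.floor(share)
        for slot in sorted(slots)[:count]:
            if len(moved) < numslots:
                moved.append((node, slot))
    return moved


if __name__ == "__main__":
    # Slot ownership of the two non-receiving masters in the session above.
    sources = {
        "127.0.0.1:6379": range(3225, 5461),
        "127.0.0.1:6381": range(13959, 16384),
    }
    table = compute_reshard_table(sources, 200)
    for node in sources:
        print(node, sum(1 for owner, _ in table if owner == node))
```

Under this assumption the two sources contribute close to, though not exactly, 100 slots each, because 127.0.0.1:6381 holds slightly more slots than 127.0.0.1:6379.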
redis-trib.rb reshard host:port --from <arg> --to <arg> --slots <arg> --yes --timeout <arg> --pipeline <arg>
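Where:
host:port: required. The address of any node in the cluster, used to fetch information about the whole cluster.
--from: the source node ID. Separate multiple source nodes with commas. If set to all, the sources are all master nodes in the cluster except the target node.
--to: the target node ID. Only one may be given.
--slots: the total number of slots to migrate.
--yes: migrate without asking the user for confirmation.
--timeout: the timeout for each migrate operation; defaults to 60000 milliseconds.
--pipeline: the number of keys migrated per batch; defaults to 10.
For example: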
redis-trib.rb reshard --from a8b3d0f9b12d63dab3b7337d602245d96dd55844 --to f413fb7e6460308b17cdb71442798e1341b56cbc --slots 10923 --yes --pipeline 20 127.0.0.1:6383
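Balancing the number of slots across cluster nodes
redis-trib.rb rebalance host:port --weight <arg> --auto-weights --use-empty-masters --timeout <arg> --simulate --pipeline <arg> --threshold <arg>
Where:
--weight <arg>: the weight of a node, in the form node_id=weight. To assign weights to several nodes, repeat the option, for example --weight b31e3a2e=5 --weight 60b8e3a1=5. node_id may be a prefix of the node name, as long as the prefix identifies the node uniquely. Nodes without a --weight option default to a weight of 1.
--auto-weights: automatically sets the weight of every node to the default of 1. If --weight and --auto-weights are both given, --auto-weights overrides the former.
--threshold <arg>: rebalance is carried out only when the amount of slots a node needs to move exceeds threshold.
--use-empty-masters: by default, masters with no slots assigned do not take part in the rebalance. Add this option to include them.
--timeout <arg>: the timeout for the migrate command.
--simulate: only report which slots would be moved, without actually migrating anything.
--pipeline <arg>: the number of keys fetched by each cluster getkeysinslot call; defaults to 10 if not given.
For example: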
# redis-trib.rb rebalance --weight a8b3d0f9b12d63dab3b7337d602245d96dd55844=3 --weight f413fb7e6460308b17cdb71442798e1341b56cbc=2 --use-empty-masters 127.0.0.1:6379
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Performing Cluster Check (using node 127.0.0.1:6379)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 2 nodes. Total weight = 5.0
Moving 3824 slots from 127.0.0.1:6380 to 127.0.0.1:6381
#########################################...
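The effect of the weights can be illustrated with a short Python sketch. It assumes that each master's target is its share of the 16384 slots in proportion to its weight, truncated to an integer; the function name and the starting slot counts are hypothetical, since the output above does not show how many slots each node held before the rebalance.

```python
CLUSTER_HASH_SLOTS = 16384


def rebalance_targets(current_slots, weights=None, default_weight=1.0):
    """current_slots maps node -> slot count; returns node -> (target, balance).

    A positive balance means the node holds more slots than its target and
    should give some up; a negative balance means it should receive slots.
    """
    weights = weights or {}
    node_weights = {
        node: float(weights.get(node, default_weight)) for node in current_slots
    }
    total_weight = sum(node_weights.values())
    result = {}
    for node, count in current_slots.items():
        # Assumption: the target is truncated, not rounded.
        target = int(CLUSTER_HASH_SLOTS / total_weight * node_weights[node])
        result[node] = (target, count - target)
    return result


if __name__ == "__main__":
    # Hypothetical starting counts, chosen only for illustration.
    current = {"127.0.0.1:6380": 13654, "127.0.0.1:6381": 2730}
    weights = {"127.0.0.1:6380": 3, "127.0.0.1:6381": 2}
    for node, (target, balance) in rebalance_targets(current, weights).items():
        print(node, "target", target, "balance", balance)
```

With weights 3 and 2 the total weight is 5.0, matching the "Total weight = 5.0" line, and the targets come out at 9830 and 6553 slots.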
Deleting a node
redis-trib.rb del-node host:port node_id
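A node must hold no slots before it can be deleted, so its slots have to be migrated elsewhere with redis-trib.rb reshard first.
Note that once all of a node's slots have been migrated away, its replicas are updated as well and point to the node that received the slots.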
# redis-trib.rb del-node 127.0.0.1:6379 8f7836a9a14fb6638530b42e04f5e58e28de0a6c
>>> Removing node 8f7836a9a14fb6638530b42e04f5e58e28de0a6c from cluster 127.0.0.1:6379
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
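Adding a new node
redis-trib.rb add-node new_host:new_port existing_host:existing_port --slave --master-id <arg>
Where:
new_host:new_port: the node to add. It must be empty and must not belong to another cluster; otherwise the following error is reported.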
[ERR] Node 127.0.0.1:6379 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.
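For this reason, in production it is advisable to add new nodes with redis-trib.rb, because it checks the state of the new node. If you use the cluster meet command by hand to add a node that already belongs to another cluster, that node's cluster is merged into the current one, which leads to lost and corrupted data. The consequences are severe, so take great care in production.
existing_host:existing_port: the address of any node in the cluster.
To add a master, just give the address of the new node followed by the address of an existing node.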
redis-trib.rb add-node 127.0.0.1:6379 127.0.0.1:6384
To add a replica, the syntax is as follows:
redis-trib.rb add-node --slave --master-id f413fb7e6460308b17cdb71442798e1341b56cbc 127.0.0.1:6379 127.0.0.1:6384
Note: --slave and --master-id must come first. With the same arguments written in the order below, an error is reported:
# redis-trib.rb add-node 127.0.0.1:6379 127.0.0.1:6384 --slave --master-id f413fb7e6460308b17cdb71442798e1341b56cbc
[ERR] Wrong number of arguments for specified sub command
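When adding a replica, --master-id may be omitted. A master is then picked automatically, at random among the masters with the fewest replicas.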
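If the command is generated from a script, the ordering rule above is easy to get wrong. The following Python sketch uses a hypothetical helper, add_node_argv, that always places the options before the two addresses. It only assembles the argument list and does not run anything.

```python
def add_node_argv(new_node, existing_node, slave=False, master_id=None):
    """Build the argument list for redis-trib.rb add-node, options first."""
    if master_id is not None and not slave:
        raise ValueError("--master-id only makes sense together with --slave")
    argv = ["redis-trib.rb", "add-node"]
    if slave:
        argv.append("--slave")
    if master_id is not None:
        argv += ["--master-id", master_id]
    # The positional arguments go last: the new node, then an existing node.
    argv += [new_node, existing_node]
    return argv


if __name__ == "__main__":
    argv = add_node_argv(
        "127.0.0.1:6379",
        "127.0.0.1:6384",
        slave=True,
        master_id="f413fb7e6460308b17cdb71442798e1341b56cbc",
    )
    print(" ".join(argv))
```

The resulting list could be passed to subprocess.run as is, which also avoids shell quoting issues.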
Setting the node timeout
redis-trib.rb set-timeout host:port milliseconds
This simply changes the cluster-node-timeout parameter on every node of the cluster in one go.
# redis-trib.rb set-timeout 127.0.0.1:6379 20000
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Reconfiguring node timeout in every cluster node...
*** New timeout set for 127.0.0.1:6379
*** New timeout set for 127.0.0.1:6383
*** New timeout set for 127.0.0.1:6381
*** New timeout set for 127.0.0.1:6382
*** New timeout set for 127.0.0.1:6384
*** New timeout set for 127.0.0.1:6380
>>> New node timeout set. 6 OK, 0 ERR.
Running a command on every node in the cluster
redis-trib.rb call host:port command arg arg .. arg
For example:
[root@slowtech conf]# redis-trib.rb call 127.0.0.1:6379 set hello world
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Calling SET hello world
127.0.0.1:6379: MOVED 866 127.0.0.1:6381
127.0.0.1:6383: MOVED 866 127.0.0.1:6381
127.0.0.1:6381: OK
127.0.0.1:6382: MOVED 866 127.0.0.1:6381
127.0.0.1:6384: MOVED 866 127.0.0.1:6381
127.0.0.1:6380: MOVED 866 127.0.0.1:6381
[root@slowtech conf]# redis-trib.rb call 127.0.0.1:6379 get hello
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Calling GET hello
127.0.0.1:6379: MOVED 866 127.0.0.1:6381
127.0.0.1:6383: MOVED 866 127.0.0.1:6381
127.0.0.1:6381: world
127.0.0.1:6382: MOVED 866 127.0.0.1:6381
127.0.0.1:6384: MOVED 866 127.0.0.1:6381
127.0.0.1:6380: MOVED 866 127.0.0.1:6381
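Every node except the owner of slot 866 answers with MOVED because a key belongs to exactly one slot. The Python sketch below shows where the number 866 comes from, following the key-to-slot rule of the Redis Cluster specification (CRC16 in its XMODEM variant, modulo 16384, with hash tag support). The function names are my own.

```python
CLUSTER_HASH_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to its hash slot, hashing only the {tag} part if one is present."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:  # a closing brace exists and the tag is not empty
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % CLUSTER_HASH_SLOTS


if __name__ == "__main__":
    print(key_slot("hello"))  # 866, the slot named in the MOVED replies
```

On a live cluster the same value can be obtained with the CLUSTER KEYSLOT command.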
Importing data from an external Redis instance into the cluster
redis-trib.rb import --from 127.0.0.1:6378 127.0.0.1:6379
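Internally, the import proceeds as follows:
1> Load the cluster information with the load_cluster_info_from_node method and check that the cluster is healthy with the check_cluster method.
2> Connect to the external Redis node. If that node has cluster_enabled turned on, an error is reported ([ERR] The source node should not be a cluster node.)
3> Iterate over the external node with the scan command, fetching 1000 entries at a time.
4> Loop over these keys and compute the slot each key maps to.
5> Run the migrate command, with the external node as the source and the cluster node that owns the slot as the target. If --copy is set, the copy option is passed along, which keeps the key on the source node. If --replace is set, the replace option is passed along, so that a key of the same name on the target node has its value overwritten. Both options can be given together.
6> Keep running scan until every key has been visited.
7> The migration is complete.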
[root@slowtech conf]# redis-trib.rb import --from 127.0.0.1:6378 --replace 127.0.0.1:6379
>>> Importing data from 127.0.0.1:6378 to cluster
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-3.3.0/lib/redis/client.rb:459: warning: constant ::Fixnum is deprecated
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: d826c5fd98efa8a17a880e9a90a25f06c88e6ae9 127.0.0.1:6379
   slots: (0 slots) slave
   replicates a8b3d0f9b12d63dab3b7337d602245d96dd55844
S: 55c05d5b0dfea0d52f88548717ddf24975268de6 127.0.0.1:6383
   slots: (0 slots) slave
   replicates a8b3d0f9b12d63dab3b7337d602245d96dd55844
M: f413fb7e6460308b17cdb71442798e1341b56cbc 127.0.0.1:6381
   slots:50-16383 (16334 slots) master
   2 additional replica(s)
S: beba753c5a63607fa66d9ec7427ed9a511ea136e 127.0.0.1:6382
   slots: (0 slots) slave
   replicates f413fb7e6460308b17cdb71442798e1341b56cbc
S: 83797d518e56c235272402611477f576973e9d34 127.0.0.1:6384
   slots: (0 slots) slave
   replicates f413fb7e6460308b17cdb71442798e1341b56cbc
M: a8b3d0f9b12d63dab3b7337d602245d96dd55844 127.0.0.1:6380
   slots:0-49 (50 slots) master
   2 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Connecting to the source Redis instance
*** Importing 1 keys from DB 0
Migrating key5 to 127.0.0.1:6381: OK
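The routing part of the import (steps 3> to 5>) can be sketched offline in Python: walk the keys in batches, compute each key's slot, and look up the master that owns it. No Redis connection is involved. The helper names and the slot_map structure are assumptions for illustration, and binascii.crc_hqx from the standard library stands in for the CRC16 variant that Redis Cluster uses.

```python
import binascii

CLUSTER_HASH_SLOTS = 16384


def key_slot(key: bytes) -> int:
    """Slot of a key; only the {tag} part is hashed when a non-empty tag exists."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % CLUSTER_HASH_SLOTS


def owner_of(slot, slot_map):
    """slot_map maps (first, last) slot ranges to the owning master's address."""
    for (first, last), node in slot_map.items():
        if first <= slot <= last:
            return node
    raise LookupError("slot %d is not covered by any master" % slot)


def plan_import(keys, slot_map, batch_size=1000):
    """Group keys by target master, walking them in scan-sized batches."""
    plan = {}
    for offset in range(0, len(keys), batch_size):
        for key in keys[offset:offset + batch_size]:
            node = owner_of(key_slot(key), slot_map)
            plan.setdefault(node, []).append(key)
    return plan


if __name__ == "__main__":
    # Slot layout of the two masters in the output above.
    slot_map = {(0, 49): "127.0.0.1:6380", (50, 16383): "127.0.0.1:6381"}
    print(plan_import([b"key5", b"hello"], slot_map))
```

In the real tool each key would then be moved with a migrate command sent to the external node, targeting the master found here.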