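How Redis Cluster Scaling Works

Posted by LB477 on 2021-05-14

Each Redis node maintains the slots it is responsible for and the data stored in them. Scaling works by moving slots, together with their data, between nodes.

Environment: a Redis cluster set up on CentOS 7

I. Scaling Out the Cluster

1. Manual scale-out

(1) Prepare node 9007 and add it to the cluster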

192.168.11.40:9001> cluster meet 192.168.11.40 9007

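[Note] If cluster meet is used on a node that already belongs to another cluster, the two clusters are merged and the data becomes inconsistent. Prefer redis-cli's add-node:

# Reports an error if the node already belongs to another cluster or contains data
add-node    new_host:new_port existing_host:existing_port
            --cluster-slave  # add the node directly as a replica
            --cluster-master-id <arg>  # ID of the master this replica belongs to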

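(2) Migrate slots and data

  • The cluster keeps serving reads and writes while slots are being migrated
  • First decide which slots on the existing nodes should move to the new node. Each node should own a similar number of slots so that data is spread evenly
  • The slot is the basic unit by which Redis Cluster manages data. Data is migrated slot by slot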
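
To make the slot concept concrete, here is a minimal sketch of how a key maps to one of the 16384 slots: CRC16 of the key (or of its hash tag, if it has one) modulo 16384. The function name key_slot is hypothetical, and the sketch assumes that Python's binascii.crc_hqx with an initial value of 0 computes the same CRC16 variant that Redis Cluster uses.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in a Redis cluster


def key_slot(key: str) -> int:
    """Return the hash slot for a key (illustrative helper, not a Redis API)."""
    data = key.encode("utf-8")
    # Hash tag rule: if the key contains "{...}" with at least one character
    # inside, only the part between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # Assumption: crc_hqx(data, 0) matches the CRC16 used by Redis Cluster.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


if __name__ == "__main__":
    for k in ("foo", "{user1000}.following", "{user1000}.followers"):
        print(k, key_slot(k))
```

On a live cluster, the result can be cross-checked with cluster keyslot {key}.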

Slot migration flow:


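  • The target node prepares to import the slot: run cluster setslot {slot} importing {sourceNodeId} on the target node
  • The source node prepares to migrate the slot out: run cluster setslot {slot} migrating {targetNodeId} on the source node
  • Fetch count keys that belong to the slot: run cluster getkeysinslot {slot} {count} on the source node
  • Migrate the keys: run migrate {targetIp} {targetPort} "" 0 {timeout} keys {keys...} on the source node, which moves the keys to the target node in a batch through the pipeline mechanism. Batch migration is supported only from Redis 3.0.6 onward
  • Repeat the previous two steps until all key-value data in the slot has been migrated to the target node
  • Notify all masters in the cluster that the slot is now assigned to the target node: run cluster setslot {slot} node {targetNodeId} on every master in the cluster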

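Internal pseudocode:

def move_slot(source, target, slot):
    # The target node prepares to import the slot
    target.cluster("setslot", slot, "importing", source.nodeId)
    # The source node prepares to migrate the slot out
    source.cluster("setslot", slot, "migrating", target.nodeId)
    while True:
        # Fetch a batch of keys in the slot from the source node
        keys = source.cluster("getkeysinslot", slot, pipeline_size)
        if len(keys) == 0:
            # Leave the loop once the slot has no keys left
            break
        # Migrate the batch of keys to the target node
        source.call("migrate", target.host, target.port, "", 0, timeout, "keys", *keys)
    # Once every key has moved, notify all masters that the slot belongs to the target node
    for node in nodes:
        if node.flag == "slave":
            continue
        node.cluster("setslot", slot, "node", target.nodeId)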

(3) Migrate slot 4096 from 9001 to 9007

Prepare the data

192.168.11.40:9001> set key:test:5028 value:5028
192.168.11.40:9001> set key:test:68253 value:68253

Prepare the target node

192.168.11.40:9007> cluster nodes
8ccdb0963411ebd05ce21952bdd4b7597825afdc 192.168.11.40:9001@19001 master - 0 1620928869000 2 connected 0-5461
bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d 192.168.11.40:9007@19007 myself,master - 0 1620928868000 0 connected
...
# 9007 prepares to import the data of slot 4096
192.168.11.40:9007> cluster setslot 4096 importing 8ccdb0963411ebd05ce21952bdd4b7597825afdc
OK
# Slot 4096 is now in the importing state
192.168.11.40:9007> cluster nodes
bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d 192.168.11.40:9007@19007 myself,master - 0 1620928959000 0 connected [4096-<-8ccdb0963411ebd05ce21952bdd4b7597825afdc]
...

Prepare the source node

# 9001 prepares to migrate out the data of slot 4096
192.168.11.40:9001> cluster setslot 4096 migrating bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d
OK
# Slot 4096 is now in the migrating state
192.168.11.40:9001> cluster nodes
8ccdb0963411ebd05ce21952bdd4b7597825afdc 192.168.11.40:9001@19001 myself,master - 0 1620929179000 2 connected 0-5461 [4096->-bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d]
...
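
The bracketed markers in the cluster nodes output above encode the migration state: [slot-<-nodeId] means the slot is being imported from that node, and [slot->-nodeId] means it is being migrated to that node. As a rough sketch, a line can be parsed like this; parse_cluster_node_line is a hypothetical helper, and it assumes the nine-column layout shown in the output above, with slot information starting at the ninth field.

```python
import re

# Matches "[4096-<-<nodeId>]" (importing) and "[4096->-<nodeId>]" (migrating).
_STATE = re.compile(r"\[(\d+)-([<>])-([0-9a-f]+)\]")


def parse_cluster_node_line(line: str) -> dict:
    """Parse one line of `cluster nodes` output (illustrative helper)."""
    fields = line.split()
    node = {
        "id": fields[0],
        "addr": fields[1],
        "flags": fields[2].split(","),
        "slots": [],       # list of (first, last) ranges owned by the node
        "importing": {},   # slot -> ID of the node it is imported from
        "migrating": {},   # slot -> ID of the node it is migrated to
    }
    # Assumption: fields 0-7 are fixed columns; slot info starts at field 8.
    for token in fields[8:]:
        m = _STATE.fullmatch(token)
        if m:
            slot, direction, peer = int(m.group(1)), m.group(2), m.group(3)
            node["importing" if direction == "<" else "migrating"][slot] = peer
        elif "-" in token:
            first, last = token.split("-")
            node["slots"].append((int(first), int(last)))
        else:
            node["slots"].append((int(token), int(token)))
    return node


if __name__ == "__main__":
    line = ("8ccdb0963411ebd05ce21952bdd4b7597825afdc 192.168.11.40:9001@19001 "
            "myself,master - 0 1620929179000 2 connected 0-5461 "
            "[4096->-bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d]")
    print(parse_cluster_node_line(line))
```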

Migrate the data out

# Fetch up to 100 keys that belong to slot 4096
192.168.11.40:9001> cluster getkeysinslot 4096 100
1) "key:test:5028"
2) "key:test:68253"
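# Check the data
192.168.11.40:9001> mget key:test:5028 key:test:68253
1) "value:5028"
2) "value:68253"
# Migrate these 2 keys: the migrate command guarantees that each key is migrated atomically
192.168.11.40:9001> migrate 192.168.11.40 9007 "" 0 5000 keys key:test:5028 key:test:68253
OK
# Querying again returns an ASK error, which directs the client to the node that now holds the data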
192.168.11.40:9001> mget key:test:5028 key:test:68253
(error) ASK 4096 192.168.11.40:9007
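
A client has to interpret this error itself. On ASK, it sends asking to the indicated node and retries only that one command there, without updating its slot map, because the slot is still mid-migration. A MOVED error has the same shape but signals a permanent reassignment. The parsing side can be sketched as follows; parse_redirect is a hypothetical helper, not part of any client library.

```python
from typing import Optional, Tuple


def parse_redirect(error: str) -> Optional[Tuple[str, int, str, int]]:
    """Parse 'ASK <slot> <host>:<port>' or 'MOVED <slot> <host>:<port>'.

    Returns (kind, slot, host, port), or None if the message is not a
    redirection. Illustrative helper only.
    """
    parts = error.split()
    if len(parts) != 3 or parts[0] not in ("ASK", "MOVED"):
        return None
    # rpartition keeps everything before the last ":" as the host
    host, _, port = parts[2].rpartition(":")
    return parts[0], int(parts[1]), host, int(port)


if __name__ == "__main__":
    print(parse_redirect("ASK 4096 192.168.11.40:9007"))
```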

Notify all masters that slot 4096 is assigned to 9007

192.168.11.40:9001> cluster setslot 4096 node bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d
192.168.11.40:9002> cluster setslot 4096 node bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d
192.168.11.40:9003> cluster setslot 4096 node bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d
192.168.11.40:9007> cluster setslot 4096 node bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d

Check the final result

192.168.11.40:9007> cluster nodes
8ccdb0963411ebd05ce21952bdd4b7597825afdc 192.168.11.40:9001@19001 master - 0 1620931743303 7 connected 0-4095 4097-5461
bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d 192.168.11.40:9007@19007 myself,master - 0 1620931741000 8 connected 4096
...

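2. Scaling out with redis-cli

redis-cli provides a slot resharding feature.

reshard command options:

reshard    host:port  # address of any node in the cluster
           --cluster-from <arg>  # source node IDs, comma-separated
           --cluster-to <arg>  # target node ID; exactly one
           --cluster-slots <arg>  # number of slots to move
           --cluster-yes  # confirm the reshard
           --cluster-timeout <arg>  # timeout of each migrate operation, default 60000 ms
           --cluster-pipeline <arg>  # number of keys migrated per batch, default 10
           --cluster-replace

Move 4096 slots in total from 9001, 9002, and 9003 to 9007: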

$ /usr/local/redis/bin/redis-cli --cluster reshard 192.168.11.40:9001
M: 8ccdb0963411ebd05ce21952bdd4b7597825afdc 192.168.11.40:9001
   slots:[0-4095],[4097-5461] (5461 slots) master
   1 additional replica(s)
M: bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d 192.168.11.40:9007
   slots:[4096] (1 slots) master
...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 4096
What is the receiving node ID? bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: 8ccdb0963411ebd05ce21952bdd4b7597825afdc
Source node #2: 5786e3237c7fa413ed22465d15be721f95e72cfa
Source node #3: 85ceb9826e8aa003169c46fb4ba115c72002d4f9
Source node #4: done
    Moving slot 0 from 8ccdb0963411ebd05ce21952bdd4b7597825afdc
    ...
    Moving slot 12287 from 85ceb9826e8aa003169c46fb4ba115c72002d4f9
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 0 from 192.168.11.40:9001 to 192.168.11.40:9007:
...
Moving slot 12287 from 192.168.11.40:9003 to 192.168.11.40:9007:

Check the final result

192.168.11.40:9007> cluster nodes
8ccdb0963411ebd05ce21952bdd4b7597825afdc 192.168.11.40:9001@19001 master - 0 1620933907753 7 connected 1366-4095 4097-5461
5786e3237c7fa413ed22465d15be721f95e72cfa 192.168.11.40:9002@19002 master - 0 1620933906733 1 connected 6827-10922
85ceb9826e8aa003169c46fb4ba115c72002d4f9 192.168.11.40:9003@19003 master - 0 1620933905000 3 connected 12288-16383
bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d 192.168.11.40:9007@19007 myself,master - 0 1620933900000 8 connected 0-1365 4096 5462-6826 10923-12287
...
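
In the result above, 9007 received 1366 slots from 9001 (0-1365) and 1365 slots each from 9002 and 9003. That is consistent with a proportional split in which the first source rounds its share up and the rest round down. The sketch below reproduces the arithmetic; split_reshard is a hypothetical name, and the rounding rule is an assumption inferred from this output rather than a documented guarantee.

```python
import math


def split_reshard(source_slots: dict, numslots: int) -> dict:
    """Split `numslots` across sources in proportion to the slots they own.

    Assumption: sources are ordered by slot count (largest first); the first
    one rounds its share up and the others round down.
    """
    total = sum(source_slots.values())
    ordered = sorted(source_slots.items(), key=lambda kv: kv[1], reverse=True)
    plan = {}
    for i, (node, owned) in enumerate(ordered):
        share = numslots / total * owned
        plan[node] = math.ceil(share) if i == 0 else math.floor(share)
    return plan


if __name__ == "__main__":
    # Before the reshard each source owned 5461 slots (9001 had already
    # given slot 4096 to 9007).
    print(split_reshard({"9001": 5461, "9002": 5461, "9003": 5461}, 4096))
```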

Check how evenly the slots are balanced across nodes

$ /usr/local/redis/bin/redis-cli --cluster rebalance 192.168.11.40:9001
...
[OK] All 16384 slots covered.
*** No rebalancing needed! All nodes are within the 2.00% threshold.

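After the migration, the number of slots owned by each master differs by less than 2%, so data is spread fairly evenly across the cluster and no adjustment is needed.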
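
The idea behind the threshold can be illustrated with a simplified check: compare each master's slot count with the even share and flag the cluster as unbalanced if any master deviates by more than the threshold. This is a sketch under that assumption only; within_threshold is a hypothetical helper, and the exact formula redis-cli applies may differ.

```python
def within_threshold(slot_counts: dict, threshold_pct: float = 2.0) -> bool:
    """Return True if every master is within threshold_pct of the even share.

    Simplified illustration; not necessarily redis-cli's exact formula.
    """
    expected = sum(slot_counts.values()) / len(slot_counts)
    return all(
        abs(count - expected) / expected * 100 <= threshold_pct
        for count in slot_counts.values()
    )


if __name__ == "__main__":
    after = {"9001": 4096, "9002": 4096, "9003": 4096, "9007": 4096}
    before = {"9001": 5461, "9002": 5461, "9003": 5461, "9007": 1}
    print(within_threshold(after), within_threshold(before))
```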

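II. Scaling In the Cluster

Procedure for taking a node offline

1. Migrate the slots

Run reshard three times, once per target node (--cluster-to accepts only one node ID), so that the departing node's data is spread evenly across the other three nodes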
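
As a sketch of what the three runs could look like, the helper below divides the departing node's slots as evenly as possible and builds one non-interactive reshard command per target, using only the options listed in the reshard section earlier. build_reshard_commands is a hypothetical name, the 4096-slot count assumes the layout reached after scaling out, and the commands are only constructed here, not executed.

```python
def build_reshard_commands(entry, leaving_id, leaving_slots, target_ids,
                           redis_cli="/usr/local/redis/bin/redis-cli"):
    """Build one `redis-cli --cluster reshard` argv per target node.

    Illustrative helper: the slots are split as evenly as possible, and the
    first targets absorb the remainder.
    """
    base, extra = divmod(leaving_slots, len(target_ids))
    commands = []
    for i, target in enumerate(target_ids):
        count = base + (1 if i < extra else 0)
        commands.append([
            redis_cli, "--cluster", "reshard", entry,
            "--cluster-from", leaving_id,
            "--cluster-to", target,
            "--cluster-slots", str(count),
            "--cluster-yes",
        ])
    return commands


if __name__ == "__main__":
    cmds = build_reshard_commands(
        "192.168.11.40:9001",
        "bb1bb0f5f9e0ee67846ba8ec94a38da700e2e80d",  # 9007, the node leaving
        4096,
        ["8ccdb0963411ebd05ce21952bdd4b7597825afdc",   # 9001
         "5786e3237c7fa413ed22465d15be721f95e72cfa",   # 9002
         "85ceb9826e8aa003169c46fb4ba115c72002d4f9"],  # 9003
    )
    for cmd in cmds:
        print(" ".join(cmd))
```

Each list can be passed to subprocess.run once the plan has been reviewed.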

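2. Forget the node

Run the following on every node within 60 s (not recommended):

# After this runs, the node is put on a ban list for 60 s, during which no Gossip messages are sent to it
cluster forget {nodeId}

Prefer redis-cli's del-node to forget the node: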

/usr/local/redis/bin/redis-cli --cluster del-node {host:port} {nodeId}

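Internal pseudocode

def delnode_cluster_cmd(downNode):
    # A node that still owns slots must not be taken offline
    if len(downNode.slots) != 0:
        exit(1)
    # Send cluster forget to the other nodes in the cluster
    for n in nodes:
        if n.id == downNode.id:
            # A node cannot run forget on itself
            continue
        # If the departing node has replicas, point them at another master
        if n.replicate and n.replicate.nodeId == downNode.id:
            # Choose the master with the fewest replicas
            master = get_master_with_least_replicas()
            n.cluster("replicate", master.nodeId)
        # Send the forget command
        n.cluster("forget", downNode.id)
    # Shut the node down
    downNode.shutdown()

If both a master and its replica are to be taken offline, take the replica offline first. Otherwise the replica is repointed at another master and performs a full resynchronization for nothing.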
