Redis Basics (Study Notes 18: Master-Slave Cluster)

Posted by 东山絮柳仔 on 2024-07-15

Original URL: https://www.cnblogs.com/xuliuzai/p/18300943

I. Master-slave configuration

1.1 masterauth

# If the master is password protected (using the "requirepass" configuration
# directive below) it is possible to tell the replica to authenticate before
# starting the replication synchronization process, otherwise the master will
# refuse the replica request.
#
# masterauth <master-password>
#

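Because we are building a master-slave cluster in which any host may become the master, every host must use the same password if requirepass authentication is enabled. Each configuration file then sets two directives to that same value: requirepass and masterauth. requirepass specifies the password for accessing the current host, while masterauth specifies the password that the current host, when acting as a slave, submits to the master so that the master can verify the requester is legitimate.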
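The rule above can be checked mechanically. The following is a minimal sketch, assuming each node's configuration is available as plain text; the helper names parse_directives and passwords_consistent, the sample password, and the three sample configs are all hypothetical and exist only for illustration.

```python
def parse_directives(conf_text):
    """Return {directive: value} for redis.conf-style text, skipping comments."""
    directives = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        directives[name.lower()] = value.strip()
    return directives


def passwords_consistent(conf_texts):
    """True if every node sets requirepass and masterauth to one shared value."""
    seen = set()
    for text in conf_texts:
        d = parse_directives(text)
        seen.add(d.get("requirepass"))
        seen.add(d.get("masterauth"))
    # Exactly one distinct value, and no node left a directive unset.
    return len(seen) == 1 and None not in seen


# Hypothetical sample configs: node_c forgot to uncomment masterauth.
node_a = "port 6379\nrequirepass s3cret\nmasterauth s3cret\n"
node_b = "port 6380\nrequirepass s3cret\nmasterauth s3cret\n"
node_c = "port 6381\nrequirepass s3cret\n# masterauth <master-password>\n"

print(passwords_consistent([node_a, node_b]))  # True
print(passwords_consistent([node_a, node_c]))  # False
```

A check like this could run before deployment to catch a node that would be refused by the master after a role change.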

1.2 repl-disable-tcp-nodelay

# Disable TCP_NODELAY on the replica socket after SYNC?
#
# If you select "yes" Redis will use a smaller number of TCP packets and
# less bandwidth to send data to replicas. But this can add a delay for
# the data to appear on the replica side, up to 40 milliseconds with
# Linux kernels using a default configuration.
#
# If you select "no" the delay for data to appear on the replica side will
# be reduced but more bandwidth will be used for replication.
#
# By default we optimize for low latency, but in very high traffic conditions
# or when the master and replicas are many hops away, turning this to "yes" may
# be a good idea.
repl-disable-tcp-nodelay no

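This directive controls whether the TCP_NODELAY option is disabled on the replication socket. With yes, TCP_NODELAY is disabled: data reaches the slave with some delay, but fewer TCP packets are sent and less bandwidth is used. With no, the delay shrinks, but more TCP packets are sent and more bandwidth is used.

Background: tcp-nodelay

To make good use of network bandwidth, TCP prefers to send data in blocks that are as large as possible. It uses Nagle's algorithm for this.

Under Nagle's algorithm, small pieces of outgoing data are not sent immediately. They are held back until enough has accumulated to fill a segment or until previously sent data has been acknowledged, and are then sent together. This raises the proportion of payload on the wire, reduces per-packet header overhead, saves bandwidth, and eases pressure on the network.

tcp-nodelay is the switch for Nagle's algorithm in TCP: enabling TCP_NODELAY turns Nagle's algorithm off, and disabling it turns Nagle's algorithm back on.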
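To make the switch concrete, here is a small sketch using Python's standard socket module rather than Redis itself. The helpers set_nagle and nagle_enabled are hypothetical names; the mapping to repl-disable-tcp-nodelay in the comments is an analogy, not Redis code.

```python
import socket


def set_nagle(sock, enabled):
    """Turn Nagle's algorithm on or off for a TCP socket.

    TCP_NODELAY=1 disables Nagle; TCP_NODELAY=0 leaves it enabled.
    """
    flag = 0 if enabled else 1
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, flag)


def nagle_enabled(sock):
    """Read the option back from the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

set_nagle(sock, enabled=False)   # analogous to repl-disable-tcp-nodelay no
low_latency = nagle_enabled(sock)

set_nagle(sock, enabled=True)    # analogous to repl-disable-tcp-nodelay yes
batched = nagle_enabled(sock)

sock.close()
print(low_latency, batched)
```

The socket is never connected here; the option can be set and read back on an unconnected socket, which is enough to show the direction of the flag.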

1.3 pidfile

# If a pid file is specified, Redis writes it where specified at startup
# and removes it at exit.
#
# When the server runs non daemonized, no pid file is created if none is
# specified in the configuration. When the server is daemonized, the pid file
# is used even if not specified, defaulting to "/var/run/redis.pid".
#
# Creating a pid file is best effort: if Redis is not able to create it
# nothing bad happens, the server will start and run normally.
#
# Note that on modern Linux systems "/run/redis.pid" is more conforming
# and should be used instead.
pidfile /var/run/redis_6379.pid

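With a multi-instance installation (several Redis instances on one machine), remember to change this parameter for each instance.

A few other parameters also need per-instance values:

port, dbfilename, appendfilename, logfile, replica-priority

A brief note on replica-priority: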

# The replica priority is an integer number published by Redis in the INFO
# output. It is used by Redis Sentinel in order to select a replica to promote
# into a master if the master is no longer working correctly.
#
# A replica with a low priority number is considered better for promotion, so  ## lower number = higher priority
# for instance if there are three replicas with priority 10, 100, 25 Sentinel
# will pick the one with priority 10, that is the lowest.
#
# However a special priority of 0 marks the replica as not able to perform the ## 0 is a special priority value
# role of master, so a replica with priority of 0 will never be selected by    ## priority 0 can never become master
# Redis Sentinel for promotion.
#
# By default the priority is 100.
replica-priority 100
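
The priority rule in the comment above can be sketched as a small selection function. This models only the priority rule quoted from redis.conf; real Sentinel applies further tie-breakers that the sketch ignores. The function name pick_by_priority and the replica names are hypothetical.

```python
def pick_by_priority(priorities):
    """Return the replica with the lowest non-zero priority, or None.

    priorities maps a replica name to its replica-priority value.
    A priority of 0 excludes the replica from promotion.
    """
    eligible = {name: p for name, p in priorities.items() if p > 0}
    if not eligible:
        return None
    return min(eligible, key=eligible.get)


# The example from the config comment: priorities 10, 100, 25.
print(pick_by_priority({"r1": 10, "r2": 100, "r3": 25}))  # r1
# With r1 set to 0, it is skipped and the next lowest wins.
print(pick_by_priority({"r1": 0, "r2": 100, "r3": 25}))   # r3
```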

1.4 Per-instance configuration with the include directive

################################## INCLUDES ###################################

# Include one or more other config files here.  This is useful if you
# have a standard template that goes to all Redis servers but also need
# to customize a few per-server settings.  Include files can include
# other files, so use this wisely.
#
# Note that option "include" won't be rewritten by command "CONFIG REWRITE"
# from admin or Redis Sentinel. Since Redis always uses the last processed
# line as value of a configuration directive, you'd better put includes
# at the beginning of this file to avoid overwriting config change at runtime.
#
# If instead you are interested in using includes to override configuration
# options, it is better to use include as the last line.
#
# Included paths may contain wildcards. All files matching the wildcards will
# be included in alphabetical order.
# Note that if an include path contains a wildcards but no files match it when
# the server is started, the include statement will be ignored and no error will
# be emitted.  It is safe, therefore, to include wildcard files from empty
# directories.
#
# include /path/to/local.conf
# include /path/to/other.conf
# include /path/to/fragments/*.conf
#

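For example, suppose we want a separate configuration file that changes only a few parameters. It can include the base configuration file and then redefine only the parameters that differ. For the overrides to take effect, they must come after the include line, because Redis uses the last processed value of a directive.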
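The "last processed line wins" behaviour can be illustrated with a toy resolver. This is a simplified sketch, not Redis's parser: it works on an in-memory mapping of hypothetical file names to text, handles only single-valued directives, and does not support wildcards.

```python
def load_config(path, files):
    """Resolve directives starting from files[path].

    'include' lines are expanded in place, and a later line
    overrides an earlier one with the same directive name.
    """
    result = {}
    for line in files[path].splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        if name == "include":
            result.update(load_config(value.strip(), files))
        else:
            result[name] = value.strip()
    return result


# Hypothetical files: a shared base plus a per-instance override.
files = {
    "base.conf": (
        "port 6379\n"
        "pidfile /var/run/redis_6379.pid\n"
        "replica-priority 100\n"
    ),
    "redis6380.conf": (
        "include base.conf\n"
        "port 6380\n"
        "pidfile /var/run/redis_6380.pid\n"
    ),
}

cfg = load_config("redis6380.conf", files)
print(cfg["port"], cfg["pidfile"], cfg["replica-priority"])
```

Because the include line comes first, port and pidfile take the per-instance values while replica-priority is inherited from the base file.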

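II. Setting up the master-slave relationship

2.1 Inspecting

First inspect the current master-slave relationship with:

> info replication

When run on the master:

role is the role of the current node;

connected_slaves is the number of connected slaves;

if there are slaves, each one is listed as slave0, slave1, and so on, with its details (ip, port, state, offset, lag).

When run on a slave, the output is different:

role again gives the node's role in the cluster; other fields include master_host, master_port, master_link_status, master_last_io_seconds_ago, master_sync_in_progress, slave_read_only, and so on.

Note that by default a slave is read-only and rejects write commands with this error:

(error) READONLY You can't write against a read only replica.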
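
For scripting, the info replication output is easy to parse because each line is key:value, and each slaveN value is a comma-separated list of k=v pairs. The sketch below is one way to do that; the function name parse_info_replication is hypothetical and the sample text contains made-up values.

```python
def parse_info_replication(text):
    """Parse 'key:value' lines; slaveN values become dicts of their fields."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        # slave0, slave1, ... carry several k=v fields on one line.
        if key.startswith("slave") and key[5:].isdigit():
            info[key] = dict(f.split("=", 1) for f in value.split(","))
        else:
            info[key] = value
    return info


# Illustrative output as seen on a master (values are invented).
sample = """# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=1428,lag=0
master_repl_offset:1428
"""

info = parse_info_replication(sample)
print(info["role"], info["slave0"]["port"], info["slave0"]["state"])
```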

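2.2 Setup command

Run the following on the slave:

> slaveof <host> <port>

Here <host> is the master's IP address and <port> is the master's port. Since Redis 5, replicaof is the preferred name for the same command.

If the relationship is set only with this command, it is lost when the slave restarts. To survive a restart, it must also be written into the slave's configuration file.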

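2.3 Tiered management

When a Redis master-slave cluster has many slaves, their data synchronization puts considerable load on the master. In that case the slaves can be organized into tiers.

Setting this up is simple: point a lower-tier slave's slaveof at a slave in the tier above it. The upper-tier slave is still a slave; it simply replicates from a node one tier higher.

Changing a master-slave relationship does not require tearing down the existing one and rebuilding it; just run slaveof host port with the new target.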

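2.4 Disaster recovery: cold handling

What if the master in a master/slave Redis cluster goes down? There are two ways to handle it. One is cold handling, where a slave is promoted to master by manually changing roles. The other is hot handling, where Sentinel mode provides high availability (HA) for the cluster.

Whether or not the master is down, a slave can promote itself to master with:

> slaveof no one

If it already had lower-tier slaves of its own, it becomes their true master, and the original master loses it as a slave.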
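
The tiering and promotion rules in sections 2.3 and 2.4 can be modelled with a toy topology. This is a sketch of the relationships only, with no networking; the Node class and the node names A, B, C are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.master = None  # None means this node is a master

    def slaveof(self, master):
        """Model 'slaveof host port': re-point directly, no teardown first."""
        self.master = master

    def slaveof_no_one(self):
        """Model 'slaveof no one': promote this node to master."""
        self.master = None

    def role(self):
        return "master" if self.master is None else "slave"


def slaves_of(node, nodes):
    """Names of the nodes that replicate directly from 'node'."""
    return sorted(n.name for n in nodes if n.master is node)


a, b, c = Node("A"), Node("B"), Node("C")
nodes = [a, b, c]
b.slaveof(a)  # B replicates from A
c.slaveof(b)  # C replicates from B (second tier)
before = (b.role(), slaves_of(a, nodes), slaves_of(b, nodes))

b.slaveof_no_one()  # cold handling: promote B
after = (b.role(), slaves_of(a, nodes), slaves_of(b, nodes))

print(before)  # ('slave', ['B'], ['C'])
print(after)   # ('master', [], ['C'])
```

After the promotion, C is untouched and now hangs off a real master, while A has lost B as a slave.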

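III. How master-slave replication works

3.1 The replication process

From the moment a Redis node (the slave) receives a command such as slaveof 127.0.0.1 6379 until it can continuously replicate data from the master, it goes through roughly the following steps: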

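(1) Save the master address

After receiving slaveof <master_ip> <master_port>, the slave immediately saves the new master's address.

(2) Establish the connection

The slave runs a periodic task that tries to open a socket connection to that master.

(3) The slave sends ping

Once connected, the slave sends a ping command as the first exchange. If it receives no reply from the master, the slave closes the connection, and the next run of the periodic task tries again.

(4) Authenticate the slave

On receiving the slave's ping, the master does not reply immediately; it first authenticates the slave. If authentication fails, it sends a message refusing the connection. If authentication succeeds, it sends the slave a connection-success response.

(5) The master persists its data

After receiving the master's response, the slave sends a data synchronization request. On receiving it, the master forks a child process that immediately persists the data asynchronously.

(6) Send the data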

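Once persistence completes, the master sends the persisted data to the slave asynchronously, so that client requests are not blocked. The slave keeps writing the data it receives into its local persistence file.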

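While the slave is synchronizing, the master's main process keeps accepting client writes. It writes the new data both into the master's memory and into a synchronization buffer. After the data in the master's persistence file has been sent, the master sends the new data in the synchronization buffer to the slave, which writes it to its local persistence file as well. Data synchronization is then complete.

(7) The slave restores data into memory

After synchronization finishes, the slave reads its local persistence file, restores the data into memory, and can then serve clients.

(8) Continuous incremental replication

While serving clients, the master keeps receiving writes and forwards them to the slave incrementally, keeping master and slave data consistent.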

[Figure: overview of the replication flow]

[Figure: the same flow, extended to show writes that arrive while data is being sent]
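
Steps (5) through (8) can be sketched as an in-memory simulation. This is a toy model under stated assumptions: plain dictionaries stand in for the persistence file, there is no network or fork, and the Master and Slave classes with their method names are hypothetical.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.sync_buffer = None  # collects writes while a snapshot is in flight

    def write(self, key, value):
        self.data[key] = value
        if self.sync_buffer is not None:
            self.sync_buffer.append((key, value))

    def begin_full_sync(self):
        """Step (5): take a point-in-time snapshot and start buffering writes."""
        self.sync_buffer = []
        return dict(self.data)

    def end_full_sync(self):
        """Step (6): hand over the writes that arrived during the transfer."""
        pending, self.sync_buffer = self.sync_buffer, None
        return pending


class Slave:
    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)

    def apply(self, writes):
        for key, value in writes:
            self.data[key] = value


master, slave = Master(), Slave()
master.write("a", 1)
snapshot = master.begin_full_sync()
master.write("b", 2)                 # arrives while the snapshot is being sent
slave.load_snapshot(snapshot)        # step (7): restore from the snapshot
slave.apply(master.end_full_sync())  # catch up on the buffered writes
master.write("c", 3)
slave.apply([("c", 3)])              # step (8): incremental replication

print(slave.data == master.data)  # True
```

The write to "b" is absent from the snapshot but reaches the slave through the buffer, which is the case the second figure describes.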

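3.2 Evolution of data synchronization

(1) sync

Before Redis 2.8, after the first exchange succeeded, the slave sent a sync request to the master. The master then sent all of its data to the slave, which saved it to its local persistence file. This is called full replication.

There is a problem, though: a network glitch can interrupt full replication partway through. When the network recovers and the slave reconnects to the master, the slave sends sync again and full replication starts over from the beginning.

Because full replication takes a long time, a network glitch during it is likely. Starting over after each interruption consumes a lot of system resources and bandwidth, and full replication may fail to complete for a long time.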

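(2) psync

From Redis 2.8 onward, replication uses the psync (Partial Sync) strategy. When replication is interrupted by a network glitch, it can resume from the breakpoint after reconnecting instead of starting over, which greatly improves performance.

To implement psync, the system changed in three major ways:

A. Replication offset

Every byte of data to be sent is numbered, starting from 0, one number per byte. This number is the replication offset, and both the master and the slaves taking part in replication maintain it.

It can be viewed in the output of info replication as slave_repl_offset (on a slave) or master_repl_offset (on the master, representing the data already sent out).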

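B. Master replication ID

When the master starts, it generates a 40-character hexadecimal string as its replication ID, which slaves use to identify the master during synchronization. It is shown as master_replid in info replication.

Note: when the master restarts, the generated replication ID changes.

C. Replication backlog buffer

When a master has connected slaves, it creates and maintains a queue called the backlog, 1MB by default; this is the replication backlog buffer. When the master receives a write, the data is written to the master's memory, to the send buffer the master keeps for each slave, and also to the replication backlog buffer. The backlog holds the most recent operations so that they can be replayed when replication resumes from a breakpoint, preventing data loss.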
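
A minimal sketch of the idea behind the backlog follows, assuming a fixed-size window over the tail of the replication stream. It is not how Redis implements the buffer internally; the class ReplBacklog, its methods, and the 8-byte size are hypothetical, and an offset here means the index of the first byte the slave still needs.

```python
class ReplBacklog:
    """Fixed-size window over the most recent bytes of the replication stream."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray()
        self.end_offset = 0  # total number of bytes ever fed in

    def feed(self, data):
        self.buf += data
        self.end_offset += len(data)
        # Drop the oldest bytes once the window is full.
        if len(self.buf) > self.size:
            del self.buf[: len(self.buf) - self.size]

    @property
    def start_offset(self):
        """Offset of the oldest byte still held."""
        return self.end_offset - len(self.buf)

    def read_from(self, offset):
        """Bytes from 'offset' onward, or None if they are no longer held."""
        if offset < self.start_offset or offset > self.end_offset:
            return None
        return bytes(self.buf[offset - self.start_offset:])


backlog = ReplBacklog(size=8)
backlog.feed(b"SET a 1;")  # occupies offsets 0..7
backlog.feed(b"SET b 2;")  # occupies offsets 8..15 and evicts the first 8 bytes

print(backlog.read_from(12))  # b'b 2;'
print(backlog.read_from(3))   # None: too old, a full resync would be needed
```

A larger backlog tolerates longer disconnections before a slave falls out of the window.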

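D. The psync process

psync is a command submitted by the slave, in the form psync <master_replid> <repl_offset>. It says the slave wants to replicate from position repl_offset+1 of the stream identified by master_replid, where repl_offset is the offset of the data the slave has already replicated. This command is what makes resuming from a breakpoint possible.

On the first replication, the slave does not know the master's replication ID and must start from the beginning, so it submits PSYNC ? -1: master_replid is a question mark (?) and repl_offset is -1.

If replication is interrupted and the slave later reconnects to the master, the slave submits psync again, this time with repl_offset set to the offset of the data it had already replicated.

The slave cannot start replicating the moment it submits psync; it must wait for the master's response and act on it. Depending on the slave's request and the master's own state, the master gives one of three responses: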

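FULLRESYNC <master_replid> <repl_offset>: tells the slave the master's replication ID and that full replication can begin. repl_offset here is the master's current replication offset, which is 0 only if the master has not yet propagated any data.
CONTINUE: tells the slave it can resume from the position after the repl_offset it submitted.
ERR: tells the slave that the master is older than Redis 2.8 and does not support psync, so the slave should start a full replication with sync.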
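
The master's choice between FULLRESYNC and CONTINUE can be sketched as a pure function. This is a simplified model, not Redis source: the function handle_psync and its parameters are hypothetical, the backlog is described only by the range of offsets it still holds, req_offset is treated as the index of the first byte the slave still needs, and the offset returned with FULLRESYNC is assumed to be the master's current offset.

```python
def handle_psync(master_replid, backlog_start, backlog_end, req_replid, req_offset):
    """Decide the master's reply to 'psync <req_replid> <req_offset>'.

    backlog_start..backlog_end is the range of offsets still held in the
    replication backlog; backlog_end is also the master's current offset.
    """
    full = f"FULLRESYNC {master_replid} {backlog_end}"
    if req_replid != master_replid:
        return full  # unknown history, including the first-time 'PSYNC ? -1'
    if req_offset < backlog_start or req_offset > backlog_end:
        return full  # the requested data is no longer in the backlog
    return "CONTINUE"


# First connection: the slave knows nothing about the master.
print(handle_psync("abc", 100, 200, "?", -1))     # FULLRESYNC abc 200
# Reconnect after a short glitch: offset 150 is still in the backlog.
print(handle_psync("abc", 100, 200, "abc", 150))  # CONTINUE
# Reconnect after a long outage: offset 50 has been overwritten.
print(handle_psync("abc", 100, 200, "abc", 50))   # FULLRESYNC abc 200
```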

[Figure: overview of the psync process]

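E. Problems with psync

If the slave restarts during psync synchronization, the master's replication ID and the resume offset held in the slave's memory are lost. Resuming is impossible, so a full replication is required, which wastes resources.
If the master goes down during psync synchronization, the slave switches to a new master and has to do a full replication from it, which also wastes resources.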

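(3) Improvements to psync

Redis 4.0 improved psync with a "same-source incremental synchronization" strategy.

A. Solving the slave restart problem

To address the loss of the master's replication ID when a slave restarts, the improved psync writes the master's replication ID into the slave's persistence file.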

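After a restart, the slave reads the master's replication ID, together with the replication offset saved alongside it, from its local persistence file and submits a psync request with them. If the master still holds the data after that offset in its replication backlog buffer, it replies CONTINUE and resumes from the breakpoint; otherwise it replies FULLRESYNC <master_replid> <repl_offset> and a full replication follows.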

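B. Solving the master switch problem

After a master switch, the slave needs a full replication from the new master because the new master does not recognize the "original master's replication ID" in the slave's psync request. Suppose instead that when the slave sends psync <original master_replid> <repl_offset>, the new master can tell that the slave was replicating from the original master, and that its own data was also replicated from that master. The new master then knows that it and the slave share the same source and that it should accept the slave's request to resume.

The new master does in fact hold the original master's replication ID. With the improved psync, every slave keeps its current master's replication ID locally, so a slave promoted to master still has the previous master's ID. That is exactly what makes the master switch problem solvable. The ID appears as master_replid2 in the master's info replication; if no switch has happened yet, its value is 40 zeros.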
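
The role of master_replid2 can be sketched as follows. This is a simplified model of the idea, not Redis's actual bookkeeping: the class ReplState, its fields, and the exact offset comparison are assumptions for illustration, and backlog limits are ignored.

```python
ZERO_ID = "0" * 40  # value of master_replid2 when no switch has happened


class ReplState:
    def __init__(self, replid):
        self.replid = replid      # ID of the history currently being followed
        self.replid2 = ZERO_ID    # ID of the previous history, if any
        self.second_offset = -1   # how far the previous history is shared
        self.offset = 0           # bytes replicated so far

    def promote(self, new_replid):
        """On becoming master, keep the old ID valid up to the current offset."""
        self.replid2 = self.replid
        self.second_offset = self.offset
        self.replid = new_replid

    def accepts_partial(self, req_replid, req_offset):
        """Would a psync with this ID and offset be eligible to continue?"""
        if req_replid == self.replid:
            return True
        return req_replid == self.replid2 and req_offset <= self.second_offset


old_id, new_id = "a" * 40, "b" * 40
node = ReplState(old_id)          # a slave following the original master
unswitched = node.replid2 == ZERO_ID
node.offset = 500
node.promote(new_id)              # the original master fails; this node takes over

print(node.accepts_partial(old_id, 480))    # True: same source, shared history
print(node.accepts_partial(old_id, 600))    # False: beyond the shared history
print(node.accepts_partial("c" * 40, 480))  # False: unrelated history
```

A sibling slave that was at offset 480 under the original master can therefore continue from the promoted node without a full replication.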

(4) Diskless operation

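Redis 6.0 improved the synchronization process again by pairing "diskless full sync" with a "diskless load" strategy, avoiding time-consuming disk I/O. Diskless full sync on the master side was already available as an option in earlier releases; diskless load on the slave side is the part that 6.0 added.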

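Diskless full sync: the child process forked by the master's main process sends the in-memory data directly to the slave without going through disk.
Diskless load: after receiving data from the master, the slave does not write it to a disk file but loads it directly into memory, so it can finish restoring data quickly.

(5) Shared replication backlog buffer

Redis 7.0 improved the replication backlog buffer so that the slaves' send buffers share it. Besides protecting against data loss, the replication backlog buffer now also serves as the send buffer for all slaves, which makes fuller use of it.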

Study references and acknowledgment

【Redis video course: from beginner to advanced】

【https://www.bilibili.com/video/BV1U24y1y7jF?p=11&vd_source=0e347fbc6c2b049143afaa5a15abfc1c】
