[MongoDB] Replica Set Election Strategy, Part 3

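Posted by 楊奇龍 on 2011-11-06
This post continues the earlier articles on the replica set election mechanism.
Build a two-member replica set with one primary and one secondary. If the secondary goes down, the primary steps down to SECONDARY, and the set is left with no primary at all. Why does this happen?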
[mongodb@rac4 bin]$ mongo 127.0.0.1:27018 init1node.js 
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27018/test
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27019
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27019/test
RECOVERING> 
SECONDARY> 
SECONDARY> use admin
switched to db admin
SECONDARY> db.shutdownServer() 
Sun Nov  6 20:16:11 DBClientCursor::init call() failed
Sun Nov  6 20:16:11 query failed : admin.$cmd { shutdown: 1.0 } to: 127.0.0.1:27019
server should be down...
Sun Nov  6 20:16:11 trying reconnect to 127.0.0.1:27019
Sun Nov  6 20:16:11 reconnect 127.0.0.1:27019 failed couldn't connect to server 127.0.0.1:27019
Sun Nov  6 20:16:11 Error: error doing query: unknown shell/collection.js:150
After the secondary goes down, the former primary changes from PRIMARY to SECONDARY:
[mongodb@rac4 bin]$ mongo 127.0.0.1:27018 
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27018/test
PRIMARY> 
PRIMARY> 
PRIMARY> 
SECONDARY> 
The primary's log shows how it reacts after the secondary goes down:
Sun Nov  6 20:16:13 [rsHealthPoll] replSet info 10.250.7.220:27019 is down (or slow to respond): DBClientBase::findN: transport error: 10.250.7.220:27019 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "10.250.7.220:27018" }
Sun Nov  6 20:16:13 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state DOWN
Sun Nov  6 20:16:13 [conn7] end connection 10.250.7.220:13217
Sun Nov  6 20:16:37 [rsMgr] can't see a majority of the set, relinquishing primary
Sun Nov  6 20:16:37 [rsMgr] replSet relinquishing primary state
Sun Nov  6 20:16:37 [rsMgr] replSet SECONDARY
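This behavior follows from MongoDB's primary election policy. Suppose a lone node were allowed to elect itself, and the failure were a network partition rather than a crashed secondary: each of the two nodes could reach only itself, so both would elect themselves primary. Once the network recovered, the set would face a difficult consistency problem, and the longer the partition lasted, the harder that problem would be. MongoDB therefore chooses not to let a node elect itself primary when it is the only member it can see; as the log says, a primary that cannot see a majority of the set relinquishes the primary role.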
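The majority rule described above can be sketched in a few lines of plain JavaScript. This is only an illustration, not MongoDB's code: the function names `majority` and `canBePrimary` are made up for this example, and it assumes every member holds exactly one vote.

```javascript
// Illustrative sketch only; assumes each member has exactly one vote.

// Smallest number of votes that counts as a strict majority of the set.
function majority(votingMembers) {
    return Math.floor(votingMembers / 2) + 1;
}

// A node may hold (or win) the primary role only while it can reach a
// majority of the voting members, counting itself.
function canBePrimary(votingMembers, reachableIncludingSelf) {
    return reachableIncludingSelf >= majority(votingMembers);
}

// Two-member set with the secondary down: the survivor sees 1 of 2 votes.
var twoNodeSurvivor = canBePrimary(2, 1);   // false, so it steps down

// Two data members plus an arbiter: the survivor sees 2 of 3 votes.
var withArbiter = canBePrimary(3, 2);       // true, so it stays primary
```

In a two-way partition of a two-member set, each side sees one vote out of two, so neither side qualifies and no split brain can occur.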
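The correct approach is therefore to run more than two nodes, or to add an arbiter. Adding an arbiter is the best and most convenient option: an arbiter only takes part in elections and carries almost no load, so it can be started on any idle machine. This not only avoids the no-primary situation described above, it also lets elections finish faster. With three data nodes, when one goes down, the other two may well each vote for themselves as primary, so the election can take a long time to settle. In practice, which member becomes primary is decided by two conditions: priority and how up to date its data is.
From the official documentation: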
Example: if B and C are candidates in an election, B having a higher priority but C being the most up to date:
1 C will be elected primary
2 Once B catches up a re-election should be triggered and B (the higher priority node) should win the election between B and C
3 Alternatively, suppose that, once B is within 12 seconds of synced to C, C goes down.
B will be elected primary.
When C comes back up, those 12 seconds of unsynced writes will be written to a file in the rollback directory of your data directory (rollback is created when needed).
You can manually apply the rolled-back data, see Replica Sets - Rollbacks.
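The interplay of priority and freshness in the quoted example can be illustrated with a toy model. This is a deliberate simplification and not MongoDB's actual election code, which takes more factors into account; the function `pickPrimary`, the `optime` field expressed as plain seconds, and the sample values are all hypothetical.

```javascript
// Toy model only: NOT MongoDB's real election algorithm.
// members: [{ name, priority, optime }], where optime is a hypothetical
// "seconds of applied writes" counter used to compare freshness.
function pickPrimary(members) {
    // Members with priority 0 never become primary.
    var candidates = members.filter(function (m) { return m.priority > 0; });
    if (candidates.length === 0) { return null; }

    // Freshness comes first: keep only the most up-to-date candidates.
    var freshest = Math.max.apply(null, candidates.map(function (m) {
        return m.optime;
    }));
    var upToDate = candidates.filter(function (m) {
        return m.optime === freshest;
    });

    // Among equally fresh candidates, the highest priority wins.
    upToDate.sort(function (a, b) { return b.priority - a.priority; });
    return upToDate[0].name;
}

// Step 1: B has the higher priority but lags behind C, so C is chosen.
var step1 = pickPrimary([
    { name: "B", priority: 2, optime: 100 },
    { name: "C", priority: 1, optime: 112 }
]);

// Step 2: once B has caught up, a re-election favors B.
var step2 = pickPrimary([
    { name: "B", priority: 2, optime: 112 },
    { name: "C", priority: 1, optime: 112 }
]);

// Step 3: C goes down while B is still 12 seconds behind; B is the only
// candidate left and is chosen, which is where the rollback comes from.
var step3 = pickPrimary([
    { name: "B", priority: 2, optime: 100 }
]);
```

The model ignores the majority requirement and treats freshness as an exact match, so it should be read as a way to follow the three documented steps rather than as a prediction of server behavior.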
Rebuild the replica set, this time with an arbiter:
[mongodb@rac4 bin]$ cat init2node.js 
rs.initiate({
    _id : "myset",
    members : [
        {_id : 0, host : "10.250.7.220:28018"},
        {_id : 1, host : "10.250.7.220:28019"},
        {_id : 2, host : "10.250.7.220:28020", arbiterOnly: true}
    ]
})
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:28018 init2node.js 
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:28018 
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:28018/test
PRIMARY> rs.status()
{
        "set" : "myset",
        "date" : ISODate("2011-11-06T14:16:13Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "10.250.7.220:28018",
                        "health" : 1,
                        "state" : 1,
...
                },
                {
                        "_id" : 1,
                        "name" : "10.250.7.220:28019",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
....
                },
                {
                        "_id" : 2,
                        "name" : "10.250.7.220:28020",
                        "health" : 1,
                        "state" : 7,
                        "stateStr" : "ARBITER",
....
                }
        ],
        "ok" : 1
}
PRIMARY> 
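The `rs.status()` output above can be condensed into a short summary of who holds which role. The helper below is a sketch: `summarizeStatus` and `STATE_NAMES` are names invented here, and only the three state codes visible in the output above (1, 2 and 7) are mapped. In the mongo shell it could be called as `summarizeStatus(rs.status())`.

```javascript
// State codes exactly as they appear in the rs.status() output above.
var STATE_NAMES = { 1: "PRIMARY", 2: "SECONDARY", 7: "ARBITER" };

// Reduce a status document to: the current primary (or null), the names of
// members whose health is not 1, and a name -> role map.
function summarizeStatus(status) {
    var summary = { primary: null, unhealthy: [], roles: {} };
    status.members.forEach(function (m) {
        // Fall back to the raw code for states not listed in STATE_NAMES.
        summary.roles[m.name] = STATE_NAMES[m.state] || ("state " + m.state);
        if (m.state === 1) { summary.primary = m.name; }
        if (m.health !== 1) { summary.unhealthy.push(m.name); }
    });
    return summary;
}

// Sample input built from the fields shown in the output above.
var sampleStatus = {
    set: "myset",
    myState: 1,
    members: [
        { _id: 0, name: "10.250.7.220:28018", health: 1, state: 1 },
        { _id: 1, name: "10.250.7.220:28019", health: 1, state: 2 },
        { _id: 2, name: "10.250.7.220:28020", health: 1, state: 7 }
    ],
    ok: 1
};
var sampleSummary = summarizeStatus(sampleStatus);
```

A `primary` value of `null` in the summary corresponds to the no-primary situation seen earlier with the two-member set.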
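Repeat the test to check whether the primary still steps down to SECONDARY now that the arbiter is present.
As for the multi-node setup from the previous article, for example four data-bearing nodes (primary and secondaries) plus one arbiter: when two nodes go down, the set does not become unavailable in the way that article described for losing 1/2 of the machines. When 3/4 of the machines go down, however, the whole set becomes unavailable!
The "majority of" in the log message does not state a specific number. A majority means more than half of the voting members, which matches the experiments so far: once more than 1/2 of the members are down, the whole set becomes unavailable.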
Sun Nov  6 19:34:16 [rsMgr] can't see a majority of the set, relinquishing primary 
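The experiment with four data-bearing nodes and one arbiter can be checked with simple arithmetic. The sketch below assumes five voting members with one vote each and that the surviving members can all reach one another; `survivesFailures` is a name made up for this example.

```javascript
// Assumption: every member has one vote and all survivors see each other.
function survivesFailures(votingMembers, downCount) {
    var needed = Math.floor(votingMembers / 2) + 1;   // strict majority
    return (votingMembers - downCount) >= needed;
}

// Five voters: four data-bearing nodes plus one arbiter.
var results = [];
for (var down = 0; down <= 4; down++) {
    results.push({ down: down, primaryPossible: survivesFailures(5, down) });
}
// With 0, 1 or 2 members down a primary is still possible;
// with 3 or 4 down the remaining members can no longer form a majority.
```

This is why losing two members leaves the set usable while losing three does not: three of five votes is the smallest majority.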

References:
http://blog.nosqlfan.com/html/2523.html

Source: ITPUB部落格, link: http://blog.itpub.net/22664653/viewspace-710335/. Please credit the source when reposting; otherwise legal liability will be pursued.
