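Failing over when nodes of a MongoDB cluster go down
Reference: the official MongoDB documentation
All primary nodes have failed, on both the Config Server and the Shard Server replica sets. mongos still accepts connections because the secondaries are up, but operations fail with an error saying that the primary of shard2 cannot be found:
"errmsg" : "Could not find host matching read preference { mode: \"primary\", tags: [ {} ] } for set shard2"
Resolution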
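Method 1: Failover is normally automatic. If the shard is a replica set of three members and the primary loses contact with the other two, the two secondaries elect one of themselves as the new primary once the election timeout expires; the timeout is the electionTimeoutMillis value in rs.conf(), 10 seconds (10000 ms) by default. If the shard is a replica set of only two members, the surviving secondary cannot form a majority by itself, so nothing changes until the failed primary is started again; in this test, after the restart the old primary came back as a secondary and the original secondary became the new primary.
Method 2: If the primary's machine is gone for good, a secondary of the affected shard has to be promoted by hand. Connect to that secondary with the mongo shell, check rs.conf() and rs.status() to confirm that the primary is unreachable, then call rs.reconfig(). The first argument is the document returned by rs.conf() with the failed primary's entry removed from members; the second argument is { "force": true }, which is needed because the command is run on a secondary. Then exit and reconnect to the shard: the member is now PRIMARY, and mongos works again after reconnecting.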
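The difference between the three-member and the two-member case in Method 1 comes down to majority arithmetic: a primary can only be elected when a majority of the voting members can reach each other. The sketch below illustrates that rule; majority and canElectPrimary are hypothetical helper names, not MongoDB functions, and the sketch assumes every member has one vote, as in the configuration used in this article.

```javascript
// Illustrative only: these helpers are not part of MongoDB.
// Assumes every member of the replica set carries exactly one vote.

// Smallest number of votes that counts as a majority.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// True when the members that can still see each other can elect a primary.
function canElectPrimary(votingMembers, reachableVoters) {
  return reachableVoters >= majority(votingMembers);
}

// Three-member set, primary lost: two secondaries remain and can elect.
console.log(canElectPrimary(3, 2)); // true

// Two-member set, primary lost: the lone secondary cannot elect itself.
console.log(canElectPrimary(2, 1)); // false
```

This is why the two-member shard in the example stays in SECONDARY state until either the failed node returns or the configuration is forced down to one member.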
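To see which member is the primary and which are secondaries, connect to a mongod instance with the mongo shell and run rs.status() or db.isMaster().
On a mongos instance, rs.status() fails with the error: replSetGetStatus is not supported through mongos
On a mongos instance, db.isMaster() works; by default mongos routes operations to the primary.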
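Reading a full rs.status() document by eye is error-prone, so a small helper can pull out the two facts that matter here: which member is primary and which members are unreachable. This is a minimal sketch; summarizeStatus is a hypothetical name, and the sample object is a hand-trimmed status document that keeps only the name, health and stateStr fields of each member.

```javascript
// Illustrative helper, not a MongoDB API: summarize an rs.status() result.
function summarizeStatus(status) {
  var primary = null;
  var unreachable = [];
  status.members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") {
      primary = m.name;
    }
    if (m.health === 0) {
      unreachable.push(m.name);
    }
  });
  return { primary: primary, unreachable: unreachable };
}

// Hand-trimmed sample of an rs.status() result for a two-member set
// whose primary has gone away.
var sampleStatus = {
  set: "shard28003",
  members: [
    { _id: 0, name: "172.22.138.157:28003", health: 1, stateStr: "SECONDARY" },
    { _id: 1, name: "172.22.138.158:28003", health: 0, stateStr: "(not reachable/healthy)" }
  ]
};

console.log(JSON.stringify(summarizeStatus(sampleStatus)));
// {"primary":null,"unreachable":["172.22.138.158:28003"]}
```

In the mongo shell, connected to a mongod member, the same function could be called as summarizeStatus(rs.status()).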
Worked example
1. Switching to TDB6 and running show tables fails: the error says that the primary of the shard holding this database cannot be found.
mongo --host 172.22.138.157 --port 27001
mongos> use TDB6
mongos> show tables
2019-06-20T00:16:17.935-0700 E QUERY [thread1] Error: listCollections failed: {
"ok" : 0,
"errmsg" : "Could not find host matching read preference { mode: \"primary\", tags: [ {} ] } for set shard28003",
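2. Only the secondary of this shard can be reached. Connect to it, check rs.conf() and rs.status(), and confirm that the primary is unreachable. Then reconfigure with rs.reconfig(): pass the rs.conf() document with the failed primary's entry removed from members, and add { "force": true } as the second argument.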
mongo --host 172.22.138.157 --port 28003
shard28003:SECONDARY> rs.conf()
{
"_id" : "shard28003",
"version" : 1,
"protocolVersion" : NumberLong(1),
"members" : [
{
"_id" : 0,
"host" : "172.22.138.157:28003",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 1,
"host" : "172.22.138.158:28003",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : 2000,
"heartbeatTimeoutSecs" : 10,
"electionTimeoutMillis" : 10000,
"catchUpTimeoutMillis" : -1,
"catchUpTakeoverDelayMillis" : 30000,
"getLastErrorModes" : {
},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
},
"replicaSetId" : ObjectId("5d09d2c89fb43c4506d995ac")
}
}
shard28003:SECONDARY> rs.status()
{
"set" : "shard28003",
"date" : ISODate("2019-06-20T07:25:07.438Z"),
"myState" : 2,
"term" : NumberLong(2),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1561001587, 1),
"t" : NumberLong(2)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1561001587, 1),
"t" : NumberLong(2)
},
"appliedOpTime" : {
"ts" : Timestamp(1561001587, 1),
"t" : NumberLong(2)
},
"durableOpTime" : {
"ts" : Timestamp(1561001587, 1),
"t" : NumberLong(2)
}
},
"members" : [
{
"_id" : 0,
"name" : "172.22.138.157:28003",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 15254,
"optime" : {
"ts" : Timestamp(1561001587, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2019-06-20T03:33:07Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "could not find member to sync from",
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "172.22.138.158:28003",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2019-06-20T07:25:07.121Z"),
"lastHeartbeatRecv" : ISODate("2019-06-20T03:33:08.040Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "Connection refused",
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"configVersion" : -1
}
],
"ok" : 1,
"operationTime" : Timestamp(1561001587, 1),
"$gleStats" : {
"lastOpTime" : Timestamp(0, 0),
"electionId" : ObjectId("000000000000000000000000")
},
"$configServerState" : {
"opTime" : {
"ts" : Timestamp(1561001589, 1),
"t" : NumberLong(2)
}
},
"$clusterTime" : {
"clusterTime" : Timestamp(1561015481, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
shard28003:SECONDARY> rs.reconfig({
"_id" : "shard28003",
"version" : 1,
"protocolVersion" : NumberLong(1),
"members" : [
{
"_id" : 0,
"host" : "172.22.138.157:28003",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : 2000,
"heartbeatTimeoutSecs" : 10,
"electionTimeoutMillis" : 10000,
"catchUpTimeoutMillis" : -1,
"catchUpTakeoverDelayMillis" : 30000,
"getLastErrorModes" : {
},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
},
"replicaSetId" : ObjectId("5d09d2c89fb43c4506d995ac")
}
},
{ "force": true }
)
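The hand-edited document above can also be derived from rs.conf() instead of being retyped, which avoids copy errors in the settings block. The following is a sketch under stated assumptions: buildForcedConfig and its unreachableHosts parameter are hypothetical names, not MongoDB APIs, and the sample object is a trimmed copy of the rs.conf() output shown earlier.

```javascript
// Illustrative helper, not a MongoDB API: return a copy of a replica set
// config with the listed hosts removed from members. Everything else,
// including settings and the remaining members' _id values, is kept as-is.
function buildForcedConfig(conf, unreachableHosts) {
  var members = conf.members.filter(function (m) {
    return unreachableHosts.indexOf(m.host) === -1;
  });
  if (members.length === 0) {
    throw new Error("refusing to build a config with no members");
  }
  return Object.assign({}, conf, { members: members });
}

// Trimmed copy of the rs.conf() output from this example.
var conf = {
  _id: "shard28003",
  version: 1,
  members: [
    { _id: 0, host: "172.22.138.157:28003", priority: 1, votes: 1 },
    { _id: 1, host: "172.22.138.158:28003", priority: 1, votes: 1 }
  ]
};

var forced = buildForcedConfig(conf, ["172.22.138.158:28003"]);
console.log(forced.members.length);  // 1
console.log(forced.members[0].host); // 172.22.138.157:28003

// In the mongo shell on the surviving member, the equivalent call would be:
//   rs.reconfig(buildForcedConfig(rs.conf(), ["172.22.138.158:28003"]), { "force": true })
```

The helper copies the config rather than editing it in place, so the original rs.conf() result stays available for comparison if the reconfiguration needs to be reviewed afterwards.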
3. Exit and log in again: the member has switched to PRIMARY state.
mongo --host 172.22.138.157 --port 28003
shard28003:PRIMARY>
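4. After reconnecting to mongos, show tables works, but db.createCollection fails because the primary of the Config Server replica set is also down. Repeat steps 1 to 3 on the Config Server replica set to promote its secondary to primary, and the error goes away.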
mongo --host 172.22.138.157 --port 27001
mongos> show tables
test06
mongos> use testdb2
switched to db testdb2
mongos> db.createCollection("table1")
{
"ok" : 0,
"errmsg" : "Database testdb2 not found due to Could not confirm non-existence of database testdb2 due to Could not find host matching read preference { mode: \"primary\" } for set config29001",
"code" : 133,
"codeName" : "FailedToSatisfyReadPreference",
"operationTime" : Timestamp(1561034171, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1561034171, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
5. If other shards are in the same state, repeat steps 1 to 3 for each of them.
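When several replica sets are affected, the errmsg returned through mongos names the set that has no primary (shard28003 and config29001 in the errors above), which tells you where to apply steps 1 to 3 next. A minimal sketch for pulling that name out of the message follows; failedSetName is a hypothetical helper, and it assumes the message ends with "for set <name>" as in the errors shown in this article.

```javascript
// Illustrative helper, not a MongoDB API: extract the replica set name
// from a "Could not find host matching read preference" error message.
// Assumes the message contains "for set <name>" as in this article.
function failedSetName(errmsg) {
  var match = /for set ([^\s,]+)/.exec(errmsg);
  return match ? match[1] : null;
}

var shardError =
  'Could not find host matching read preference { mode: "primary", tags: [ {} ] } for set shard28003';
var configError =
  'Could not find host matching read preference { mode: "primary" } for set config29001';

console.log(failedSetName(shardError));  // shard28003
console.log(failedSetName(configError)); // config29001
```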
Source: ITPUB blog, http://blog.itpub.net/30126024/viewspace-2648280/. Please credit the source when reposting; otherwise legal action may be taken.