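MongoDB Sharding (2) -- Building a Sharded Cluster

Posted by gegeman on 2021-01-16

The previous article covered the basic concepts of sharding. This article puts them into practice by building a sharded cluster.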

 

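First, a quick review of the architecture. A sharded cluster consists of three components:

  • mongos: the query router, which provides the interface between client applications and the shards. This exercise deploys 2 mongos instances.
  • config: the config servers store the cluster's metadata, which reflects the state and organization of all data and components in the sharded cluster, including the list of chunks on each shard and the ranges that define those chunks. As of version 3.4, mirrored mongod instances are no longer supported as config servers (SCCC); the config servers must be deployed as a replica set (CSRS). This exercise uses a 3-member replica set as the config servers.
  • shard: each shard holds a subset of the collection data. Starting with version 3.6, every shard must be deployed as a replica set. This exercise deploys 3 shards to store data.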

 

(1) Host information

 

(2) Building the config server replica set

The three config server instances are planned as follows:

member0 192.168.10.80:27017
member1 192.168.10.80:27018
member2 192.168.10.80:27019

Their parameters are planned as follows:

 

Next, we build the config server replica set step by step.

STEP1: Extract the MongoDB installation package into the /mongo directory

[root@mongosserver mongo]# pwd
/mongo
[root@mongosserver mongo]# ls
bin LICENSE-Community.txt MPL-2 README THIRD-PARTY-NOTICES THIRD-PARTY-NOTICES.gotools

STEP2: Create the data and log directories according to the parameter plan above

# Create the directories
mkdir -p /replset/repset1/data
mkdir -p /replset/repset1/log
mkdir -p /replset/repset2/data
mkdir -p /replset/repset2/log
mkdir -p /replset/repset3/data
mkdir -p /replset/repset3/log

[root@mongosserver repset1]# tree /replset/
/replset/
├── repset1
│   ├── data
│   ├── log
│   └── mongodb.conf
├── repset2
│   ├── data
│   └── log
└── repset3
    ├── data
    └── log

STEP3: Create a configuration file for each of the 3 instances

Configuration file for instance 1, /replset/repset1/mongodb.conf:

systemLog:
   destination: file
   logAppend: true
   path: /replset/repset1/log/mongodb.log

storage:
   dbPath: /replset/repset1/data
   journal:
     enabled: true

processManagement:
   fork: true  # fork and run in background
   pidFilePath: /replset/repset1/mongod.pid  # location of pidfile
   timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
   port: 27017
   bindIp: 0.0.0.0  

# shard
sharding:
  clusterRole: configsvr
  
# replica set
replication:
  replSetName: conf

Configuration file for instance 2, /replset/repset2/mongodb.conf:

systemLog:
   destination: file
   logAppend: true
   path: /replset/repset2/log/mongodb.log

storage:
   dbPath: /replset/repset2/data
   journal:
     enabled: true

processManagement:
   fork: true  # fork and run in background
   pidFilePath: /replset/repset2/mongod.pid  # location of pidfile
   timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
   port: 27018
   bindIp: 0.0.0.0  

# shard
sharding:
  clusterRole: configsvr
  
# replica set
replication:
  replSetName: conf
 

Configuration file for instance 3, /replset/repset3/mongodb.conf:

systemLog:
   destination: file
   logAppend: true
   path: /replset/repset3/log/mongodb.log

storage:
   dbPath: /replset/repset3/data
   journal:
     enabled: true

processManagement:
   fork: true  # fork and run in background
   pidFilePath: /replset/repset3/mongod.pid  # location of pidfile
   timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
   port: 27019
   bindIp: 0.0.0.0  

# shard
sharding:
  clusterRole: configsvr
  
# replica set
replication:
  replSetName: conf

STEP4: Start the three mongod instances

mongod -f /replset/repset1/mongodb.conf
mongod -f /replset/repset2/mongodb.conf
mongod -f /replset/repset3/mongodb.conf


# Check that the instances started successfully
[root@mongosserver mongo]# netstat -nltp |grep mongod
tcp 0 0 0.0.0.0:27019 0.0.0.0:* LISTEN 28009/mongod 
tcp 0 0 0.0.0.0:27017 0.0.0.0:* LISTEN 27928/mongod 
tcp 0 0 0.0.0.0:27018 0.0.0.0:* LISTEN 27970/mongod

STEP5: Connect to any one of the instances and initiate the config server replica set

rs.initiate(
  {
    _id: "conf",
    configsvr: true,
    members: [
      { _id : 0, host : "192.168.10.80:27017" },
      { _id : 1, host : "192.168.10.80:27018" },
      { _id : 2, host : "192.168.10.80:27019" }
    ]
  }
)

STEP6: [Optional] Adjust the member priorities to influence which member becomes primary

cfg = rs.conf()
cfg.members[0].priority = 3
cfg.members[1].priority = 2
cfg.members[2].priority = 1
rs.reconfig(cfg)

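Note on members[n]: n is the position in the members array, which is zero-based. It must not be confused with the _id value of that member (members[n]._id).

View the member priorities: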
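
The distinction between array position and _id can be illustrated with a minimal sketch in plain JavaScript. The helper setPriorityById and the sample document are hypothetical and not part of MongoDB; the sample imitates the shape of what rs.conf() returns, with _id values that deliberately do not match the array positions.

```javascript
// Hypothetical helper: look up a member by its _id, then change the
// priority through the array position that rs.reconfig() actually uses.
function setPriorityById(cfg, memberId, priority) {
  const idx = cfg.members.findIndex((m) => m._id === memberId);
  if (idx === -1) {
    throw new Error("no member with _id " + memberId);
  }
  cfg.members[idx].priority = priority;
  return idx; // array position, which may differ from memberId
}

// Sample document (assumed values): positions 0, 1, 2 hold _id 0, 2, 3.
const sampleCfg = {
  _id: "conf",
  members: [
    { _id: 0, host: "192.168.10.80:27017", priority: 1 },
    { _id: 2, host: "192.168.10.80:27019", priority: 1 },
    { _id: 3, host: "192.168.10.80:27018", priority: 1 }
  ]
};

const position = setPriorityById(sampleCfg, 2, 3);
console.log("member with _id 2 sits at members[" + position + "]");
```

In the mongo shell, the same lookup could be applied to the object returned by rs.conf() before passing it to rs.reconfig().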

conf:PRIMARY> rs.config()

 

(3) Building the shard replica sets

Shard 1 replica set members:
member0 192.168.10.81:27017
member1 192.168.10.81:27018
member2 192.168.10.81:27019

Shard 2 replica set members:
member0 192.168.10.82:27017
member1 192.168.10.82:27018
member2 192.168.10.82:27019

Shard 3 replica set members:
member0 192.168.10.83:27017
member1 192.168.10.83:27018
member2 192.168.10.83:27019

 

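Their parameters are planned as follows:

There are 3 shards in total, each a 3-member replica set. Building them is much like building the config server replica set above, so the steps are not repeated here. The differences are that each shard's configuration file sets sharding.clusterRole to shardsvr instead of configsvr and uses that shard's own replSetName (shard001, shard002, shard003), and that the initiation document omits the configsvr: true setting.

Initiating the three shard replica sets: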
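
To make the difference between the two kinds of initiation document concrete, here is a minimal sketch in plain JavaScript. The function buildReplSetConfig is a hypothetical helper, not a MongoDB API; it only assembles the document that would be handed to rs.initiate().

```javascript
// Hypothetical helper: build a replica set initiation document.
// Member _id values are simply assigned from the array position.
function buildReplSetConfig(name, hosts, isConfigServer) {
  const cfg = {
    _id: name,
    members: hosts.map((host, i) => ({ _id: i, host: host }))
  };
  // Only the config server replica set carries configsvr: true.
  if (isConfigServer) {
    cfg.configsvr = true;
  }
  return cfg;
}

const confDoc = buildReplSetConfig(
  "conf",
  ["192.168.10.80:27017", "192.168.10.80:27018", "192.168.10.80:27019"],
  true
);
const shard001Doc = buildReplSetConfig(
  "shard001",
  ["192.168.10.81:27017", "192.168.10.81:27018", "192.168.10.81:27019"],
  false
);

console.log(JSON.stringify(confDoc, null, 2));
console.log(JSON.stringify(shard001Doc, null, 2));
```

In the mongo shell, connected to a member of the matching replica set, the resulting object could be passed directly to rs.initiate().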

// shard001 (run on one of the shard001 members)
rs.initiate({
  _id: "shard001",
  members: [
    { _id : 0, host : "192.168.10.81:27017" },
    { _id : 1, host : "192.168.10.81:27018" },
    { _id : 2, host : "192.168.10.81:27019" }
  ]
})

// shard002 (run on one of the shard002 members)
rs.initiate({
  _id: "shard002",
  members: [
    { _id : 0, host : "192.168.10.82:27017" },
    { _id : 1, host : "192.168.10.82:27018" },
    { _id : 2, host : "192.168.10.82:27019" }
  ]
})

// shard003 (run on one of the shard003 members)
rs.initiate({
  _id: "shard003",
  members: [
    { _id : 0, host : "192.168.10.83:27017" },
    { _id : 1, host : "192.168.10.83:27018" },
    { _id : 2, host : "192.168.10.83:27019" }
  ]
})

 

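(4) Configuring and starting mongos

This exercise starts 2 mongos processes on server 192.168.10.100, on ports 27000 and 28000.

STEP1: Configure the parameters of the mongos instances

Configuration for port 27000. Note that the paths it refers to must be created first: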

systemLog:
   destination: file
   logAppend: true
   path: /mongo/log/mongos-27000.log
   
processManagement:
   fork: true  # fork and run in background
   pidFilePath: /mongo/mongod-27000.pid  # location of pidfile
   timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
   port: 27000
   bindIp: 0.0.0.0  

sharding:
  configDB: conf/192.168.10.80:27017,192.168.10.80:27018,192.168.10.80:27019

Configuration for port 28000. Note that the paths it refers to must be created first:

systemLog:
   destination: file
   logAppend: true
   path: /mongo/log/mongos-28000.log
   
processManagement:
   fork: true  # fork and run in background
   pidFilePath: /mongo/mongod-28000.pid  # location of pidfile
   timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
   port: 28000
   bindIp: 0.0.0.0  

sharding:
  configDB: conf/192.168.10.80:27017,192.168.10.80:27018,192.168.10.80:27019

STEP2: Start the mongos instances

# Start the mongos instances
[root@mongosserver mongo]# mongos -f /mongo/mongos-27000.conf 
[root@mongosserver mongo]# mongos -f /mongo/mongos-28000.conf 

# Check the running instances
[root@mongosserver mongo]# netstat -nltp|grep mongos
tcp 0 0 0.0.0.0:27000 0.0.0.0:* LISTEN 2209/mongos 
tcp 0 0 0.0.0.0:28000 0.0.0.0:* LISTEN 2241/mongos

 

(5) Adding the shards to the cluster

STEP1: Connect to mongos with the mongo shell

mongo --host 192.168.10.100 --port 27000
# or
mongo --host 192.168.10.100 --port 28000

STEP2: Add the shards to the cluster

sh.addShard( "shard001/192.168.10.81:27017,192.168.10.81:27018,192.168.10.81:27019")
sh.addShard( "shard002/192.168.10.82:27017,192.168.10.82:27018,192.168.10.82:27019")
sh.addShard( "shard003/192.168.10.83:27017,192.168.10.83:27018,192.168.10.83:27019")
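
The strings passed to sh.addShard() above, like the configDB value in the mongos configuration files, follow the pattern of a replica set name, a slash, and a comma-separated list of host:port members. As a minimal sketch, they could be assembled as follows; hostsOn and replSetSeedString are hypothetical helper names, not part of MongoDB.

```javascript
// Hypothetical helper: expand one IP address and a list of ports
// into host:port strings.
function hostsOn(ip, ports) {
  return ports.map((port) => ip + ":" + port);
}

// Hypothetical helper: join a replica set name and its members into
// the "<name>/<host:port>,<host:port>,..." form.
function replSetSeedString(name, hosts) {
  if (hosts.length === 0) {
    throw new Error("at least one host is required");
  }
  return name + "/" + hosts.join(",");
}

const ports = [27017, 27018, 27019];
const seeds = [
  replSetSeedString("shard001", hostsOn("192.168.10.81", ports)),
  replSetSeedString("shard002", hostsOn("192.168.10.82", ports)),
  replSetSeedString("shard003", hostsOn("192.168.10.83", ports))
];

// In the mongo shell, each string could be passed to sh.addShard().
seeds.forEach((seed) => console.log(seed));
```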

STEP3: Check the shard information

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("5ffc0709b040c53d59c15c66")
  }
  shards:
        {  "_id" : "shard001",  "host" : "shard001/192.168.10.81:27017,192.168.10.81:27018,192.168.10.81:27019",  "state" : 1 }
        {  "_id" : "shard002",  "host" : "shard002/192.168.10.82:27017,192.168.10.82:27018,192.168.10.82:27019",  "state" : 1 }
        {  "_id" : "shard003",  "host" : "shard003/192.168.10.83:27017,192.168.10.83:27018,192.168.10.83:27019",  "state" : 1 }
  active mongoses:
        "4.2.10" : 2
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }

mongos> 

 

 

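(6) Enabling sharding

(6.1) Enable sharding on the database
Sharding is applied per collection. Before a collection can be sharded, sharding must first be enabled on its database. Enabling sharding on a database does not redistribute any data; it only marks the collections in that database as eligible for sharding.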

sh.enableSharding("lijiamandb");

 

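(6.2) Enable sharding on a collection

If the collection already contains data, you must manually create an index on the shard key before sharding the collection. If the collection is empty, MongoDB creates the index on the shard key automatically when the collection is sharded.
MongoDB provides 2 strategies for sharding a collection:

  • Hashed sharding: uses a hashed index on a single field as the shard key
sh.shardCollection("<database>.<collection>",{<shard key field> : "hashed"})
  • Ranged sharding: the shard key can consist of multiple fields, and the data is divided into contiguous ranges determined by the shard key values
sh.shardCollection("<database>.<collection>",{<shard key field>:1,...} )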

 

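Example: hashed sharding of the collection user

// Connect to mongos, switch to the lijiamandb database, and insert 100,000 documents into the new collection user
use lijiamandb

for (var i = 1; i <= 100000; i++) {
  db.user.insert({
  "id" : i,
  "name" : "name"+i,
  "age" : Math.floor(Math.random()*120),
  "created" : new Date()
  });
}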


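// mongostat shows that all the data was written to a single shard (shard002), the primary shard of this database. Each database may have a different primary shard; use sh.status() to check.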
[root@mongosserver ~]# mongostat --port 27000 5 --discover
           host insert query update delete getmore command dirty used flushes mapped vsize   res faults qrw arw net_in net_out conn set repl                time
localhost:27000    352    *0     *0     *0       0   704|0                  0     0B  356M 32.0M      0 0|0 0|0   224k    140k   10      RTR Jan 15 10:52:32.046

               host insert query update delete getmore command dirty  used flushes mapped vsize   res faults qrw arw net_in net_out conn      set repl                time
192.168.10.81:27017     *0    *0     *0     *0       0     2|0  0.3%  0.8%       0        1.90G  133M    n/a 0|0 1|0   417b   9.67k   23 shard001  SEC Jan 15 10:52:32.061
192.168.10.81:27018     *0    *0     *0     *0       0     3|0  0.3%  0.8%       1        1.93G  132M    n/a 0|0 1|0  1.39k   11.0k   28 shard001  PRI Jan 15 10:52:32.067
192.168.10.81:27019     *0    *0     *0     *0       0     2|0  0.3%  0.8%       0        1.95G  148M    n/a 0|0 1|0   942b   10.2k   26 shard001  SEC Jan 15 10:52:32.070
192.168.10.82:27017    352    *0     *0     *0     407  1192|0  2.5% 11.7%       1        1.99G  180M    n/a 0|0 1|0  1.52m   1.15m   29 shard002  PRI Jan 15 10:52:32.075
192.168.10.82:27018   *352    *0     *0     *0     409   441|0  4.5%  8.9%       0        1.96G  163M    n/a 0|0 1|0   566k    650k   25 shard002  SEC Jan 15 10:52:32.085
192.168.10.82:27019   *352    *0     *0     *0       0     2|0  4.4%  9.7%       0        1.92G  168M    n/a 0|0 1|0   406b   9.51k   24 shard002  SEC Jan 15 10:52:32.093
192.168.10.83:27017     *0    *0     *0     *0       0     1|0  0.2%  0.6%       1        1.89G  130M    n/a 0|0 1|0   342b   9.17k   22 shard003  SEC Jan 15 10:52:32.099
192.168.10.83:27018     *0    *0     *0     *0       0     2|0  0.2%  0.6%       0        1.95G  139M    n/a 0|0 1|0   877b   9.92k   28 shard003  PRI Jan 15 10:52:32.107
192.168.10.83:27019     *0    *0     *0     *0       0     1|0  0.2%  0.6%       0        1.90G  133M    n/a 0|0 1|0   342b   9.17k   21 shard003  SEC Jan 15 10:52:32.113
    localhost:27000    365    *0     *0     *0       0   731|0                   0     0B  356M 32.0M      0 0|0 0|0   233k    145k   10           RTR Jan 15 10:52:37.047



// Sharding on id with a hashed shard key fails, because the collection already contains data and has no hashed index on id
sh.shardCollection("lijiamandb.user",{"id":"hashed"})
/* 1 */
{
    "ok" : 0.0,
    "errmsg" : "Please create an index that starts with the proposed shard key before sharding the collection",
    "code" : 72,
    "codeName" : "InvalidOptions",
    "operationTime" : Timestamp(1610679762, 4),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1610679762, 4),
        "signature" : {
            "hash" : { "$binary" : "AAAAAAAAAAAAAAAAAAAAAAAAAAA=", "$type" : "00" },
            "keyId" : NumberLong(0)
        }
    }
}

// Create the hashed index manually
db.user.createIndex({"id":"hashed"})

// List the indexes
db.user.getIndexes()
/* 1 */
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "lijiamandb.user"
    },
    {
        "v" : 2,
        "key" : {
            "id" : "hashed"
        },
        "name" : "id_hashed",
        "ns" : "lijiamandb.user"
    }
]

// Finally, shard the collection again
sh.shardCollection("lijiamandb.user",{"id":"hashed"})
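
The error message above asks for an index that starts with the proposed shard key. The following minimal sketch in plain JavaScript shows that prefix rule applied to index listings shaped like the output of db.user.getIndexes(). The helpers indexStartsWithKey and hasSupportingIndex are hypothetical, and the check is simplified: the server applies further conditions that are not modeled here.

```javascript
// Hypothetical helper: true when the index key begins with the same
// fields, in the same order and of the same kind, as the shard key.
function indexStartsWithKey(indexKey, shardKey) {
  const indexEntries = Object.entries(indexKey);
  const shardEntries = Object.entries(shardKey);
  if (shardEntries.length > indexEntries.length) {
    return false;
  }
  return shardEntries.every(
    ([field, kind], i) =>
      indexEntries[i][0] === field && indexEntries[i][1] === kind
  );
}

// Hypothetical helper: does any index in the listing support the key?
function hasSupportingIndex(indexes, shardKey) {
  return indexes.some((ix) => indexStartsWithKey(ix.key, shardKey));
}

// Index listings before and after creating the hashed index.
const before = [{ v: 2, key: { _id: 1 }, name: "_id_" }];
const after = before.concat([
  { v: 2, key: { id: "hashed" }, name: "id_hashed" }
]);

console.log(hasSupportingIndex(before, { id: "hashed" }));
console.log(hasSupportingIndex(after, { id: "hashed" }));
```

The first listing contains only the default _id index, which is why the first shardCollection attempt on the populated collection was rejected.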

 

At this point the sharded cluster environment is up and running. Next, we will look at how to choose a shard key.

 

[End]

 

 

Articles in this series:

1. MongoDB Sharding (1) -- Sharding Concepts
2. MongoDB Sharding (2) -- Building a Sharded Cluster
3. MongoDB Sharding (3) -- Zones
4. MongoDB Sharding (4) -- Maintaining and Managing a Sharded Cluster
