MongoDB sharding (shard)

Posted by 馬昌偉 on 2022-03-05

 

Overview

 

 

1: On 3 separate servers, run instances on ports 27017, 27018 and 27019; the instances replicate one another, forming 3 replica sets (repl sets)
2: On each of the 3 servers, also configure a config server, running on port 27020

3: Configure mongos
./bin/mongos --port 30000 \
 --configdb 192.168.1.201:27020,192.168.1.202:27020,192.168.1.203:27020

4: Connect to the router (mongos)
./bin/mongo --port 30000

5: Add the repl sets as shards
>sh.addShard('192.168.1.201:27017');
>sh.addShard('192.168.1.202:27017');
>sh.addShard('192.168.1.203:27017');


6: Enable sharding on the target database
>sh.enableSharding(databaseName);

7: Shard the target collection
>sh.shardCollection('dbName.collectionName',{field:1});

field is a field of the collection; the system uses its value to compute which shard each document should go to.
This field is called the 'shard key'.


MongoDB does not scatter documents across the shards with perfect per-document evenness. Instead, N documents form a "chunk", which is initially placed on one shard. When the chunk count on that shard differs from another shard's by a large enough margin (>= 3 chunks), chunks are migrated over to the other shard; data balance between shards is maintained at chunk granularity.

Q: Why are there only 2 chunks after inserting 100,000 documents?
A: Because chunks are fairly large (64 MB by default).
The chunksize value can be changed in the config database.
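
A minimal sketch of what that change looks like (the full walkthrough with real output appears later in this article; the 32 MB value here is purely illustrative):

use config
db.settings.save({ _id: "chunksize", value: NumberLong(32) })  // chunk size in MB
db.settings.find()                                             // verify the new value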

Q: Since inserts initially favor one shard and chunks only migrate once things become unbalanced,
chunks will naturally move back and forth between the shard instances as data grows. What problem does that cause?
A: Increased I/O between the servers.

Follow-up question: Can I instead define a rule up front, where every N documents form one chunk, M chunks are pre-allocated,
and those M chunks are pre-distributed across different shards,
so that subsequent data goes straight into its pre-allocated chunk and chunks no longer move around?

A: Yes, by pre-splitting the collection manually!

Take the shop.user collection as an example:
1: sh.shardCollection('shop.user',{userid:1}); // use userid as the shard key for the user collection

2: for(var i=1;i<=40;i++) { sh.splitAt('shop.user',{userid:i*1000}) } // pre-split chunks at the boundaries 1K, 2K ... 40K (the chunks are still empty); these chunks will be spread evenly across the shards.

3: Insert user data through mongos. The data lands in the pre-allocated chunks, and chunks no longer move back and forth.

Sharding

Deploying a sharded MongoDB cluster

var rsconf = {
    _id:'rs2',
    members:
      [
        {_id:0,
        host:'10.0.0.11:27017'
        },
        {_id:1,
        host:'10.0.0.11:27018'
        },
        {_id:2,
        host:'10.0.0.11:27020'
        }
      ]    
}



mongod --dbpath /mongodb/m17 --logpath /mongodb/mlog/m17.log --fork --port 27017 --smallfiles
mongod --dbpath /mongodb/m18 --logpath /mongodb/mlog/m18.log --fork --port 27018 --smallfiles
mongod --dbpath /mongodb/m20 --logpath /mongodb/mlog/m20.log --fork --port 27020 --configsvr
A config database service is required to store the cluster metadata; start it with the --configsvr parameter.

mongos --logpath /mongodb/mlog/m30.log --port 30000  --configdb 10.0.0.11:27020  --fork
mongos must be pointed at the config database with --configdb.

[mongod@mcw01 ~]$ mongod --dbpath /mongodb/m17 --logpath /mongodb/mlog/m17.log --fork --port 27017 --smallfiles
about to fork child process, waiting until server is ready for connections.
forked process: 18608
child process started successfully, parent exiting
[mongod@mcw01 ~]$ mongod --dbpath /mongodb/m18 --logpath /mongodb/mlog/m18.log --fork --port 27018 --smallfiles
about to fork child process, waiting until server is ready for connections.
forked process: 18627
child process started successfully, parent exiting
[mongod@mcw01 ~]$ mongod --dbpath /mongodb/m20 --logpath /mongodb/mlog/m20.log --fork --port 27020 --configsvr
about to fork child process, waiting until server is ready for connections.
forked process: 18646
child process started successfully, parent exiting
[mongod@mcw01 ~]$ mongos --logpath /mongodb/mlog/m30.log --port 30000  --configdb 10.0.0.11:27020  --fork
2022-03-05T00:26:41.452+0800 W SHARDING [main] Running a sharded cluster with fewer than 3 config servers should only be done for testing purposes and is not recommended for production.
about to fork child process, waiting until server is ready for connections.
forked process: 18667
child process started successfully, parent exiting
[mongod@mcw01 ~]$ ps -ef|grep -v grep |grep mongo
root      16595  16566  0 Mar04 pts/0    00:00:00 su - mongod
mongod    16596  16595  0 Mar04 pts/0    00:00:03 -bash
root      17669  17593  0 Mar04 pts/1    00:00:00 su - mongod
mongod    17670  17669  0 Mar04 pts/1    00:00:00 -bash
root      17735  17715  0 Mar04 pts/2    00:00:00 su - mongod
mongod    17736  17735  0 Mar04 pts/2    00:00:00 -bash
mongod    18608      1  0 00:26 ?        00:00:03 mongod --dbpath /mongodb/m17 --logpath /mongodb/mlog/m17.log --fork --port 27017 --smallfiles
mongod    18627      1  0 00:26 ?        00:00:03 mongod --dbpath /mongodb/m18 --logpath /mongodb/mlog/m18.log --fork --port 27018 --smallfiles
mongod    18646      1  0 00:26 ?        00:00:04 mongod --dbpath /mongodb/m20 --logpath /mongodb/mlog/m20.log --fork --port 27020 --configsvr
mongod    18667      1  0 00:26 ?        00:00:01 mongos --logpath /mongodb/mlog/m30.log --port 30000 --configdb 10.0.0.11:27020 --fork
mongod    18698  16596  0 00:36 pts/0    00:00:00 ps -ef
[mongod@mcw01 ~]$ 


At this point the configsvr and mongos are wired together, but they have no relationship yet with the two mongod shard instances.
Next, connect to mongos and add the two shards (shard nodes).

[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> show dbs;
config  0.000GB
mongos> use config;
switched to db config
mongos> show tables; # list the collections in the config database
chunks
lockpings
locks
mongos
settings
shards
tags
version
mongos> 
mongos> 
bye
[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> sh.addShard('10.0.0.11:27017');  # add two shards
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard('10.0.0.11:27018');
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.status()  # check the sharding status
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }  #可以看到有兩個shard,
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" } #這兩個片已經加到configsvr裡了
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:

mongos> 
mongos> use test
switched to db test
mongos> db.stu.insert({name:'poly'});  # insert four documents through mongos; they can then be queried back
WriteResult({ "nInserted" : 1 })
mongos> db.stu.insert({name:'lily'});
WriteResult({ "nInserted" : 1 })
mongos> db.stu.insert({name:'hmm'});
WriteResult({ "nInserted" : 1 })
mongos> db.stu.insert({name:'lucy'});
WriteResult({ "nInserted" : 1 })
mongos> db.stu.find();
{ "_id" : ObjectId("6222427bc425e356ae71d452"), "name" : "poly" }
{ "_id" : ObjectId("62224282c425e356ae71d453"), "name" : "lily" }
{ "_id" : ObjectId("62224287c425e356ae71d454"), "name" : "hmm" }
{ "_id" : ObjectId("6222428dc425e356ae71d455"), "name" : "lucy" }
mongos> 



At this point the data is visible on 27017:
[mongod@mcw01 ~]$ mongo --port 27017
.......
> show dbs;
local  0.000GB
test   0.000GB
> use test;
switched to db test
> db.stu.find();
{ "_id" : ObjectId("6222427bc425e356ae71d452"), "name" : "poly" }
{ "_id" : ObjectId("62224282c425e356ae71d453"), "name" : "lily" }
{ "_id" : ObjectId("62224287c425e356ae71d454"), "name" : "hmm" }
{ "_id" : ObjectId("6222428dc425e356ae71d455"), "name" : "lucy" }
> 


But on 27018 no data can be found:
[mongod@mcw01 ~]$ mongo --port 27018
.......
> show dbs;
local  0.000GB
> 

No sharding rule has been set for the data yet. Let's go into mongos and look at the sharding status:
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }

mongos>  # as shown above, the test database has partitioned: false; it is not sharded, so by default it simply lives on its primary shard, shard0000

Enabling sharding on a database

Below, shop is a database that does not exist yet. Enable sharding for it: partitioned becomes true and its primary shard is shard0001, but this alone is not enough:
mongos> show dbs;
config  0.000GB
test    0.000GB
mongos> sh.enable
sh.enableBalancing(  sh.enableSharding(
mongos> sh.enableSharding('shop');
{ "ok" : 1 }
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }

mongos> 

Specify which collection in the database to shard, and which field to shard on

mongos> sh.shardCollection('shop.goods',{goods_id:1});
{ "collectionsharded" : "shop.goods", "ok" : 1 }
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods   # the goods collection in the shop database is sharded
            shard key: { "goods_id" : 1 } # this field is the shard key
            unique: false
            balancing: true
            chunks:  
                shard0001    1   # the single chunk starts out on shard0001
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 0) 

mongos> 

Insert a batch of documents as shown below; almost all of them end up on shard0001, because the default (large) chunk size is still in effect.

for(var i=1;i<=10000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf'}) };

[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> use shop;
switched to db shop
mongos> for(var i=1;i<=10000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf'}) };
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        1 : Success
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    1
                shard0001    2
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : 2 } on : shard0000 Timestamp(2, 0) 
            { "goods_id" : 2 } -->> { "goods_id" : 12 } on : shard0001 Timestamp(2, 1) 
            { "goods_id" : 12 } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 3) 

mongos> 
mongos> db.goods.find().count();
10000
mongos> 


On 27017 only one document is visible:
[mongod@mcw01 ~]$ mongo --port 27017
> use shop
switched to db shop
> show tables;
goods
> db.goods.find().count();
1
> db.goods.find();
{ "_id" : ObjectId("622252d8b541d8768347746e"), "goods_id" : 1, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
> 


On 27018 (the second shard) there are many documents, so the distribution is very uneven. This matches the chunk ranges shown above: everything with goods_id of 2 or higher is on shard0001. (A quicker per-shard check through mongos is sketched after this transcript.)
[mongod@mcw01 ~]$ mongo --port 27018
> use shop;
switched to db shop
> db.goods.find().count();
9999
> db.goods.find().skip(9996);
{ "_id" : ObjectId("622252e3b541d87683479b7b"), "goods_id" : 9998, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
{ "_id" : ObjectId("622252e3b541d87683479b7c"), "goods_id" : 9999, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
{ "_id" : ObjectId("622252e3b541d87683479b7d"), "goods_id" : 10000, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
> 
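
Instead of logging into each shard, mongos itself can report the per-shard distribution. A small sketch of such a check (run through mongos against the shop database):

use shop
db.goods.getShardDistribution()   // prints data size, document count and chunk count per shard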

Changing the chunk size setting

mongos> use config; # on mongos, switch to the config database
switched to db config
mongos> show tables;
changelog
chunks
collections
databases
lockpings
locks
mongos
settings
shards
tags
version
mongos> db.settings.find();  # the chunk size lives in the settings collection; the default is 64 MB
{ "_id" : "chunksize", "value" : NumberLong(64) }
mongos> db.settings.find();
{ "_id" : "chunksize", "value" : NumberLong(64) }
mongos>  db.settings.save({_id:'chunksize'},{$set:{value: 1}});  # wrong: save() takes a single document, not (query, update) arguments, so this replaces the document and drops the value field
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> db.settings.find();
{ "_id" : "chunksize" }
mongos> db.settings.save({ "_id" : "chunksize", "value" : NumberLong(64) });
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> db.settings.save({ "_id" : "chunksize", "value" : NumberLong(1) });  #修改chunk大小的配置為1M
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> db.settings.find();  # confirm the change took effect
{ "_id" : "chunksize", "value" : NumberLong(1) }
mongos> 
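
For reference, an upsert-style update reaches the same result; a sketch (not part of the original transcript):

use config
db.settings.update(
    { _id: "chunksize" },
    { $set: { value: NumberLong(1) } },   // chunk size in MB
    { upsert: true }                      // create the document if it does not exist yet
)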

Next, insert 150,000 documents and see how they are split up under the sharding rule.

for(var i=1;i<=150000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf fsff'}) };



The goods collection from before has been dropped, and with it its sharding rule, so the sharding rule has to be recreated:
mongos> use shop;
switched to db shop
mongos> db.goods.drop();
false
mongos> for(var i=1;i<=150000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf fsff'}) };
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        1 : Success
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }

mongos> sh.shardCollection('shop.goods',{goods_id:1});
{
    "proposedKey" : {
        "goods_id" : 1
    },
    "curIndexes" : [
        {
            "v" : 1,
            "key" : {
                "_id" : 1
            },
            "name" : "_id_",
            "ns" : "shop.goods"
        }
    ],
    "ok" : 0,
    "errmsg" : "please create an index that starts with the shard key before sharding."
}
mongos> 
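
The error above appears because the collection already contains data but has no index on the shard key. Instead of dropping the collection (as is done below), another option would in principle be to build that index first and then shard the collection; a sketch:

use shop
db.goods.createIndex({ goods_id: 1 })               // build an index that starts with the shard key
sh.shardCollection('shop.goods', { goods_id: 1 })   // sharding the non-empty collection should now be accepted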



Recreate the sharding rule, then insert the data again:
mongos> db.goods.drop();
true
mongos> sh.shardCollection('shop.goods',{goods_id:1});
{ "collectionsharded" : "shop.goods", "ok" : 1 }
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        1 : Success
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0001    1
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 0) 

After re-inserting the data, one shard ends up with 7 chunks and the other with 20, which is still rather uneven. Manual pre-splitting works better.
mongos> for(var i=1;i<=150000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf fsff'}) }; 
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        8 : Success
        13 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    7
                shard0001    20
            too many chunks to print, use verbose if you want to force print

mongos> 


Following that hint, after a few tries the detailed chunk layout prints as follows:
mongos> sh.status({verbose:1});
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    {  "_id" : "mcw01:30000",  "ping" : ISODate("2022-03-04T18:32:36.132Z"),  "up" : NumberLong(7555),  "waiting" : true,  "mongoVersion" : "3.2.8" }
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        8 : Success
        29 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    7
                shard0001    20
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : 2 } on : shard0000 Timestamp(8, 1) 
            { "goods_id" : 2 } -->> { "goods_id" : 12 } on : shard0001 Timestamp(7, 1) 
            { "goods_id" : 12 } -->> { "goods_id" : 5473 } on : shard0001 Timestamp(2, 2) 
            { "goods_id" : 5473 } -->> { "goods_id" : 12733 } on : shard0001 Timestamp(2, 3) 
            { "goods_id" : 12733 } -->> { "goods_id" : 18194 } on : shard0000 Timestamp(3, 2) 
            { "goods_id" : 18194 } -->> { "goods_id" : 23785 } on : shard0000 Timestamp(3, 3) 
            { "goods_id" : 23785 } -->> { "goods_id" : 29246 } on : shard0001 Timestamp(4, 2) 
            { "goods_id" : 29246 } -->> { "goods_id" : 34731 } on : shard0001 Timestamp(4, 3) 
            { "goods_id" : 34731 } -->> { "goods_id" : 40192 } on : shard0000 Timestamp(5, 2) 
            { "goods_id" : 40192 } -->> { "goods_id" : 45913 } on : shard0000 Timestamp(5, 3) 
            { "goods_id" : 45913 } -->> { "goods_id" : 51374 } on : shard0001 Timestamp(6, 2) 
            { "goods_id" : 51374 } -->> { "goods_id" : 57694 } on : shard0001 Timestamp(6, 3) 
            { "goods_id" : 57694 } -->> { "goods_id" : 63155 } on : shard0000 Timestamp(7, 2) 
            { "goods_id" : 63155 } -->> { "goods_id" : 69367 } on : shard0000 Timestamp(7, 3) 
            { "goods_id" : 69367 } -->> { "goods_id" : 74828 } on : shard0001 Timestamp(8, 2) 
            { "goods_id" : 74828 } -->> { "goods_id" : 81170 } on : shard0001 Timestamp(8, 3) 
            { "goods_id" : 81170 } -->> { "goods_id" : 86631 } on : shard0001 Timestamp(8, 5) 
            { "goods_id" : 86631 } -->> { "goods_id" : 93462 } on : shard0001 Timestamp(8, 6) 
            { "goods_id" : 93462 } -->> { "goods_id" : 98923 } on : shard0001 Timestamp(8, 8) 
            { "goods_id" : 98923 } -->> { "goods_id" : 106012 } on : shard0001 Timestamp(8, 9) 
            { "goods_id" : 106012 } -->> { "goods_id" : 111473 } on : shard0001 Timestamp(8, 11) 
            { "goods_id" : 111473 } -->> { "goods_id" : 118412 } on : shard0001 Timestamp(8, 12) 
            { "goods_id" : 118412 } -->> { "goods_id" : 123873 } on : shard0001 Timestamp(8, 14) 
            { "goods_id" : 123873 } -->> { "goods_id" : 130255 } on : shard0001 Timestamp(8, 15) 
            { "goods_id" : 130255 } -->> { "goods_id" : 135716 } on : shard0001 Timestamp(8, 17) 
            { "goods_id" : 135716 } -->> { "goods_id" : 142058 } on : shard0001 Timestamp(8, 18) 
            { "goods_id" : 142058 } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(8, 19) 

mongos> 


Checking again a day later, the chunks on the two shards have become fairly balanced (13 and 14). This shows the balancer does keep migrating chunks automatically in the background; it is just not instantaneous and takes time. (The balancer state can also be checked directly, as sketched after this output.)
[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        14 : Success
        65 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    13
                shard0001    14
            too many chunks to print, use verbose if you want to force print

mongos> 
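
The balancer's state can be queried with the helpers listed in the sh.help() output further below; a quick sketch:

sh.getBalancerState()     // true if the balancer is enabled
sh.isBalancerRunning()    // true if a chunk migration is in progress right now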

Manual pre-splitting

Sharding commands


for(var i=1;i<=40;i++){sh.splitAt('shop.user',{userid:i*1000})}
This splits the user collection in the shop database: a split point is created at every multiple of 1000 of userid, each split producing a new chunk.
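
If you do not want to wait for the balancer to spread the pre-split (still empty) chunks across the shards, sh.moveChunk (listed in the sh.help() output below) can place a chunk explicitly. A sketch using the shard names from this deployment:

sh.moveChunk('shop.user', { userid: 2500 }, 'shard0000')   // move the chunk containing userid 2500 onto shard0000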

mongos> sh.help();
    sh.addShard( host )                       server:port OR setname/server:port
    sh.enableSharding(dbname)                 enables sharding on the database dbname
    sh.shardCollection(fullName,key,unique)   shards the collection
    sh.splitFind(fullName,find)               splits the chunk that find is in at the median
    sh.splitAt(fullName,middle)               splits the chunk that middle is in at middle
    sh.moveChunk(fullName,find,to)            move the chunk where 'find' is to 'to' (name of shard)
    sh.setBalancerState( <bool on or not> )   turns the balancer on or off true=on, false=off
    sh.getBalancerState()                     return true if enabled
    sh.isBalancerRunning()                    return true if the balancer has work in progress on any mongos
    sh.disableBalancing(coll)                 disable balancing on one collection
    sh.enableBalancing(coll)                  re-enable balancing on one collection
    sh.addShardTag(shard,tag)                 adds the tag to the shard
    sh.removeShardTag(shard,tag)              removes the tag from the shard
    sh.addTagRange(fullName,min,max,tag)      tags the specified range of the given collection
    sh.removeTagRange(fullName,min,max,tag)   removes the tagged range of the given collection
    sh.status()                               prints a general overview of the cluster
mongos> 

Pre-splitting

Shard the user collection, using userid as the shard key.
Suppose we expect roughly 40 million users within a year: each of the two shards gets 20 million, split into 20 chunks, so about 1 million documents per chunk.

Here we simulate that with roughly 30-40 chunks of 1,000 documents each. Pre-splitting is done with the split helpers, as shown in the transcript below (a sketch for the production-scale numbers follows the transcript).
mongos> use shop
switched to db shop
mongos> sh.shardCollection('shop.user',{userid:1})
{ "collectionsharded" : "shop.user", "ok" : 1 }
mongos> # split the user collection in shop: a split point at every multiple of 1000 of userid, each forming a new chunk
mongos> for(var i=1;i<=40;i++){sh.splitAt('shop.user',{userid:i*1000})}
{ "ok" : 1 }
mongos> 
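
For the production-scale sizing estimated above (about 40 million users), the same pattern would pre-split at boundaries of 1,000,000; a sketch, assuming userid stays the shard key:

// pre-create 40 chunk boundaries of roughly 1,000,000 userids each (covers ~40 million users)
for(var i=1;i<=40;i++){ sh.splitAt('shop.user',{userid:i*1000000}) }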

Checking the result of pre-splitting

Look at the user collection: once it stabilizes at 20 and 21 chunks, chunks will no longer be shuttled back and forth because of data imbalance, which would otherwise drive the machines' I/O very high.
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        34 : Success
        65 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    13
                shard0001    14
            too many chunks to print, use verbose if you want to force print
        shop.user
            shard key: { "userid" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    20
                shard0001    21
            too many chunks to print, use verbose if you want to force print

mongos> 

Looking at the detailed chunk layout below, you can see exactly how the ranges were split. No data has been inserted yet, but since we already know roughly how much data to expect, the userid ranges, and the shard each range lives on, are laid out in advance. When data is inserted, each document should land in the chunk whose range covers it.
mongos> sh.status({verbose:1});
--- Sharding Status --- 
............. 
        shop.user
            shard key: { "userid" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    20
                shard0001    21
            { "userid" : { "$minKey" : 1 } } -->> { "userid" : 1000 } on : shard0000 Timestamp(2, 0) 
            { "userid" : 1000 } -->> { "userid" : 2000 } on : shard0000 Timestamp(3, 0) 
            { "userid" : 2000 } -->> { "userid" : 3000 } on : shard0000 Timestamp(4, 0) 
            { "userid" : 3000 } -->> { "userid" : 4000 } on : shard0000 Timestamp(5, 0) 
            { "userid" : 4000 } -->> { "userid" : 5000 } on : shard0000 Timestamp(6, 0) 
            { "userid" : 5000 } -->> { "userid" : 6000 } on : shard0000 Timestamp(7, 0) 
            { "userid" : 6000 } -->> { "userid" : 7000 } on : shard0000 Timestamp(8, 0) 
            { "userid" : 7000 } -->> { "userid" : 8000 } on : shard0000 Timestamp(9, 0) 
            { "userid" : 8000 } -->> { "userid" : 9000 } on : shard0000 Timestamp(10, 0) 
            { "userid" : 9000 } -->> { "userid" : 10000 } on : shard0000 Timestamp(11, 0) 
            { "userid" : 10000 } -->> { "userid" : 11000 } on : shard0000 Timestamp(12, 0) 
            { "userid" : 11000 } -->> { "userid" : 12000 } on : shard0000 Timestamp(13, 0) 
            { "userid" : 12000 } -->> { "userid" : 13000 } on : shard0000 Timestamp(14, 0) 
            { "userid" : 13000 } -->> { "userid" : 14000 } on : shard0000 Timestamp(15, 0) 
            { "userid" : 14000 } -->> { "userid" : 15000 } on : shard0000 Timestamp(16, 0) 
            { "userid" : 15000 } -->> { "userid" : 16000 } on : shard0000 Timestamp(17, 0) 
            { "userid" : 16000 } -->> { "userid" : 17000 } on : shard0000 Timestamp(18, 0) 
            { "userid" : 17000 } -->> { "userid" : 18000 } on : shard0000 Timestamp(19, 0) 
            { "userid" : 18000 } -->> { "userid" : 19000 } on : shard0000 Timestamp(20, 0) 
            { "userid" : 19000 } -->> { "userid" : 20000 } on : shard0000 Timestamp(21, 0) 
            { "userid" : 20000 } -->> { "userid" : 21000 } on : shard0001 Timestamp(21, 1) 
            { "userid" : 21000 } -->> { "userid" : 22000 } on : shard0001 Timestamp(1, 43) 
            { "userid" : 22000 } -->> { "userid" : 23000 } on : shard0001 Timestamp(1, 45) 
            { "userid" : 23000 } -->> { "userid" : 24000 } on : shard0001 Timestamp(1, 47) 
            { "userid" : 24000 } -->> { "userid" : 25000 } on : shard0001 Timestamp(1, 49) 
            { "userid" : 25000 } -->> { "userid" : 26000 } on : shard0001 Timestamp(1, 51) 
            { "userid" : 26000 } -->> { "userid" : 27000 } on : shard0001 Timestamp(1, 53) 
            { "userid" : 27000 } -->> { "userid" : 28000 } on : shard0001 Timestamp(1, 55) 
            { "userid" : 28000 } -->> { "userid" : 29000 } on : shard0001 Timestamp(1, 57) 
            { "userid" : 29000 } -->> { "userid" : 30000 } on : shard0001 Timestamp(1, 59) 
            { "userid" : 30000 } -->> { "userid" : 31000 } on : shard0001 Timestamp(1, 61) 
            { "userid" : 31000 } -->> { "userid" : 32000 } on : shard0001 Timestamp(1, 63) 
            { "userid" : 32000 } -->> { "userid" : 33000 } on : shard0001 Timestamp(1, 65) 
            { "userid" : 33000 } -->> { "userid" : 34000 } on : shard0001 Timestamp(1, 67) 
            { "userid" : 34000 } -->> { "userid" : 35000 } on : shard0001 Timestamp(1, 69) 
            { "userid" : 35000 } -->> { "userid" : 36000 } on : shard0001 Timestamp(1, 71) 
            { "userid" : 36000 } -->> { "userid" : 37000 } on : shard0001 Timestamp(1, 73) 
            { "userid" : 37000 } -->> { "userid" : 38000 } on : shard0001 Timestamp(1, 75) 
            { "userid" : 38000 } -->> { "userid" : 39000 } on : shard0001 Timestamp(1, 77) 
            { "userid" : 39000 } -->> { "userid" : 40000 } on : shard0001 Timestamp(1, 79) 
            { "userid" : 40000 } -->> { "userid" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 80) 

mongos> 

Insert data and check the effect of manual pre-splitting, which avoids data being copied back and forth between nodes.

When the chunks are getting close to full, deal with it ahead of time; otherwise, adding a new shard later triggers massive data movement, and the resulting I/O can be high enough to bring servers down.
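
One way to keep an eye on this is to count chunks per shard from the config database through mongos; a sketch, using the shop.user namespace from this article:

use config
// count how many chunks of shop.user sit on each shard
db.chunks.aggregate([
    { $match: { ns: "shop.user" } },
    { $group: { _id: "$shard", chunks: { $sum: 1 } } }
])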

Insert data through mongos:
mongos> for(var i=1;i<=40000;i++){db.user.insert({userid:i,name:'xiao ma guo he'})};
WriteResult({ "nInserted" : 1 })
mongos> 

On 27017 you can see userids 1 through 19999, 19,999 documents in total. From the chunk layout above, shard0000 holds the ranges up to userid 20000, and chunk ranges include their lower bound but exclude their upper bound, so 20000 itself belongs to shard0001. That is why shard0000 ends up with 19,999 documents, while shard0001 holds userids 20000 through 40000: 20,000 documents for 20000-39999 plus 40000 (which falls in the chunk reaching up to $maxKey), i.e. 20,001 documents. Pre-allocated chunks keep the layout relatively stable, avoiding the performance cost of data being shuffled between the two nodes whenever inserts would otherwise become unbalanced.
[mongod@mcw01 ~]$ mongo --port 27017
> use shop
switched to db shop
> db.user.find().count();
19999
> db.user.find().skip(19997);
{ "_id" : ObjectId("6222d91f69eed283bf054e96"), "userid" : 19998, "name" : "xiao ma guo he" }
{ "_id" : ObjectId("6222d91f69eed283bf054e97"), "userid" : 19999, "name" : "xiao ma guo he" }
> 


[mongod@mcw01 ~]$ mongo --port 27018
> use shop
switched to db shop
> db.user.find().count();
20001
> db.user.find().skip(19999);
{ "_id" : ObjectId("6222d93669eed283bf059cb7"), "userid" : 39999, "name" : "xiao ma guo he" }
{ "_id" : ObjectId("6222d93669eed283bf059cb8"), "userid" : 40000, "name" : "xiao ma guo he" }
> db.user.find().limit(1);
{ "_id" : ObjectId("6222d91f69eed283bf054e98"), "userid" : 20000, "name" : "xiao ma guo he" }
> 
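
To confirm which chunk a boundary value such as userid 20000 falls into, config.chunks can be queried through mongos; a sketch (remember that ranges are min-inclusive, max-exclusive):

use config
// find the shop.user chunk whose range contains userid 20000
db.chunks.find({ ns: "shop.user", "min.userid": { $lte: 20000 }, "max.userid": { $gt: 20000 } })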

 
