MongoDB range sharding: data is not evenly distributed

Posted by zetan·chen on 2024-07-30

【Background】

The environment is a MongoDB sharded cluster with three shards:

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("66a30ccca62de41d6b0241a4")
  }
  shards:
        {  "_id" : "mongo1",  "host" : "mongo1/mongo1:2700,mongo2:2700,mongo3:2700",  "state" : 1 }
        {  "_id" : "mongo2",  "host" : "mongo2/mongo1:2701,mongo2:2701,mongo3:2701",  "state" : 1 }
        {  "_id" : "mongo3",  "host" : "mongo3/mongo1:2702,mongo2:2702,mongo3:2702",  "state" : 1 }
  active mongoses:
        "4.2.8" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                690 : Success

【Range sharding test on a collection】

sh.enableSharding("test");
sh.shardCollection("test.messages", { createTime : 1} );  // Note: the time field is used as the range shard key for this test only

Insert data with a JS script:

cat shard_test_messages.js 
testdb = db.getSiblingDB('test');

var messages = ["Hello there", "Good Morning", "valar morghulis"];
var createTime = new Date();
for (var j = 0; j < 50000; j ++) {
  createTime.setFullYear(2024);
  createTime.setMonth(Math.floor(Math.random() * 12));
  createTime.setDate(Math.floor(Math.random() * 31) + 1);
  createTime.setHours(Math.floor(Math.random() * 24));
  createTime.setMinutes(Math.floor(Math.random() * 60));
  createTime.setSeconds(Math.floor(Math.random() * 60));
  testdb.messages.insertOne({
    userid: Math.floor(Math.random()*50000),
    message: messages[Math.floor(Math.random()*messages.length)],
    createTime: createTime
  })
}

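// Use testdb here: the script is run against the admin database, so a bare db would target admin.messages.
testdb.messages.createIndex({createTime: 1});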

mongo "localhost:27017/admin" /tmp/shard_test_messages.js -u admin -p 123456
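
The script above mutates a single Date object field by field, so a random day such as 31 can roll over into the next month. As an alternative sketch (not the author's script), a timestamp can be drawn from 2024 by picking a random offset between two epoch values. The helper names randomDateIn2024 and makeMessage are hypothetical.

```javascript
// Hypothetical helper: returns a Date drawn uniformly from the year 2024 (local time).
function randomDateIn2024() {
  var start = new Date(2024, 0, 1).getTime();  // 2024-01-01 00:00:00 local time
  var end = new Date(2025, 0, 1).getTime();    // first instant after 2024
  return new Date(start + Math.floor(Math.random() * (end - start)));
}

// Hypothetical helper: builds one document shaped like those in shard_test_messages.js.
function makeMessage(messages) {
  return {
    userid: Math.floor(Math.random() * 50000),
    message: messages[Math.floor(Math.random() * messages.length)],
    createTime: randomDateIn2024()
  };
}
```

In the mongo shell this could be used as testdb.messages.insertOne(makeMessage(messages)) inside the existing loop.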

【Check the collection's shard distribution】

mongos> db.messages.getShardDistribution()

Shard mongo1 at mongo1/mongo1:2700,mongo2:2700,mongo3:2700
 data : 2.01MiB docs : 25000 chunks : 1
 estimated data per chunk : 2.01MiB
 estimated docs per chunk : 25000

Totals
 data : 2.01MiB docs : 25000 chunks : 1
 Shard mongo1 contains 100% data, 100% docs in cluster, avg obj size on shard : 84B


mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("66a30ccca62de41d6b0241a4")
  }
  shards:
        {  "_id" : "mongo1",  "host" : "mongo1/mongo1:2700,mongo2:2700,mongo3:2700",  "state" : 1 }
        {  "_id" : "mongo2",  "host" : "mongo2/mongo1:2701,mongo2:2701,mongo3:2701",  "state" : 1 }
        {  "_id" : "mongo3",  "host" : "mongo3/mongo1:2702,mongo2:2702,mongo3:2702",  "state" : 1 }
  active mongoses:
        "4.2.8" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongo1  1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : mongo1 Timestamp(1, 0) 
        {  "_id" : "test",  "primary" : "mongo1",  "partitioned" : true,  "version" : {  "uuid" : UUID("35a9a3e5-3ba5-4315-977c-9c7176d891ae"),  "lastMod" : 1 } }
                test.messages
                        shard key: { "createTime" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongo1  1
                        { "createTime" : { "$minKey" : 1 } } -->> { "createTime" : { "$maxKey" : 1 } } on : mongo1 Timestamp(1, 0) 

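MongoDB's sharding autosplit is enabled by default and works together with the chunk size setting (chunksize), which defaults to 64 MB. For this installation the value was changed to 1 MB to make testing easier. In earlier versions the chunk size was set through a parameter in the mongos configuration file.

In the current and newer versions the setting is stored in the settings collection of the config database and is changed with a command. The command differs between versions, so take care: https://www.mongodb.com/zh-cn/docs/manual/tutorial/modify-chunk-size-in-sharded-cluster/#std-label-tutorial-modifying-range-size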
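
A rough back-of-the-envelope check shows why a single chunk is suspicious with this setting. The sketch below assumes the 1 MB chunk size mentioned above and takes the 2.01MiB and 84B figures from the getShardDistribution() output. The function names are illustrative, and the server chooses the real split points, so this is only an estimate.

```javascript
var MIB = 1024 * 1024;

// Smallest number of chunks needed if no chunk exceeds the configured size.
function minChunks(dataBytes, chunkSizeMB) {
  return Math.ceil(dataBytes / (chunkSizeMB * MIB));
}

// Roughly how many average-sized documents fit into one full chunk.
function docsPerFullChunk(avgObjBytes, chunkSizeMB) {
  return Math.floor((chunkSizeMB * MIB) / avgObjBytes);
}

var chunks = minChunks(2.01 * MIB, 1);  // 2.01MiB of data, 1 MB chunk size
var docs = docsPerFullChunk(84, 1);     // 84B average object size
```

With these inputs the estimate is at least 3 chunks and about 12483 documents per full chunk, whereas the cluster reports one chunk holding 25000 documents. That gap suggests the chunk size setting was not taking effect.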

For background on MongoDB sharding, see the Taobao database monthly report: https://www.bookstack.cn/read/aliyun-rds-core/5717773a4eef2615.md

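At this point all data in the range-sharded collection sits on the primary shard (mongo1). Given the conditions that should trigger chunk migration, I tested the balancer manually and got the following error: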

mongos> sh.startBalancer()
2024-07-30T10:39:49.588+0800 E  QUERY    [js] uncaught exception: Error: command failed: {
        "ok" : 0,
        "errmsg" : "Failed to refresh the chunk sizes settings :: caused by :: Expected field \"value\" to have numeric type, but found string",
        "code" : 14,
        "codeName" : "TypeMismatch",
        "operationTime" : Timestamp(1722307189, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1722307189, 1),
                "signature" : {
                        "hash" : BinData(0,"XinbmSBnIvaxzrWfcNLn8IsFI78="),
                        "keyId" : NumberLong("7395769083385348113")
                }
        }
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:583:17
assert.commandWorked@src/mongo/shell/assert.js:673:16
sh.startBalancer@src/mongo/shell/utils_sh.js:184:12
@(shell):1:1

The error points to a wrong value type. Checking the configuration confirms that the value is stored as a string:

mongos> configdb.settings.find()
{ "_id" : "chunksize", "value" : "1" }

This sharded cluster was installed with an Ansible script, and the way the value was passed in as a variable caused the bad setting.

The variable itself is defined as a number:

# The chunksize for shards in MB
mongos_chunk_size: 1

But the template wraps it in single quotes, which turns it into a string:

configdb = db.getSiblingDB('config');
configdb.settings.save( { _id:"chunksize", value: '{{ mongos_chunk_size }}' } )
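
One way to make an install script robust against this kind of quoting, sketched here under assumptions rather than taken from the original playbook, is to coerce and validate the rendered value before saving it. toChunkSizeSetting is a hypothetical helper. The 1 to 1024 MB range check follows the limits stated in the MongoDB chunk size documentation and should be verified for your version.

```javascript
// Hypothetical helper: turns a rendered template value into a settings document
// whose "value" field is a number rather than a string.
function toChunkSizeSetting(raw) {
  if (typeof raw !== "number" && typeof raw !== "string") {
    throw new Error("chunksize must be given as a number or a numeric string");
  }
  var sizeMB = Number(raw);  // '1' -> 1, 1 -> 1, 'abc' -> NaN
  // Assumed valid range: whole megabytes from 1 to 1024.
  if (!Number.isInteger(sizeMB) || sizeMB < 1 || sizeMB > 1024) {
    throw new Error("invalid chunksize (MB): " + raw);
  }
  return { _id: "chunksize", value: sizeMB };
}
```

In the template this could be called as configdb.settings.save(toChunkSizeSetting('{{ mongos_chunk_size }}')). Simply dropping the quotes around the variable is the smaller change; the helper only adds a guard against other bad inputs.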

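The quoted value in the template matches the error. After changing the setting to a number, range sharding distributes the data normally: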

mongos> configdb.settings.save( { _id:"chunksize", value: 1 } )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> configdb.settings.find()
{ "_id" : "chunksize", "value" : 1 }
mongos>


mongos> sh.startBalancer()
{
        "ok" : 1,
        "operationTime" : Timestamp(1722307355, 23),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1722307355, 23),
                "signature" : {
                        "hash" : BinData(0,"SfLbKefZpUbx4FCKVT1JeRvsrOY="),
                        "keyId" : NumberLong("7395769083385348113")
                }
        }
}
mongos>
mongos> sh.getBalancerState()
true
mongos> db.messages.getShardDistribution()

Shard mongo1 at mongo1/mongo1:2700,mongo2:2700,mongo3:2700
 data : 3.37MiB docs : 41863 chunks : 5
 estimated data per chunk : 692KiB
 estimated docs per chunk : 8372

Shard mongo2 at mongo2/mongo1:2701,mongo2:2701,mongo3:2701
 data : 3.34MiB docs : 41482 chunks : 5
 estimated data per chunk : 685KiB
 estimated docs per chunk : 8296

Shard mongo3 at mongo3/mongo1:2702,mongo2:2702,mongo3:2702
 data : 3.36MiB docs : 41655 chunks : 5
 estimated data per chunk : 688KiB
 estimated docs per chunk : 8331

Totals
 data : 10.09MiB docs : 125000 chunks : 15
 Shard mongo1 contains 33.49% data, 33.49% docs in cluster, avg obj size on shard : 84B
 Shard mongo2 contains 33.18% data, 33.18% docs in cluster, avg obj size on shard : 84B
 Shard mongo3 contains 33.32% data, 33.32% docs in cluster, avg obj size on shard : 84B
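
The percentages in this output can be cross-checked from the document counts. The sketch below is only an illustration with a hypothetical helper name. It truncates to two decimals, which appears to match the figures printed by getShardDistribution().

```javascript
// Hypothetical helper: share of documents per shard, truncated to two decimals.
function docShare(docsByShard) {
  var total = 0;
  var shard;
  for (shard in docsByShard) {
    total += docsByShard[shard];
  }
  var share = {};
  for (shard in docsByShard) {
    share[shard] = Math.floor(docsByShard[shard] / total * 10000) / 100;
  }
  return share;
}

// Document counts taken from the getShardDistribution() output.
var share = docShare({ mongo1: 41863, mongo2: 41482, mongo3: 41655 });
```

For these counts the result is 33.49, 33.18 and 33.32, in line with the reported split across the three shards.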
