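MongoDB Data Migration Experiment

Posted by jx_yu on 2015-04-07

Overview

The company's existing production MongoDB environments differ in:

1. Version (2.2.0, 2.2.2, 2.4.8, tokumx-2.0.0)

2. Architecture (standalone instance, standalone instance started with the master option, sharding + replica sets)

The goal is to migrate these mixed production environments into a single replica set without disrupting normal business operations. The experiments below simulate each version and architecture and migrate its data to a new replica set (the source environments are listed in the next section).

Difficulty: the sources are live production systems whose data can change at any moment. The data must be migrated to the new environment without affecting normal business use (with no downtime, or as little as possible) while keeping the data consistent and complete.

Source environments

The company's current MongoDB environments are as follows: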

Business   Host           Version        Environment              Category
--         --             2.2.2          --master                 2
--         --             2.2.0          standalone               1
--         --             2.2.2          sharding + replica set   4
--         --             2.4.8          --master                 3
--         --             tokumx-2.0.0   (not listed)             5
--         --             2.2.0          standalone               1
--         (not listed)   2.2.0          standalone               1

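# The environments above fall into 5 categories. Below, MongoDB instances with the same versions and setups are simulated for the migration experiments.

Target environment

# A 3-node replica set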

yougou:PRIMARY> rs.status()

{

        "set" : "yougou",

        "date" : ISODate("2015-04-01T17:20:06Z"),

        "myState" : 1,

        "members" : [

                {

                        "_id" : 0,

                        "name" : "192.168.211.250:27017",

                        "health" : 1,

                        "state" : 1,

                        "stateStr" : "PRIMARY",

                        "uptime" : 1061,

                        "optime" : Timestamp(1427908775, 1),

                        "optimeDate" : ISODate("2015-04-01T17:19:35Z"),

                        "self" : true

                },

                {

                        "_id" : 1,

                        "name" : "192.168.211.250:20000",

                        "health" : 1,

                        "state" : 2,

                        "stateStr" : "SECONDARY",

                        "uptime" : 34,

                        "optime" : Timestamp(1427908775, 1),

                        "optimeDate" : ISODate("2015-04-01T17:19:35Z"),

                        "lastHeartbeat" : ISODate("2015-04-01T17:20:04Z"),

                        "lastHeartbeatRecv" : ISODate("2015-04-01T17:20:05Z"),

                        "pingMs" : 0,

                        "syncingTo" : "192.168.211.250:27017"

                },

                {

                        "_id" : 2,

                        "name" : "192.168.211.250:30000",

                        "health" : 1,

                        "state" : 2,

                        "stateStr" : "SECONDARY",

                        "uptime" : 31,

                        "optime" : Timestamp(1427908775, 1),

                        "optimeDate" : ISODate("2015-04-01T17:19:35Z"),

                        "lastHeartbeat" : ISODate("2015-04-01T17:20:05Z"),

                        "lastHeartbeatRecv" : ISODate("2015-04-01T17:20:05Z"),

                        "pingMs" : 0,

                        "syncingTo" : "192.168.211.250:27017"

                }

        ],

        "ok" : 1

}
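Before pointing any sync at the target, it can help to confirm that the replica set is actually ready. The following is a minimal sketch, assuming the rs.status() document has already been fetched and converted to a Python dict; the function name replica_set_ready is hypothetical and not part of any MongoDB tool.

```python
def replica_set_ready(status):
    """Return True if the status shows exactly one PRIMARY and all members healthy.

    `status` is assumed to be a dict shaped like the rs.status() output above.
    """
    members = status.get("members", [])
    primaries = [m for m in members if m.get("stateStr") == "PRIMARY"]
    all_healthy = all(m.get("health") == 1 for m in members)
    return status.get("ok") == 1 and len(primaries) == 1 and all_healthy


# Trimmed copy of the rs.status() output shown above.
status = {
    "set": "yougou",
    "members": [
        {"name": "192.168.211.250:27017", "health": 1, "stateStr": "PRIMARY"},
        {"name": "192.168.211.250:20000", "health": 1, "stateStr": "SECONDARY"},
        {"name": "192.168.211.250:30000", "health": 1, "stateStr": "SECONDARY"},
    ],
    "ok": 1,
}
print(replica_set_ready(status))
```

The check is deliberately strict: a set with no PRIMARY, or with any unhealthy member, is reported as not ready.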

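Migration tool: mongosync

mongosync is a MongoDB data synchronization tool that currently supports most synchronization scenarios.

[Image: clip_image002]

Notes:

1. mongosync performs incremental sync by reading the mongo oplog, so real-time sync requires the oplog to be enabled on the source. The oplog only exists when mongod runs in master/slave replication or as part of a replica set. A standalone instance started without the --master option therefore has to be restarted with --master.

2. mongosync needs a username and password, so if the source has no admin user, create one with addUser.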
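The oplog requirement in note 1 can be turned into a quick pre-flight check. This is a minimal sketch, assuming the collection names of the source's `local` database have already been listed (for example with `show collections`); the helper names are hypothetical. It relies on the fact that the oplog is stored as `oplog.$main` in master/slave mode and as `oplog.rs` in a replica set.

```python
# Collection names MongoDB uses for the oplog inside the `local` database:
# `oplog.$main` for master/slave mode, `oplog.rs` for replica sets.
OPLOG_COLLECTIONS = {"oplog.$main", "oplog.rs"}


def has_oplog(local_collections):
    """Return True if an oplog collection is present in `local`."""
    return any(name in OPLOG_COLLECTIONS for name in local_collections)


def needs_master_restart(local_collections):
    """A source without an oplog must be restarted with --master before real-time sync."""
    return not has_oplog(local_collections)


print(needs_master_restart([]))               # standalone with an empty `local`
print(needs_master_restart(["oplog.$main"]))  # instance started with --master
```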

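Experiments

1. Migrating a standalone mongo-2.2.0 instance without the master option to the replica set

Non-real-time sync: direct migration

Source environment

/data/mongod/mongodb-2.2.0/bin/mongod --dbpath /data/mongodb/node1 --logpath /data/mongodb/logs/mongodb-2.2.0.log --logappend --fork --auth --port 22000 --directoryperdb

As the startup options above show, the master option is not specified.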

~]# /data/mongod/mongodb-2.2.0/bin/mongo --port 22000

MongoDB shell version: 2.2.0

connecting to: 127.0.0.1:22000/test

> show dbs

admin   (empty)

db220   0.203125GB

local   (empty)          # indicates there is no oplog

> use db220

switched to db db220

> show collections

system.indexes

t220

> db.t220.count()

888

Sync the data from the target side

~]# mongosync -h 192.168.211.217:22000 -u root -p yougou -d db220 --to 192.168.211.250:27017 -tu root -tp yougou --oplog

connected to: 192.168.211.217:22000

Thu Apr  2 01:57:07.632 [mongosync] 192.168.211.217:22000 connected ok

Thu Apr  2 01:57:07.634 [mongosync] admin auth ok

Thu Apr  2 01:57:07.635 [mongosync] 192.168.211.250:27017 connected ok

Thu Apr  2 01:57:07.636 [mongosync] admin auth ok

Thu Apr  2 01:57:07.639 [mongosync] lastOp OpTime:0,0 (Jan  1 08:00:00 0:0)

Thu Apr  2 01:57:07.639 [mongosync] clone all dbs ^_^

Thu Apr  2 01:57:07.641 [mongosync] DATABASE: db220      to     db220

Thu Apr  2 01:57:07.910 [mongosync] cloning db220.t220 -> db220.t220

Thu Apr  2 01:57:07.927 [mongosync]              888 objects

Thu Apr  2 01:57:07.928 [mongosync] cloning db220.system.users -> db220.system.users

Thu Apr  2 01:57:07.930 [mongosync]              0 objects

Thu Apr  2 01:57:07.930 [mongosync] cloning db220.system.indexes -> db220.system.indexes

Thu Apr  2 01:57:07.932 [mongosync]              1 objects

Thu Apr  2 01:57:07.937 [mongosync]              0 objects

Thu Apr  2 01:57:07.940 [mongosync] clone done,start to catch up

Thu Apr  2 01:57:07.941 [mongosync] No new ops,waiting...

Thu Apr  2 01:57:12.942 [mongosync] No new ops,waiting...

……….

Log in to the target to verify

yougou:PRIMARY> db.t220.count()

888                          # db220 from the source has been synced

# Modify data on the source (insert)

> for(var i=1;i<=888;i++) db.t220.save({"id":i,"a":123456789,"b":888888888,"c":100000000})

> db.t220.count()

1776

# Check on the target: the newly written data has not been synced

yougou:PRIMARY> db.t220.count()

888

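So this approach cannot sync in real time. To fix that, the oplog must be enabled on the source; the simplest way is to start the instance with the --master option.

Real-time data sync: --master

1. Restart the source mongo instance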

~]# netstat -ntpl|grep mongo|grep 22000

tcp        0      0 0.0.0.0:22000               0.0.0.0:*                   LISTEN      20696/mongod       

]# kill 20696

-- Specify the master option

]# /data/mongod/mongodb-2.2.0/bin/mongod --dbpath /data/mongodb/node1 --logpath /data/mongodb/logs/mongodb-2.2.0.log --logappend --fork --auth --port 22000 --directoryperdb --master

forked process: 5558

all output going to: /data/mongodb/logs/mongodb-2.2.0.log

child process started successfully, parent exiting

# Log in to the source: the oplog is now enabled

> use local

switched to db local

> show collections

oplog.$main

2. Run the sync command from the target side

~]# mongosync -h 192.168.211.217:22000 -u root -p yougou -d db220 --to 192.168.211.250:27017  -tu root -tp yougou --oplog

Thu Apr  2 02:12:39.672 [mongosync] 192.168.211.217:22000 connected ok

Thu Apr  2 02:12:39.675 [mongosync] admin auth ok

Thu Apr  2 02:12:39.675 [mongosync] 192.168.211.250:27017 connected ok

Thu Apr  2 02:12:39.677 [mongosync] admin auth ok

Thu Apr  2 02:12:39.681 [mongosync] lastOp OpTime:1427941057,1 (Apr  2 10:17:37 551ca6c1:1)

Thu Apr  2 02:12:39.681 [mongosync] DATABASE: db220      to     db220

Thu Apr  2 02:12:40.005 [mongosync] cloning db220.t220 -> db220.t220

Thu Apr  2 02:12:40.098 [mongosync]              1776 objects

Thu Apr  2 02:12:40.098 [mongosync] cloning db220.system.users -> db220.system.users

Thu Apr  2 02:12:40.102 [mongosync]              0 objects

Thu Apr  2 02:12:40.102 [mongosync] cloning db220.system.indexes -> db220.system.indexes

Thu Apr  2 02:12:40.106 [mongosync]              1 objects

Thu Apr  2 02:12:40.106 [mongosync] sync database: db220

Thu Apr  2 02:12:40.108 [mongosync] begin to apply oplog...

Thu Apr  2 02:12:40.108 [mongosync] db220:0 rows oplog to apply...

...

# Now add new data on the source

> for(var i=1;i<=8888;i++) db.t220.save({"id":i,"a":123456789,"b":888888888,"c":100000000})

> db.t220.count()

10664

# The sync log on the target shows the incremental sync taking place

Thu Apr  2 02:13:06.158 [mongosync] No new oplogs,the data is the newest ^_^.waiting.

Thu Apr  2 02:13:07.175 [mongosync]     Synced up to optime:ts: Timestamp 1427941089000|1

Thu Apr  2 02:13:12.650 [mongosync]              8888 ops

Thu Apr  2 02:13:12.650 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:16.403 [mongosync]              8888 ops

Thu Apr  2 02:13:16.403 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:21.415 [mongosync]              8888 ops

Thu Apr  2 02:13:21.415 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:25.420 [mongosync]              8888 ops

Thu Apr  2 02:13:25.420 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:30.432 [mongosync]              8888 ops

Thu Apr  2 02:13:30.432 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:34.435 [mongosync]              8888 ops

Thu Apr  2 02:13:34.435 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:39.447 [mongosync]              8888 ops

Thu Apr  2 02:13:39.448 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:44.297 [mongosync]              8888 ops

Thu Apr  2 02:13:44.297 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

Thu Apr  2 02:13:49.309 [mongosync]              8888 ops

Thu Apr  2 02:13:49.309 [mongosync] db220:8888 rows oplog are applied,             waiting for new data ^_^.The latest optime is ts: Timestamp 1427941089000|8888

# Meanwhile, logging in to the target shows the collection's document count increasing

yougou:PRIMARY> db.t220.count()

1776

yougou:PRIMARY> db.t220.count()

10664

Likewise, collections newly created on the source are synced in real time, so the two sides stay fully synchronized.
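Progress can also be followed from the mongosync log rather than by polling counts by hand. Below is a minimal sketch that extracts the database name and the number of applied oplog rows from log lines in the format shown above; the regular expression is inferred from this sample output only, so other mongosync versions may log differently, and the function name parse_applied is hypothetical.

```python
import re

# Matches lines such as:
#   "... [mongosync] db220:8888 rows oplog are applied, waiting for new data ..."
APPLIED_RE = re.compile(
    r"\[mongosync\]\s+(?P<db>[^\s:]+):(?P<rows>\d+) rows oplog are applied"
)


def parse_applied(line):
    """Return (database, applied_rows) for an 'oplog are applied' line, else None."""
    match = APPLIED_RE.search(line)
    if match is None:
        return None
    return match.group("db"), int(match.group("rows"))


sample = (
    "Thu Apr  2 02:13:12.650 [mongosync] db220:8888 rows oplog are applied,"
    "             waiting for new data ^_^.The latest optime is ts: "
    "Timestamp 1427941089000|8888"
)
print(parse_applied(sample))
```

Lines that only announce pending work, such as "db220:0 rows oplog to apply...", are intentionally ignored and yield None.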

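3. Main steps

1. Restart the standalone source mongodb instance with the --master option.

2. On the target side, run mongosync with -d xxx to select the database to sync; adding --oplog enables real-time sync.

3. Once the sync has caught up, switch the applications that use the source over to the target.

4. After the target has been confirmed to work correctly, the source can be decommissioned.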
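The mongosync invocation used in step 2 is the same for every source apart from host, database and credentials, so it can be generated. This is a minimal sketch that only assembles the argument list; it uses just the flags that appear in the commands in this article (-h, -u, -p, -d, --to, -tu, -tp, --oplog), and the function name mongosync_argv is hypothetical.

```python
def mongosync_argv(src, dst, db, src_user=None, src_password=None,
                   dst_user=None, dst_password=None, oplog=True):
    """Build the mongosync argument list for one source database."""
    argv = ["mongosync", "-h", src]
    if src_user is not None:
        argv += ["-u", src_user, "-p", src_password]
    argv += ["-d", db, "--to", dst]
    if dst_user is not None:
        argv += ["-tu", dst_user, "-tp", dst_password]
    if oplog:
        argv.append("--oplog")  # keep applying oplog entries after the initial clone
    return argv


argv = mongosync_argv(
    "192.168.211.217:22000", "192.168.211.250:27017", "db220",
    src_user="root", src_password="yougou",
    dst_user="root", dst_password="yougou",
)
print(" ".join(argv))
```

Returning a list rather than a string means the result can be handed to subprocess.run without shell quoting issues.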

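2. Migrating mongo-2.2.2 with the master option to the replica set

Run the real-time sync directly, as in the "Real-time data sync: --master" part of experiment 1 above.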

~]# mongosync -h 192.168.211.217:22200 -u root -p yougou -d db222 --to 192.168.211.250:27017  -tu root -tp yougou --oplog

3. Migrating mongo-2.4.8 with the master option to the replica set

Same as above.

~]# mongosync -h 192.168.211.217:24800 -u root -p yougou -d db248 --to 192.168.211.250:27017  -tu root -tp yougou --oplog

4. Syncing a sharded cluster (replica-set shards) to the replica set

~]# mongosync -h 192.168.211.217:20000 -d yougou --to 192.168.211.250:27017  -tu root -tp yougou --oplog

connected to: 192.168.211.217:20000

Thu Apr  2 05:29:34.491 [mongosync] 192.168.211.217:20000 connected ok

Thu Apr  2 05:29:34.492 [mongosync] 192.168.211.250:27017 connected ok

Thu Apr  2 05:29:34.549 [mongosync] admin auth ok

Thu Apr  2 05:29:34.554 [mongosync] stop balancer

Thu Apr  2 05:29:34.557 [mongosync] target shards are less than source shards.

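# mongosync does not yet support syncing a sharded cluster to a replica set, so a separate mongosync must be started against each shard, one per shard. Real-time sync also depends on oplog increments, so every source shard must be a replica set or a master/slave pair (in short, the oplog must be enabled).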

~]# /data/mongod/mongodb-2.2.2/bin/mongo --port 20000

MongoDB shell version: 2.2.2

connecting to: 127.0.0.1:20000/test

mongos> use config

switched to db config

mongos> db.shards.find()

{ "_id" : "shard1", "host" : "shard1/192.168.211.217:30000,192.168.211.217:40000" }

{ "_id" : "shard2", "host" : "shard2/192.168.211.217:50000,192.168.211.217:60000" }

# As shown above, there are 2 shards, shard1 and shard2, each a 2-node replica set

Simulate real-time writes on the source

mongos>for(var i=1;i<=888888;i++) db.yujx.save({"id":i,"a":123456789,"b":888888888,"c":100000000})

mongos>for(var i=1;i<=888888;i++) db.yujx.save({"id":0,"a":123456789,"b":888888888,"c":100000000})

Start the mongosync processes on the target side

# Sync shard shard1

~]# mongosync -h 192.168.211.217:30000 -d yougou --to 192.168.211.250:27017  -tu root -tp yougou --oplog

# Sync shard shard2

~]# mongosync -h 192.168.211.217:50000 -d yougou --to 192.168.211.250:27017  -tu root -tp yougou --oplog
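With more than a couple of shards, the per-shard source addresses can be derived from the config.shards documents instead of being typed by hand. The sketch below is an illustration with hypothetical function names: it splits the "setName/host1,host2" host string and picks the first listed member of each shard, which matches the two commands above but is not guaranteed to be that shard's primary.

```python
def parse_shard_host(host):
    """Split 'setName/host1,host2' into (set_name, [hosts])."""
    set_name, sep, members = host.partition("/")
    if not sep:
        # A shard backed by a single mongod has no 'setName/' prefix.
        return None, [host]
    return set_name, members.split(",")


def sync_sources(shards):
    """Map each shard _id to the first listed member of that shard."""
    return {doc["_id"]: parse_shard_host(doc["host"])[1][0] for doc in shards}


# The two documents returned by db.shards.find() above.
shards = [
    {"_id": "shard1", "host": "shard1/192.168.211.217:30000,192.168.211.217:40000"},
    {"_id": "shard2", "host": "shard2/192.168.211.217:50000,192.168.211.217:60000"},
]
for shard_id, source in sorted(sync_sources(shards).items()):
    print(shard_id, source)
```

Each resulting address is the -h argument for one mongosync process, one process per shard.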

Verify real-time sync

# Finally, the stats of the yujx collection on the source show: shard1 has 28929 rows, shard2 has 357412 rows, 386341 in total

mongos> db.yujx.stats()

{

        "sharded" : true,

        "ns" : "yougou.yujx",

        "count" : 386341,

        "numExtents" : 13,

        "size" : 26271188,

        "storageSize" : 40591360,

        "totalIndexSize" : 26326720,

        "indexSizes" : {

                "_id_" : 12558336,

                "id_1" : 13768384

        },

        "avgObjSize" : 68,

        "nindexes" : 2,

        "nchunks" : 3,

        "shards" : {

                "shard1" : {

                        "ns" : "yougou.yujx",

                        "count" : 28929,

                        "size" : 1967172,

                        "avgObjSize" : 68,

                        "storageSize" : 2793472,

                        "numExtents" : 5,

                        "nindexes" : 2,

                        "lastExtentSize" : 2097152,

                        "paddingFactor" : 1,

                        "systemFlags" : 1,

                        "userFlags" : 0,

                        "totalIndexSize" : 1766016,

                        "indexSizes" : {

                                "_id_" : 948416,

                                "id_1" : 817600

                        },

                        "ok" : 1

                },

                "shard2" : {

                        "ns" : "yougou.yujx",

                        "count" : 357412,

                        "size" : 24304016,

                        "avgObjSize" : 68,

                        "storageSize" : 37797888,

                        "numExtents" : 8,

                        "indexSizes" : {

                                "_id_" : 11609920,

                                "id_1" : 12950784

                        },

                        "ok" : 1

                }

        },

        "ok" : 1

}

mongos> db.yujx.count()

386341

# Check the yujx collection on the target

yougou:PRIMARY> db.yujx.count()

386341

yougou:PRIMARY> db.yujx.stats()

{

        "ns" : "yougou.yujx",

        "count" : 386341,

        "size" : 27816584,

        "avgObjSize" : 72.00008282838218,

        "storageSize" : 37797888,

        "systemFlags" : 1,

        "userFlags" : 0,

        "totalIndexSize" : 26948096,

        "indexSizes" : {

                "_id_" : 12541984,

                "id_1" : 14406112

        },

        "ok" : 1

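}

# As shown above, the source and target data are in sync

During the sync, the mongosync log shows the real-time incremental changes being applied; alternatively, running db.yujx.count() on the target at intervals shows the document count rising steadily. Real-time sync is therefore achieved.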
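The count comparison above is easy to script. The following is a minimal sketch, assuming the per-shard counts, the mongos total and the target count have already been read from the stats() and count() output; the function name check_counts is hypothetical. Matching counts are only a quick sanity check and do not prove that the documents themselves are identical.

```python
def check_counts(shard_counts, source_total, target_total):
    """Compare per-shard counts, the mongos total and the target count."""
    shard_sum = sum(shard_counts.values())
    return {
        "shards_match_source": shard_sum == source_total,
        "target_matches_source": target_total == source_total,
    }


# Figures taken from the stats() and count() output above.
result = check_counts(
    {"shard1": 28929, "shard2": 357412},
    source_total=386341,
    target_total=386341,
)
print(result)
```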

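5. Syncing tokumx-2.0.0 to the mongo replica set

The mongosync limitations state that it cannot be used to sync data from TokuMX to MongoDB.

[Image: clip_image004]

To guarantee data integrity in this case, a maintenance window is required in which the service is stopped or the application stops modifying data. The data is exported and imported into mongo during that window, and once the import finishes the application connects directly to the new mongo environment.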
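The article does not say which tools were used for the export and import. One common choice is mongodump followed by mongorestore, and the sketch below only assembles those two command lines. It is an assumption rather than the author's procedure: the TokuMX port 25000, the database name dbtoku and the dump directory are hypothetical placeholders, and the function names are made up for illustration.

```python
import posixpath


def dump_argv(host, port, db, out_dir, user=None, password=None):
    """Build a mongodump command that exports one database to out_dir/<db>/."""
    argv = ["mongodump", "--host", host, "--port", str(port)]
    if user is not None:
        argv += ["-u", user, "-p", password]
    return argv + ["--db", db, "--out", out_dir]


def restore_argv(host, port, db, dump_dir, user=None, password=None):
    """Build a mongorestore command that loads dump_dir/<db>/ into the target."""
    argv = ["mongorestore", "--host", host, "--port", str(port)]
    if user is not None:
        argv += ["-u", user, "-p", password]
    return argv + ["--db", db, posixpath.join(dump_dir, db)]


# Hypothetical source port, database name and dump directory.
print(" ".join(dump_argv("192.168.211.217", 25000, "dbtoku", "/data/dump")))
print(" ".join(restore_argv("192.168.211.250", 27017, "dbtoku", "/data/dump",
                            user="root", password="yougou")))
```

Both commands would be run inside the write-free window, with the application switched to the new environment only after the restore completes.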

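Results

The following source environments were migrated to the target (replica set) environment:

[Image: clip_image006]

Source: "ITPUB Blog", link: http://blog.itpub.net/27000195/viewspace-1521573/. Please credit the source when reposting; otherwise legal action may be pursued.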
