[Best Practices] Exporting and Importing MongoDB Data

Posted by likingzi on 2023-10-09

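First, the data scale of this 3-node MongoDB cluster along several dimensions:
1. dataSize: 1.9T
2. storageSize: 600G
3. Full backup with compression enabled: 186G, took 8h
4. Full backup without compression: 1.8T, took 4h27m
The export syntax is fairly simple and is not covered here. This article focuses on how the import was tuned and closes with best practices for importing.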

■ 2023-09-13T20:00 Test 1: import with 4 concurrent workers

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=4 --bypassDocumentValidation -d likingtest /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest >> 10.2.2.2.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/10.2.2.2.log
Output of the import above:
2023-09-13T21:59:55.452+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2023-09-13T21:59:55.452+0800    building a list of collections to restore from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest dir
2023-09-13T21:59:55.466+0800    reading metadata for likingtest.oprceConfiguration from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceConfiguration.metadata.json
2023-09-13T21:59:55.478+0800    reading metadata for likingtest.oprceDataObj from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceDataObj.metadata.json
2023-09-13T21:59:55.491+0800    reading metadata for likingtest.oprcesDataObjInit from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprcesDataObjInit.metadata.json
2023-09-13T21:59:55.503+0800    reading metadata for likingtest.role from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/role.metadata.json
2023-09-13T21:59:55.508+0800    reading metadata for likingtest.activityConfiguration from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/activityConfiguration.metadata.json
2023-09-13T21:59:55.511+0800    reading metadata for likingtest.history_task from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/history_task.metadata.json
2023-09-13T21:59:55.512+0800    reading metadata for likingtest.resOutRelDataSnapshot from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/resOutRelDataSnapshot.metadata.json
2023-09-13T21:59:55.520+0800    reading metadata for likingtest.snapshotResource from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/snapshotResource.metadata.json
2023-09-13T21:59:55.524+0800    reading metadata for likingtest.oprceDataObjDraft from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceDataObjDraft.metadata.json
2023-09-13T21:59:55.526+0800    reading metadata for likingtest.oprceDataObjInit from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/oprceDataObjInit.metadata.json
2023-09-13T21:59:55.761+0800    restoring likingtest.snapshotResource from /u01/nfs/xxxxx_mongodb/10.1.1.1/20230913/likingtest/snapshotResource.bson
...
2023-09-13T22:00:01.451+0800    [........................]      likingtest.oprceDataObj   408MB/1205GB    (0.0%)
...
2023-09-13T21:59:58.323+0800    finished restoring likingtest.oprceDataObjDraft (1559 documents, 0 failures)
2023-09-13T22:00:01.034+0800    finished restoring likingtest.resOutRelDataSnapshot (34426 documents, 0 failures)
2023-09-13T22:00:01.559+0800    finished restoring likingtest.history_task (3629 documents, 0 failures)
2023-09-13T22:00:02.086+0800    finished restoring likingtest.activityConfiguration (974 documents, 0 failures)
2023-09-13T22:00:02.293+0800    finished restoring likingtest.oprceConfiguration (162 documents, 0 failures)
2023-09-13T22:00:02.529+0800    finished restoring likingtest.oprcesDataObjInit (4 documents, 0 failures)
2023-09-13T22:00:02.857+0800    finished restoring likingtest.role (10 documents, 0 failures)
2023-09-13T22:00:29.153+0800    [########################]  likingtest.snapshotResource  2.04GB/2.04GB  (100.0%)
2023-09-13T22:00:29.155+0800    finished restoring likingtest.snapshotResource (50320 documents, 0 failures)
...
2023-09-14T00:18:58.451+0800    [############............]      likingtest.oprceDataObj  651GB/1205GB   (54.0%)
2023-09-14T00:18:59.857+0800    [########################]  likingtest.oprceDataObjInit  635GB/635GB  (100.0%)
2023-09-14T00:18:59.888+0800    finished restoring likingtest.oprceDataObjInit (43776648 documents, 0 failures)
...
2023-09-14T02:05:58.904+0800    [########################]      likingtest.oprceDataObj  1205GB/1205GB  (100.0%)
2023-09-14T02:05:58.937+0800    finished restoring likingtest.oprceDataObj (53311330 documents, 0 failures)
2023-09-14T02:05:58.945+0800    no indexes to restore for collection likingtest.activityConfiguration
2023-09-14T02:05:58.945+0800    no indexes to restore for collection likingtest.history_task
2023-09-14T02:05:58.945+0800    restoring indexes for collection likingtest.oprcesDataObjInit from metadata
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"flowId_1_activityConfiguration.activityNameEn_1", "ns":"likingtest.oprcesDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"flowId", Value:1}, primitive.E{Key:"activityConfiguration.activityNameEn", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1", "ns":"likingtest.oprcesDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"oprceInfo.oprceInstID", Value:1}, primitive.E{Key:"activityInfo.activityInstID", Value:1}, primitive.E{Key:"workitemInfo.workItemID", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800    no indexes to restore for collection likingtest.role
2023-09-14T02:05:58.976+0800    no indexes to restore for collection likingtest.snapshotResource
2023-09-14T02:05:58.976+0800    no indexes to restore for collection likingtest.oprceDataObjDraft
2023-09-14T02:05:58.976+0800    restoring indexes for collection likingtest.oprceDataObjInit from metadata
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1", "ns":"likingtest.oprceDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"oprceInfo.oprceInstID", Value:1}, primitive.E{Key:"activityInfo.activityInstID", Value:1}, primitive.E{Key:"workitemInfo.workItemID", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"flowNo_1", "ns":"likingtest.oprceDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"flowNo", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800    no indexes to restore for collection likingtest.oprceConfiguration
2023-09-14T02:05:58.976+0800    no indexes to restore for collection likingtest.resOutRelDataSnapshot
2023-09-14T02:05:58.976+0800    restoring indexes for collection likingtest.oprceDataObj from metadata
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"flowId_1_activityConfiguration.activityNameEn_1", "ns":"likingtest.oprceDataObj", "v":2}, Key:primitive.D{primitive.E{Key:"flowId", Value:1}, primitive.E{Key:"activityConfiguration.activityNameEn",Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"flowNo_1", "ns":"likingtest.oprceDataObj", "v":2}, Key:primitive.D{primitive.E{Key:"flowNo", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1", "ns":"likingtest.oprceDataObj", "v":2}, Key:primitive.D{primitive.E{Key:"oprceInfo.oprceInstID", Value:1}, primitive.E{Key:"activityInfo.activityInstID", Value:1}, primitive.E{Key:"workitemInfo.workItemID", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T02:05:58.976+0800    index: &idx.IndexDocument{Options:primitive.M{"name":"flowId_1_activityConfiguration.activityNameEn_1", "ns":"likingtest.oprceDataObjInit", "v":2}, Key:primitive.D{primitive.E{Key:"flowId", Value:1}, primitive.E{Key:"activityConfiguration.activityNameEn", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-14T03:45:47.152+0800    97179062 document(s) restored successfully. 0 document(s) failed to restore.

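Observations:
1. With the concurrency parameter --numInsertionWorkersPerCollection=4 and the validation parameter --bypassDocumentValidation set, the restore is much faster: the 1.2T collection oprceDataObj dropped from roughly 12h with the default settings to about 4h.
2. mongorestore loads all the data first and restores the indexes last. Restoring the indexes still takes a while, 1h40m in this run. [Note: it did not actually succeed; the indexes never took effect.]
3. In newer versions, the -d and -c parameters should be replaced with --nsInclude, --nsFrom= and --nsTo=.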
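The durations quoted above can be checked against the log timestamps. The following is a minimal sketch, assuming GNU date and that both timestamps share the same time zone; the helper names log_epoch and elapsed are made up for this illustration.

```shell
#!/usr/bin/env bash
# Turn a mongorestore log timestamp into epoch seconds.
log_epoch() {
  local ts="${1%%.*}"      # drop the ".452+0800" suffix
  ts="${ts/T/ }"           # "2023-09-13 21:59:55"
  date -u -d "$ts" +%s     # GNU date; -u sidesteps local DST rules
}

# Print the time between two log timestamps as XhYYmZZs.
elapsed() {
  local secs=$(( $(log_epoch "$2") - $(log_epoch "$1") ))
  printf '%dh%02dm%02ds\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# Data load: first log line to "finished restoring likingtest.oprceDataObj"
elapsed 2023-09-13T21:59:55.452+0800 2023-09-14T02:05:58.937+0800   # 4h06m03s
# Index phase: first index line to the final summary line
elapsed 2023-09-14T02:05:58.945+0800 2023-09-14T03:45:47.152+0800   # 1h39m49s
```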

■ 2023-09-14T10:40 Test 2: import with 8 concurrent workers

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=8 --bypassDocumentValidation -d likingtest /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914/likingtest >> 10.2.2.2.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914/10.2.2.2.log
---
2023-09-14T10:40:45.492+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
...
2023-09-14T10:40:48.493+0800    [........................]       likingtest.oprceDataObj   112MB/1208GB    (0.0%)
...
2023-09-14T12:57:34.859+0800    [########################]       likingtest.oprceDataObj  1208GB/1208GB  (100.0%)
2023-09-14T12:57:34.867+0800    finished restoring likingtest.oprceDataObj (53413481 documents, 0 failures)

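Observations:
1. With the concurrency parameter --numInsertionWorkersPerCollection=8 and the validation parameter --bypassDocumentValidation set, the restore is again much faster: the 1.2T collection oprceDataObj dropped from roughly 12h with the default settings to 2h17m.
2. This restore read the backup over NFS onto an 8-core VM. With 8 workers, CPU usage was about 40%, network receive rate about 300MB/s, and local disk write rate between 30 and 200MB/s, so network bandwidth is not the bottleneck. A better-provisioned host, and in particular disks with better IO, can be expected to shorten the restore further.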

■ 2023-09-14T16:10 Test 3: import with 12 concurrent workers

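[Note] Newer versions of mongorestore deprecate the -d and -c parameters. They still work but are not very flexible, so the new --nsInclude parameter should be used instead. It took several attempts to find the constraint on this parameter: the directory argument must be the root of the dump, that is, the parent of the database directory, not the database directory itself. In other words, something like dumpdir/20230914, not dumpdir/20230914/database. This is a major pitfall, so keep it in mind. The directory must also not contain any other files that mongorestore cannot recognize, otherwise it reports an error as well.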
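The directory rule above can be guarded with a small pre-flight check. This is only a sketch: check_dump_root is a hypothetical helper, not part of mongorestore, and it only confirms that the given path contains a subdirectory named after the database with at least one .bson (or .bson.gz) file in it.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: does ROOT look like the parent directory
# that --nsInclude="<db>.*" expects, i.e. ROOT/<db>/*.bson ?
check_dump_root() {
  local root="$1" db="$2"
  if [ ! -d "$root/$db" ]; then
    echo "no '$db' directory under $root (was the database directory itself passed?)" >&2
    return 1
  fi
  # compgen -G succeeds only when the glob matches at least one path
  if ! compgen -G "$root/$db/*.bson" > /dev/null &&
     ! compgen -G "$root/$db/*.bson.gz" > /dev/null; then
    echo "no .bson files under $root/$db" >&2
    return 1
  fi
  echo "ok: $root contains $db/"
}
```

It could be chained in front of the restore, for example check_dump_root /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914 likingtest && mongorestore ..., so that a wrong path fails immediately instead of after mongorestore has started.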

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=12 --bypassDocumentValidation --nsInclude="likingtest.*" /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914 > 20230914.10.2.2.2-3.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914.10.2.2.2-3.log
---
2023-09-14T16:10:19.245+0800    preparing collections to restore from
...
2023-09-14T18:18:18.996+0800    [########################]  likingtest.oprceDataObj  1208GB/1208GB  (100.0%)
2023-09-14T18:18:19.014+0800    finished restoring likingtest.oprceDataObj (53413481 documents, 0 failures)

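Observations:
1. Going from 8 to 12 workers brings no meaningful gain (about 2h08m versus 2h17m for oprceDataObj), so 6 to 8 workers is enough. This is similar to Oracle, where a parallelism of 6 for imports is more or less the best practice.
2. This restore read the backup over NFS onto an 8-core VM. With 12 workers, CPU usage was about 60%, network receive rate about 300MB/s, and local disk write rate between 30 and 500MB/s, so network bandwidth is not the bottleneck. A better-provisioned host, and in particular disks with better IO, can be expected to shorten the restore further.
3. On restoring indexes: mongorestore restores the data first and builds the indexes last, and building the indexes of the larger collections still takes a long time: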

      currentOpTime: '2023-09-14T20:23:59.435+08:00',
...
      command: {
        createIndexes: 'oprceDataObj',
        indexes: [
          {
            key: { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
            name: 'flowId_1_activityConfiguration.activityNameEn_1',
            ns: 'likingtest.oprceDataObj'
          },
          {
            key: { flowNo: 1 },
            name: 'flowNo_1',
            ns: 'likingtest.oprceDataObj'
          },
          {
            key: {
              'oprceInfo.oprceInstID': 1,
              'activityInfo.activityInstID': 1,
              'workitemInfo.workItemID': 1
            },
            name: 'oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1',
            ns: 'likingtest.oprceDataObj'
          }
        ],
.....
      currentOpTime: '2023-09-14T20:23:59.489+08:00',
...
      command: {
        createIndexes: 'oprcesDataObjInit',
        indexes: [
          {
            key: { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
            name: 'flowId_1_activityConfiguration.activityNameEn_1',
            ns: 'likingtest.oprcesDataObjInit'
          },
          {
            key: {
              'oprceInfo.oprceInstID': 1,
              'activityInfo.activityInstID': 1,
              'workitemInfo.workItemID': 1
            },
            name: 'oprceInfo.oprceInstID_1_activityInfo.activityInstID_1_workitemInfo.workItemID_1',
            ns: 'likingtest.oprcesDataObjInit'
          }
        ],
...... Checked again the next day; the index build had still not finished:
      currentOpTime: '2023-09-15T09:16:16.460+08:00',
      effectiveUsers: [ { user: 'admin', db: 'admin' } ],
      runBy: [ { user: '__system', db: 'local' } ],
      threaded: true,
      opid: 'shard1:11312917',
      lsid: {
        id: new UUID("e78379ff-9664-46b1-9e87-2bdd4abc5c5f"),
        uid: Binary.createFromBase64("O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=", 0)
      },
      secs_running: Long("53877"),
      microsecs_running: Long("53877330742"),
      op: 'command',
      ns: 'likingtest.oprcesDataObjInit',
      redacted: false,
      command: {
        createIndexes: 'oprcesDataObjInit',
...... A full 24h in, on the second day, the index build had still not finished:
      currentOpTime: '2023-09-15T18:55:16.877+08:00',
      effectiveUsers: [ { user: 'admin', db: 'admin' } ],
      runBy: [ { user: '__system', db: 'local' } ],
      threaded: true,
      opid: 'shard1:11312917',
      lsid: {
        id: new UUID("e78379ff-9664-46b1-9e87-2bdd4abc5c5f"),
        uid: Binary.createFromBase64("O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=", 0)
      },
      secs_running: Long("88617"),
      microsecs_running: Long("88617747875"),
      op: 'command',
      ns: 'likingtest.oprcesDataObjInit',
      redacted: false,
      command: {
        createIndexes: 'oprcesDataObjInit',
        indexes: [
          {
            key: { flowId: 1, 'activityConfiguration.activityNameEn': 1 },
            name: 'flowId_1_activityConfiguration.activityNameEn_1',
            ns: 'likingtest.oprcesDataObjInit'
          },

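As shown above, the data-loading efficiency of mongorestore is by now basically under control and acceptable, at least for a 1.2T collection. The final index build, however, is far too slow, and no suitable fix was found: the indexes would need to be built with multiple concurrent builds and then verified to have taken effect. In this run the index build never took effect.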
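To put the secs_running values from the two later snapshots above into perspective, they can be converted to hours and minutes. The helper name hms is made up for this sketch, and the commented mongosh call is an assumed way to list index builds; the original post does not show which command produced the output.

```shell
#!/usr/bin/env bash
# Convert a currentOp secs_running value into XhYYmZZs.
hms() {
  printf '%dh%02dm%02ds\n' $(($1 / 3600)) $(($1 % 3600 / 60)) $(($1 % 60))
}

hms 53877   # snapshot at 2023-09-15T09:16:16 -> 14h57m57s
hms 88617   # snapshot at 2023-09-15T18:55:16 -> 24h36m57s

# Assumed way to list only running index builds (not taken from the post):
#   mongosh --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin \
#     --eval 'db.currentOp({ "command.createIndexes": { $exists: true } })'
```

Counting 14h57m57s back from 09:16:16 gives roughly 18:18:19 on 2023-09-14, which matches the moment oprceDataObj finished restoring in the log. The index build therefore began as soon as the data load ended and was still running more than 24 hours later.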

■ 2023-09-15T19:02 Test 4: import with 10 concurrent workers, without restoring indexes

mongorestore --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin --numInsertionWorkersPerCollection=10 --bypassDocumentValidation --nsInclude="likingtest.*" --nsFrom="likingtest.*" --nsTo="likingtest.*" --noIndexRestore /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914 > 20230914.10.2.2.2-4.log 2>&1 &
tail -100f /u01/nfs/xxxxx_mongodb/10.1.1.1/20230914.10.2.2.2-4.log
2023-09-15T19:02:59.747+0800    preparing collections to restore from
...
2023-09-15T21:24:36.145+0800    [########################]  likingtest.oprceDataObj  1208GB/1208GB  (100.0%)
2023-09-15T21:24:36.161+0800    finished restoring likingtest.oprceDataObj (53413481 documents, 0 failures)
2023-09-15T21:24:36.165+0800    97367732 document(s) restored successfully. 0 document(s) failed to restore.

As shown above, this run took 2h22m.
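For a rough side-by-side of the four runs, the average read rate of oprceDataObj can be worked out from the sizes and timestamps in the logs. This is a back-of-the-envelope sketch: it treats the logged "GB" as 1024 MB (an assumption), uses the first and last log lines shown for each run as start and finish, and ignores that other collections were restored in parallel.

```shell
#!/usr/bin/env bash
# Average rate in MB/s (integer division).
# Usage: rate <size_gb> <hours> <minutes> <seconds>
rate() {
  echo $(( $1 * 1024 / ($2 * 3600 + $3 * 60 + $4) ))
}

echo "test 1,  4 workers: $(rate 1205 4 6 3) MB/s"     # 21:59:55 -> 02:05:58
echo "test 2,  8 workers: $(rate 1208 2 16 49) MB/s"   # 10:40:45 -> 12:57:34
echo "test 3, 12 workers: $(rate 1208 2 7 59) MB/s"    # 16:10:19 -> 18:18:18
echo "test 4, 10 workers: $(rate 1208 2 21 37) MB/s"   # 19:02:59 -> 21:24:36
```

Under these assumptions the rate roughly doubles from 4 to 8 workers (about 83 to 150 MB/s) and then stays in the 145 to 161 MB/s range for 8, 10 and 12 workers, in line with the observation that 6 to 8 workers is enough on this host.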

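Conclusions

1. When restoring, use multiple insertion workers for large collections: --numInsertionWorkersPerCollection=8
2. Do not restore indexes: --noIndexRestore
3. After the data is restored, build the indexes in the background: search this site for "MongoDB 重建索引" (MongoDB rebuild indexes)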
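As an illustration of step 3, the index definitions seen in the createIndexes output earlier could be replayed by hand after a --noIndexRestore run. This is a sketch only, not the procedure from the referenced article: index_script is a hypothetical helper that merely prints a mongosh statement, and the key patterns are copied from the oprceDataObj output above.

```shell
#!/usr/bin/env bash
# Print a mongosh statement that builds all listed indexes of one collection
# in a single createIndexes call. Key patterns are passed as JS object literals.
index_script() {
  local db="$1" coll="$2"
  shift 2
  local keys
  keys=$(IFS=,; echo "$*")   # join the key patterns with commas
  printf 'db.getSiblingDB("%s").getCollection("%s").createIndexes([%s]);\n' \
    "$db" "$coll" "$keys"
}

index_script likingtest oprceDataObj \
  '{flowId:1,"activityConfiguration.activityNameEn":1}' \
  '{flowNo:1}' \
  '{"oprceInfo.oprceInstID":1,"activityInfo.activityInstID":1,"workitemInfo.workItemID":1}'

# To execute instead of print (connection options as in the restore commands):
#   mongosh --port=20000 -uadmin -p'passwd' --authenticationDatabase=admin \
#     --eval "$(index_script likingtest oprceDataObj '{flowNo:1}')"
```

Printing the statement first makes it easy to review before running it. Once the build completes, db.getCollection("oprceDataObj").getIndexes() in mongosh should list the new indexes, which is one way to confirm they actually took effect.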
