MongoDB, a Distributed Document Database: Backup and Restore

Posted by 1874 on 2020-11-16

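  In the previous post we covered MongoDB access control, user creation, and role assignment; for a recap see https://www.cnblogs.com/qiuhom-1874/p/13974656.html. Today we look at backing up and restoring MongoDB.

  Why back up?

  A backup is one way of keeping a redundant copy of data so that, when something goes wrong, as little data as possible is lost. The replica sets we built earlier also keep redundant copies, but that redundancy only preserves the availability of the data service during non-human failures such as a system fault or a service crash. It cannot protect against human error, because a mistaken operation is replicated to every member. To keep data safe and minimize loss, the database must be backed up periodically.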

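  Common backup methods

  Note: the figure above outlines the backup strategies commonly used with MongoDB. One option is a logical backup, which exports the data held in the database into files; dedicated export and import tools are normally used to complete a backup and a restore. Another option is a physical backup, which in simple terms means archiving the database files themselves; to restore, the archived files are unpacked back into place. A faster variant of physical backup is a snapshot, which captures the data as it was at the moment the snapshot was taken; restoring means rolling back to that snapshot.

  Logical vs. physical backup in MongoDB

  Note: as the figure above shows, physical backup is generally faster than logical backup for both backup and restore. A logical backup reads the data out through the database interface and then writes it to files in the chosen format, whereas a physical backup only has to archive the data files directly. It does not read documents one by one and rewrite them elsewhere, and skipping that per-document read and write is what makes it faster. Restore works the same way: a physical restore also avoids the per-document read and write.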

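  MongoDB logical backup tools

  MongoDB has two pairs of logical backup tools. The first pair is mongodump/mongorestore. It writes data in BSON, a binary format that generally cannot be opened in a text editor and is hard for people to read, but that produces smaller files. Restoring data exported this way depends on the MongoDB version: the BSON produced by different versions differs slightly, so a restore across versions may fail. The second pair is mongoexport/mongoimport. It writes JSON, which can be opened and read in a text editor, at the cost of larger files than BSON; restoring it does not depend on the version. For a cross-version backup, check compatibility between the versions first: if they are compatible use mongodump/mongorestore, otherwise mongoexport/mongoimport is the safer choice. One caveat: JSON is readable and widely supported, but the export contains only the documents and not supporting information such as indexes and user accounts, so keep that in mind when using it.

  Backing up data with mongodump

  Insert data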

> use testdb
switched to db testdb
> for(i=1;i<=1000;i++) db.test.insert({id:i,name:"test"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show tables
test
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> db.test.count()
1000
> 

  Back up all databases

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -o ./node12_mongodb_full_backup
2020-11-15T21:47:45.439+0800    writing admin.system.users to node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T21:47:45.442+0800    done dumping admin.system.users (4 documents)
2020-11-15T21:47:45.443+0800    writing admin.system.version to node12_mongodb_full_backup/admin/system.version.bson
2020-11-15T21:47:45.447+0800    done dumping admin.system.version (2 documents)
2020-11-15T21:47:45.448+0800    writing testdb.test to node12_mongodb_full_backup/testdb/test.bson
2020-11-15T21:47:45.454+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# ls
node12_mongodb_full_backup
[root@node11 ~]# ll node12_mongodb_full_backup/
total 0
drwxr-xr-x 2 root root 128 Nov 15 21:47 admin
drwxr-xr-x 2 root root  49 Nov 15 21:47 testdb
[root@node11 ~]# tree node12_mongodb_full_backup/
node12_mongodb_full_backup/
├── admin
│   ├── system.users.bson
│   ├── system.users.metadata.json
│   ├── system.version.bson
│   └── system.version.metadata.json
└── testdb
    ├── test.bson
    └── test.metadata.json

2 directories, 6 files
[root@node11 ~]# 

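  Note: -u specifies the user, -p that user's password, -h the address of the database server, --authenticationDatabase the database against which the user name and password are verified, and -o the directory in which to store the backup files.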
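
  Since backups are meant to be taken periodically, it can help to give every run its own output directory. The sketch below is an illustration rather than part of the original walkthrough: dated_backup_dir is a hypothetical helper name, and the timestamp layout is an arbitrary choice.

```shell
# Hypothetical helper (not part of the MongoDB tools): print an output
# directory name with a timestamp suffix, e.g. ./node12_full_20201115_214745,
# so that successive backups do not overwrite each other.
dated_backup_dir() {
    prefix=$1
    printf '%s_%s\n' "$prefix" "$(date +%Y%m%d_%H%M%S)"
}

# Show the name that would be used for a run started right now.
dated_backup_dir ./node12_mongodb_full_backup
```

  The result can be passed straight to -o, for example: mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -o "$(dated_backup_dir ./node12_mongodb_full_backup)".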

  Back up only the testdb database

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -o ./node12_testdb
2020-11-15T21:53:36.523+0800    writing testdb.test to node12_testdb/testdb/test.bson
2020-11-15T21:53:36.526+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_testdb
./node12_testdb
└── testdb
    ├── test.bson
    └── test.metadata.json

1 directory, 2 files
[root@node11 ~]# 

  Note: -d specifies the name of the database to back up.

  Back up only the test collection in testdb

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test -o ./node12_testdb_test-collection
2020-11-15T21:55:48.217+0800    writing testdb.test to node12_testdb_test-collection/testdb/test.bson
2020-11-15T21:55:48.219+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_testdb_test-collection
./node12_testdb_test-collection
└── testdb
    ├── test.bson
    └── test.metadata.json

1 directory, 2 files
[root@node11 ~]# 

  Note: -c specifies the name of the collection to back up.

  Back up testdb with compression

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip -o ./node12_mongodb_testdb-gzip 
2020-11-15T22:00:52.268+0800    writing testdb.test to node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:00:52.273+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_mongodb_testdb-gzip
./node12_mongodb_testdb-gzip
└── testdb
    ├── test.bson.gz
    └── test.metadata.json.gz

1 directory, 2 files
[root@node11 ~]# 

  Note: compression only requires adding the --gzip option; the backup files are then written as compressed files with a .gz suffix.
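
  A compressed dump is only useful if its .gz files are intact. As a hedged sketch (verify_gzip_dump is a hypothetical helper, not a MongoDB tool), the standard gzip -t test can be run over every compressed file in a backup directory before relying on it:

```shell
# Hypothetical helper: test every .gz file under a dump directory with
# `gzip -t`, which checks the compressed stream without extracting it.
# Returns non-zero if any file fails the check.
verify_gzip_dump() {
    dump_dir=$1
    # With `-exec ... {} +`, find itself exits non-zero when gzip does.
    find "$dump_dir" -type f -name '*.gz' -exec gzip -t {} +
}
```

  For the backup above this could be called as verify_gzip_dump ./node12_mongodb_testdb-gzip && echo "compressed files look intact". It only verifies gzip integrity; it does not prove that the BSON inside is complete.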

  Back up the test collection in testdb with compression

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --gzip -o ./node12_mongodb_testdb-test-gzip 
2020-11-15T22:01:31.492+0800    writing testdb.test to node12_mongodb_testdb-test-gzip/testdb/test.bson.gz
2020-11-15T22:01:31.500+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_mongodb_testdb-test-gzip
./node12_mongodb_testdb-test-gzip
└── testdb
    ├── test.bson.gz
    └── test.metadata.json.gz

1 directory, 2 files
[root@node11 ~]# 

  Restoring data with mongorestore

  Drop testdb on node12

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Full restore of all databases

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --drop ./node12_mongodb_full_backup
2020-11-15T22:07:35.465+0800    preparing collections to restore from
2020-11-15T22:07:35.467+0800    reading metadata for testdb.test from node12_mongodb_full_backup/testdb/test.metadata.json
2020-11-15T22:07:35.475+0800    restoring testdb.test from node12_mongodb_full_backup/testdb/test.bson
2020-11-15T22:07:35.486+0800    no indexes to restore
2020-11-15T22:07:35.486+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:07:35.486+0800    restoring users from node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T22:07:35.528+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]#

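  Note: --drop means that if a collection to be restored already exists in the target database, it is dropped first and then restored, so that the restored data matches the backup.

  Verify: log in to 192.168.0.52:27017 and check whether testdb has been restored.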

[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin 
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("af96cb64-a2a4-4d59-b60a-86ccbbe77e3e") }
MongoDB server version: 4.4.1
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        https://docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
        https://community.mongodb.com
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a single database

  Drop the testdb database

> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Restore testdb with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --drop ./node12_testdb/testdb/
2020-11-15T22:29:03.718+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:29:03.718+0800    building a list of collections to restore from node12_testdb/testdb dir
2020-11-15T22:29:03.719+0800    reading metadata for testdb.test from node12_testdb/testdb/test.metadata.json
2020-11-15T22:29:03.736+0800    restoring testdb.test from node12_testdb/testdb/test.bson
2020-11-15T22:29:03.755+0800    no indexes to restore
2020-11-15T22:29:03.755+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:29:03.755+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin 
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("f5e73939-bb87-4d45-bf80-9ff1e7f6f15d") }
MongoDB server version: 4.4.1
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show tables
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a single collection

  Drop the test collection in testdb

> db
testdb
> show collections
test
> db.test.drop()
true
> show collections
> 

  Restore the test collection in testdb with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --drop ./node12_testdb_test-collection/testdb/test.bson 
2020-11-15T22:36:15.615+0800    checking for collection data in node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.616+0800    reading metadata for testdb.test from node12_testdb_test-collection/testdb/test.metadata.json
2020-11-15T22:36:15.625+0800    restoring testdb.test from node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.669+0800    no indexes to restore
2020-11-15T22:36:15.669+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:36:15.669+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("27d15d9e-3fdf-4efc-b871-1ec6716e51e3") }
MongoDB server version: 4.4.1
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a database from compressed files

  Drop the testdb database

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Restore the database from the compressed files with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip --drop ./node12_mongodb_testdb-gzip/testdb/
2020-11-15T22:39:55.313+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:39:55.313+0800    building a list of collections to restore from node12_mongodb_testdb-gzip/testdb dir
2020-11-15T22:39:55.314+0800    reading metadata for testdb.test from node12_mongodb_testdb-gzip/testdb/test.metadata.json.gz
2020-11-15T22:39:55.321+0800    restoring testdb.test from node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:39:55.332+0800    no indexes to restore
2020-11-15T22:39:55.332+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:39:55.332+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("73d98c33-f8f7-40e3-89bd-fda8c702e407") }
MongoDB server version: 4.4.1
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

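  Note: to restore a single database with mongorestore, use -d to name the database; to restore a single collection, add -c with the collection name; to restore from compressed files, add --gzip. In short, the options used for the backup should be matched by the corresponding options at restore time, and they are essentially the same as the options mongodump takes.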
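
  The restore logs above warn that --db and --collection are deprecated for this use and recommend --nsInclude instead. A minimal sketch of that form follows. restore_ns and the MONGORESTORE override variable are hypothetical names introduced here, and the assumption is that the path given is the top-level dump directory (the one containing the testdb/ subdirectory) rather than the database subdirectory itself.

```shell
# Binary to run; it can be overridden (for example with `echo`) to preview
# the command instead of executing it. The variable name is an assumption.
MONGORESTORE=${MONGORESTORE:-mongorestore}

# Hypothetical wrapper: restore only the namespaces matching a pattern.
# Usage: restore_ns <dump_root> <ns_pattern> [extra mongorestore options...]
restore_ns() {
    dump_root=$1
    ns_pattern=$2
    shift 2
    # --nsInclude selects namespaces such as 'testdb.*' or 'testdb.test';
    # --drop removes an existing collection before restoring it.
    "$MONGORESTORE" "$@" --nsInclude="$ns_pattern" --drop "$dump_root"
}
```

  Under these assumptions, restore_ns ./node12_testdb 'testdb.*' -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin would correspond to the single-database restore, and the pattern 'testdb.test' to the single-collection restore. For a compressed dump, --gzip would be added to the extra options.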

  Backing up data with mongoexport

  Create the peoples database and insert data into the peoples_info collection

> use peoples
switched to db peoples
> for(i=1;i<=10000;i++) db.peoples_info.insert({id:i,name:"peoples"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show dbs
admin    0.000GB
config   0.000GB
local    0.000GB
peoples  0.000GB
testdb   0.000GB
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
        "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1,
        "classes" : 1
}
> 

  Export the peoples_info collection of the peoples database with mongoexport

[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type json -o ./peoples-peopels_info.json
2020-11-15T22:54:18.287+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:54:18.370+0800    exported 10000 records
[root@node11 ~]# ll
total 1004
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[root@node11 ~]# head -n 1 peoples-peopels_info.json 
{"_id":{"$oid":"5fb13f35012870b3c8e3c895"},"id":1.0,"name":"peoples1","age":1.0,"classes":1.0}
[root@node11 ~]# 

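  Note: --type sets the format of the exported file; the default is json, and csv can also be specified. Also note that mongoexport requires both a database and a collection: it cannot export every collection of a database in one run, only one collection at a time.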
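
  Because mongoexport handles one collection per invocation, exporting a whole database means looping over its collections. The following is a sketch under assumptions: export_collections and the MONGOEXPORT override variable are hypothetical names, and the collection names are expected on standard input, one per line.

```shell
# Binary to run; overridable for previewing. The variable name is an assumption.
MONGOEXPORT=${MONGOEXPORT:-mongoexport}

# Hypothetical helper: read collection names from stdin and run one
# mongoexport per collection, writing <out_dir>/<db>-<collection>.json.
# Usage: <names on stdin> | export_collections <db> <out_dir> [extra options...]
export_collections() {
    db=$1
    out_dir=$2
    shift 2
    mkdir -p "$out_dir" || return 1
    while IFS= read -r coll; do
        [ -n "$coll" ] || continue    # skip empty lines
        # </dev/null keeps the tool from consuming the list of names.
        "$MONGOEXPORT" "$@" -d "$db" -c "$coll" --type json \
            -o "$out_dir/$db-$coll.json" </dev/null || return 1
    done
}
```

  The list of names could come from the mongo shell, for example: mongo --quiet 192.168.0.52:27017/peoples --eval 'db.getCollectionNames().forEach(function(c){ print(c) })' | export_collections peoples ./peoples_export -h 192.168.0.52:27017 (authentication options left out for brevity).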

  Export a CSV file

[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -o ./peoples-peopels_info.csv
2020-11-15T22:58:30.495+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:58:30.498+0800    Failed: CSV mode requires a field list
[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -f id,name,age -o ./peoples-peopels_info.csv  
2020-11-15T22:59:26.090+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:59:26.143+0800    exported 10000 records
[root@node11 ~]# head -n 1 ./peoples-peopels_info.csv
id,name,age
[root@node11 ~]# head  ./peoples-peopels_info.csv    
id,name,age
1,peoples1,1
2,peoples2,2
3,peoples3,3
4,peoples4,4
5,peoples5,5
6,peoples6,6
7,peoples7,7
8,peoples8,8
9,peoples9,9
[root@node11 ~]# 

  Note: when the format is csv, the fields to export must be listed with -f, separated by commas.

  Import the data into MongoDB on node11

  Import JSON data

[root@node11 ~]# systemctl start mongod.service 
[root@node11 ~]# ss -tnl
State      Recv-Q Send-Q         Local Address:Port                        Peer Address:Port              
LISTEN     0      128                        *:22                                     *:*                  
LISTEN     0      100                127.0.0.1:25                                     *:*                  
LISTEN     0      128                127.0.0.1:27017                                  *:*                  
LISTEN     0      128                       :::22                                    :::*                  
LISTEN     0      100                      ::1:25                                    :::*                  
[root@node11 ~]# ll
total 1200
-rw-r--r-- 1 root root  198621 Nov 15 22:59 peoples-peopels_info.csv
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[root@node11 ~]# mongoimport  -d testdb -c peoples_info --drop peoples-peopels_info.json 
2020-11-15T23:05:03.004+0800    connected to: mongodb://localhost/
2020-11-15T23:05:03.005+0800    dropping: testdb.peoples_info
2020-11-15T23:05:03.186+0800    10000 document(s) imported successfully. 0 document(s) failed to import.
[root@node11 ~]#

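  Note: the target database and collection names can be chosen freely at import time.

  Verify: check whether testdb on node11 now contains the peoples_info collection and whether it holds data.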

[root@node11 ~]# mongo
MongoDB shell version v4.4.1
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("4e3a00b0-8367-4b3a-9a77-e61d03bb1b3d") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T23:03:39.669+08:00: ***** SERVER RESTARTED *****
        2020-11-15T23:03:40.681+08:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
        2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
peoples_info
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
        "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1,
        "classes" : 1
}
> 

  Import the CSV data into the test1 collection of testdb on node12

[root@node11 ~]# mongoimport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test1 --type csv --headerline --file ./peoples-peopels_info.csv 
2020-11-15T23:11:42.595+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T23:11:42.692+0800    10000 document(s) imported successfully. 0 document(s) failed to import.
[root@node11 ~]#

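  Note: importing CSV data requires stating the type explicitly as csv; --headerline tells mongoimport to use the first line as field names instead of importing it as data, and --file names the CSV file to import.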
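
  With --headerline the first line supplies the field names and is not imported as a document, so the number of imported documents should equal the number of data rows in the file. A quick sanity check, sketched with a hypothetical helper csv_data_rows and assuming that no field contains an embedded newline:

```shell
# Hypothetical helper: count the data rows of a CSV file, skipping the
# header line and any empty lines. Assumes one record per line.
csv_data_rows() {
    awk 'NR > 1 && length($0) > 0 { n++ } END { print n + 0 }' "$1"
}
```

  Running csv_data_rows ./peoples-peopels_info.csv gives a number that can be compared with the document count reported by mongoimport.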

  Verify: log in to MongoDB on node12 and check whether testdb contains the test1 collection and whether it holds data.

[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("72a07318-ac04-46f9-a310-13b1241d2f77") }
MongoDB server version: 4.4.1
> show dbs
admin    0.000GB
config   0.000GB
local    0.000GB
peoples  0.000GB
testdb   0.000GB
> use testdb
switched to db testdb
> show collections
test
test1
> db.test1.count()
10000
> db.test1.findOne()
{
        "_id" : ObjectId("5fb1452ef09b563b65405f7c"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1
}
> 

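  Note: the documents in testdb.test1 have no classes field. That is because classes was not in the field list when the data was exported, so the imported data naturally lacks it. That covers the use and testing of mongodump/mongorestore and mongoexport/mongoimport.

  Full backup plus oplog: restoring MongoDB to the state at a given point in time

  When backing up with mongodump, the --oplog option records the changes made to MongoDB between the moment the dump starts and the moment it finishes. The oplog is the log of write operations on MongoDB collections, similar to the binlog in MySQL. Replaying it at restore time brings in the data that changed during the backup, so the restored data reflects everything that was present when the backup completed.

  Simulate inserting data while a backup is running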

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=1000000;i++) db.test3.insert({id:i,name:"test3-oplog"+i,commit:"test3"+i})

  

  Meanwhile, back up the data from another terminal

[root@node11 ~]# rm -rf *
[root@node11 ~]# ll
total 0
[root@node11 ~]#  mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplog -o ./alldatabase
2020-11-15T23:51:40.606+0800    writing admin.system.users to alldatabase/admin/system.users.bson
2020-11-15T23:51:40.606+0800    done dumping admin.system.users (4 documents)
2020-11-15T23:51:40.607+0800    writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-15T23:51:40.608+0800    done dumping admin.system.version (2 documents)
2020-11-15T23:51:40.609+0800    writing testdb.test1 to alldatabase/testdb/test1.bson
2020-11-15T23:51:40.611+0800    writing testdb.test3 to alldatabase/testdb/test3.bson
2020-11-15T23:51:40.612+0800    writing testdb.test to alldatabase/testdb/test.bson
2020-11-15T23:51:40.612+0800    writing peoples.peoples_info to alldatabase/peoples/peoples_info.bson
2020-11-15T23:51:40.696+0800    done dumping peoples.peoples_info (10000 documents)
2020-11-15T23:51:40.761+0800    done dumping testdb.test3 (54167 documents)
2020-11-15T23:51:40.803+0800    done dumping testdb.test (31571 documents)
2020-11-15T23:51:40.966+0800    done dumping testdb.test1 (79830 documents)
2020-11-15T23:51:40.972+0800    writing captured oplog to 
2020-11-15T23:51:40.980+0800            dumped 916 oplog entries
[root@node11 ~]# ll
total 0
drwxr-xr-x 5 root root 66 Nov 15 23:51 alldatabase
[root@node11 ~]# tree alldatabase/
alldatabase/
├── admin
│   ├── system.users.bson
│   ├── system.users.metadata.json
│   ├── system.version.bson
│   └── system.version.metadata.json
├── oplog.bson
├── peoples
│   ├── peoples_info.bson
│   └── peoples_info.metadata.json
└── testdb
    ├── test1.bson
    ├── test1.metadata.json
    ├── test3.bson
    ├── test3.metadata.json
    ├── test.bson
    └── test.metadata.json

3 directories, 13 files
[root@node11 ~]# 

  Note: the backup now contains an additional file, oplog.bson.

  Look at the first and last entries recorded in oplog.bson

[root@node11 ~]# ls
alldatabase
[root@node11 ~]# cd alldatabase/
[root@node11 alldatabase]# ls
admin  oplog.bson  peoples  testdb
[root@node11 alldatabase]# bsondump oplog.bson > /tmp/oplog.bson.tmp
2020-11-15T23:55:04.801+0800    916 objects found
[root@node11 alldatabase]# head -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a2ac"},"id":{"$numberDouble":"54101.0"},"name":"test3-oplog54101","commit":"test354101"},"ts":{"$timestamp":{"t":1605455500,"i":1880}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500608"}},"v":{"$numberLong":"2"}}
[root@node11 alldatabase]# tail -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a63f"},"id":{"$numberDouble":"55016.0"},"name":"test3-oplog55016","commit":"test355016"},"ts":{"$timestamp":{"t":1605455500,"i":2795}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500961"}},"v":{"$numberLong":"2"}}
[root@node11 alldatabase]# 

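  Note: the oplog contains the inserts for id 54101 through 55016, which shows that data kept changing from the start of the dump to its end, so the collection files we dumped capture an intermediate state. Note also that mongodump --oplog cannot be combined with a specific database: the oplog records operations for all databases rather than for a single one, so --oplog only takes effect when backing up all databases.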
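
  The window covered by oplog.bson can be read from the ts field of its first and last entries. As a sketch, the hypothetical helper oplog_ts below pulls seconds:ordinal pairs out of bsondump's JSON output with sed. It assumes the JSON layout shown above, where ts appears as {"$timestamp":{"t":...,"i":...}}.

```shell
# Hypothetical helper: read bsondump JSON lines on stdin and print each
# entry's oplog timestamp as <seconds>:<ordinal>.
oplog_ts() {
    sed -n 's/.*"ts":{"\$timestamp":{"t":\([0-9][0-9]*\),"i":\([0-9][0-9]*\)}}.*/\1:\2/p'
}

# Example with a shortened entry in the same layout as the dump above.
printf '%s\n' '{"op":"i","ns":"testdb.test3","ts":{"$timestamp":{"t":1605455500,"i":1880}},"v":{"$numberLong":"2"}}' | oplog_ts
# prints 1605455500:1880
```

  Against the real file, something like bsondump oplog.bson 2>/dev/null | oplog_ts | sed -n '1p;$p' would print the first and last timestamps, which for the two entries shown above are 1605455500:1880 and 1605455500:2795.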

  Drop the testdb database, then restore from the dump we just took

test_replset:PRIMARY> show dbs
admin    0.000GB
config   0.000GB
local    0.014GB
peoples  0.000GB
testdb   0.019GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> db.dropDatabase()
{
        "dropped" : "testdb",
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1605456134, 4),
                "signature" : {
                        "hash" : BinData(0,"cRAdXcUj5c48Q77rCJ1DeeF10u8="),
                        "keyId" : NumberLong("6895378399531892740")
                }
        },
        "operationTime" : Timestamp(1605456134, 4)
}
test_replset:PRIMARY> show dbs
admin    0.000GB
config   0.000GB
local    0.014GB
peoples  0.000GB
test_replset:PRIMARY> 

  Restore the data with mongorestore

[root@node11 ~]# ls
alldatabase
[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplogReplay --drop ./alldatabase/
2020-11-16T00:06:32.049+0800    preparing collections to restore from
2020-11-16T00:06:32.053+0800    reading metadata for testdb.test1 from alldatabase/testdb/test1.metadata.json
2020-11-16T00:06:32.060+0800    reading metadata for testdb.test3 from alldatabase/testdb/test3.metadata.json
2020-11-16T00:06:32.064+0800    reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T00:06:32.064+0800    restoring testdb.test1 from alldatabase/testdb/test1.bson
2020-11-16T00:06:32.074+0800    restoring testdb.test3 from alldatabase/testdb/test3.bson
2020-11-16T00:06:32.093+0800    restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T00:06:32.098+0800    reading metadata for peoples.peoples_info from alldatabase/peoples/peoples_info.metadata.json
2020-11-16T00:06:32.110+0800    restoring peoples.peoples_info from alldatabase/peoples/peoples_info.bson
2020-11-16T00:06:32.333+0800    no indexes to restore
2020-11-16T00:06:32.333+0800    finished restoring peoples.peoples_info (10000 documents, 0 failures)
2020-11-16T00:06:32.766+0800    no indexes to restore
2020-11-16T00:06:32.766+0800    finished restoring testdb.test (31571 documents, 0 failures)
2020-11-16T00:06:33.023+0800    no indexes to restore
2020-11-16T00:06:33.023+0800    finished restoring testdb.test3 (54167 documents, 0 failures)
2020-11-16T00:06:33.370+0800    no indexes to restore
2020-11-16T00:06:33.370+0800    finished restoring testdb.test1 (79830 documents, 0 failures)
2020-11-16T00:06:33.370+0800    restoring users from alldatabase/admin/system.users.bson
2020-11-16T00:06:33.416+0800    replaying oplog
2020-11-16T00:06:33.850+0800    applied 916 oplog entries
2020-11-16T00:06:33.850+0800    175568 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# 

  Note: the restore needs --oplogReplay to replay the contents of oplog.bson. The restore log shows 916 oplog entries applied, meaning 916 write operations were recorded between the start and the end of the dump.
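
  --oplogReplay expects the oplog.bson file at the top level of the dump directory, which is where mongodump --oplog wrote it above. A defensive wrapper, sketched here with hypothetical names (restore_with_oplog and the MONGORESTORE override variable), can check for the file before starting the restore:

```shell
# Binary to run; overridable for previewing. The variable name is an assumption.
MONGORESTORE=${MONGORESTORE:-mongorestore}

# Hypothetical wrapper: refuse to start unless the dump contains oplog.bson,
# then restore with oplog replay.
# Usage: restore_with_oplog <dump_root> [extra mongorestore options...]
restore_with_oplog() {
    dump_root=$1
    shift
    if [ ! -f "$dump_root/oplog.bson" ]; then
        echo "no oplog.bson in $dump_root; was the dump taken with --oplog?" >&2
        return 1
    fi
    "$MONGORESTORE" "$@" --oplogReplay --drop "$dump_root"
}
```

  For the dump above the call would look like restore_with_oplog ./alldatabase -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin.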

  Verify: connect to the database and check how many documents were restored into testdb.test3.

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test1
test3
test_replset:PRIMARY> db.test3.count()
55016
test_replset:PRIMARY> 

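  Note: test3 now holds 55016 documents, which matches the id of the last entry in oplog.bson. The dump itself contained 54167 documents, and the oplog entries for id 54101 through 55016 (55016 - 54101 + 1 = 916 entries) brought the collection up to 55016.

  Backing up oplog.rs to restore to a specific point in time

  To make the effect easier to see, I emptied the database again and turned authentication off.

  Insert data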

test_replset:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=100000;i++) db.test.insert({id:(i+10000),name:"test-oplog"+i,commit:"test"+i})

  Back up the data at the same time, this time without the --oplog option

[root@node11 ~]# ll
total 0
[root@node11 ~]# mongodump -h node12:27017  -o ./alldatabase
2020-11-16T09:38:00.921+0800	writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-16T09:38:00.923+0800	done dumping admin.system.version (1 document)
2020-11-16T09:38:00.924+0800	writing testdb.test to alldatabase/testdb/test.bson
2020-11-16T09:38:00.960+0800	done dumping testdb.test (16377 documents)
[root@node11 ~]# ll 
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[root@node11 ~]# tree ./alldatabase
./alldatabase
├── admin
│   ├── system.version.bson
│   └── system.version.metadata.json
└── testdb
    ├── test.bson
    └── test.metadata.json

2 directories, 4 files
[root@node11 ~]# 

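  Note: we are backing up while the insert loop is still running. The backup log above shows that only 16377 documents of testdb.test were dumped, which is clearly not the whole collection, so this backup is partial. Once the inserts finish, testdb.test should contain 100000 documents.

  Verify: does testdb.test contain 100000 documents?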

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY> 

  Simulate a mistake by deleting every document in testdb.test

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.remove({})
WriteResult({ "nRemoved" : 100000 })
test_replset:PRIMARY> 

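  Note: we have now accidentally deleted all the data in testdb.test, and the earlier backup can only bring back part of it. What can we do? We can dump the oplog.rs collection, which is where the oplog is stored; it lives in the local database.

  Back up the oplog.rs collection in the local database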

[root@node11 ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[root@node11 ~]# mongodump -h node12:27017 -d local -c oplog.rs -o ./oplog-rs
2020-11-16T09:43:38.594+0800	writing local.oplog.rs to oplog-rs/local/oplog.rs.bson
2020-11-16T09:43:38.932+0800	done dumping local.oplog.rs (200039 documents)
[root@node11 ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
drwxr-xr-x 3 root root 19 Nov 16 09:43 oplog-rs
[root@node11 ~]# tree ./oplog-rs
./oplog-rs
└── local
    ├── oplog.rs.bson
    └── oplog.rs.metadata.json

1 directory, 2 files
[root@node11 ~]# 

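  Note: the oplog is stored in the oplog.rs collection of the local database, and the command above backs up all of it. We now have an oplog, but we cannot replay it as it is: doing so would replay the accidental deletes as well, which defeats the purpose. We first need to find the point in time of the mistake and then restore up to it.

  Find the time of the accidental delete in the oplog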

[root@node11 ~]# bsondump oplog-rs/local/oplog.rs.bson |egrep "\"op\":\"d\""|head -n 3
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa146"}},"ts":{"$timestamp":{"t":1605490915,"i":1}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915399"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa147"}},"ts":{"$timestamp":{"t":1605490915,"i":2}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa148"}},"ts":{"$timestamp":{"t":1605490915,"i":3}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
2020-11-16T09:46:20.363+0800	100074 objects found
2020-11-16T09:46:20.363+0800	write /dev/stdout: broken pipe
[root@node11 ~]# 

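  Note: to get back the data as it was before the first delete, take the $timestamp field of the first entry, {"t":1605490915,"i":1}; this is the timestamp of the first delete operation.

  Copy oplog.rs.bson into the backup directory as oplog.bson, reproducing the layout of a backup taken with the --oplog option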
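
  Picking the timestamp out of the bsondump output by eye works for a one-off, but the same lookup can be scripted. The following is a minimal Python sketch, not part of the original walkthrough: the function names are hypothetical, the first sample entry (the insert) is made up for illustration, and the UTC+8 offset is assumed from the +0800 stamps in the logs. It scans bsondump-style JSON lines for the first delete on a namespace and formats the result as seconds:ordinal, the form passed to --oplogLimit.

```python
import json
from datetime import datetime, timedelta, timezone


def first_delete_timestamp(lines, namespace):
    """Return (t, i) of the first delete entry for `namespace`, or None."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON document
        entry = json.loads(line)
        if entry.get("op") == "d" and entry.get("ns") == namespace:
            ts = entry["ts"]["$timestamp"]
            return ts["t"], ts["i"]
    return None


def oplog_limit_arg(ts):
    """Format a (t, i) pair as "<seconds>:<ordinal>"."""
    return f"{ts[0]}:{ts[1]}"


def local_time(ts, utc_offset_hours=8):
    """Convert the seconds part of the timestamp to wall-clock time."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(ts[0], tz).strftime("%Y-%m-%d %H:%M:%S")


# Shortened entries in the shape bsondump prints. The first one (an insert)
# is hypothetical; the two deletes are trimmed copies of the output above.
sample = [
    '{"op":"i","ns":"testdb.test","o":{"id":110000},"ts":{"$timestamp":{"t":1605490800,"i":5}}}',
    '{"op":"d","ns":"testdb.test","o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa146"}},"ts":{"$timestamp":{"t":1605490915,"i":1}}}',
    '{"op":"d","ns":"testdb.test","o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa147"}},"ts":{"$timestamp":{"t":1605490915,"i":2}}}',
]

ts = first_delete_timestamp(sample, "testdb.test")
print(oplog_limit_arg(ts))  # 1605490915:1
print(local_time(ts))       # 2020-11-16 09:41:55
```

  The converted time, 09:41:55, falls between the data backup at 09:38 and the oplog backup at 09:43, which is a quick sanity check that the right entry was picked.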

[root@node11 ~]# cp ./oplog-rs/local/oplog.rs.bson ./alldatabase/oplog.bson
[root@node11 ~]# 

  Then restore with mongorestore, replaying everything up to the point just before the first delete

[root@node11 ~]# mongorestore -h node12:27017 --oplogReplay  --oplogLimit "1605490915:1" --drop ./alldatabase/
2020-11-16T09:51:19.658+0800	preparing collections to restore from
2020-11-16T09:51:19.668+0800	reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T09:51:19.693+0800	restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T09:51:19.983+0800	no indexes to restore
2020-11-16T09:51:19.983+0800	finished restoring testdb.test (16377 documents, 0 failures)
2020-11-16T09:51:19.983+0800	replaying oplog
2020-11-16T09:51:22.657+0800	oplog  537KB
2020-11-16T09:51:25.657+0800	oplog  1.12MB
2020-11-16T09:51:28.657+0800	oplog  1.72MB
2020-11-16T09:51:31.657+0800	oplog  2.32MB
2020-11-16T09:51:34.657+0800	oplog  2.92MB
2020-11-16T09:51:37.657+0800	oplog  3.51MB
2020-11-16T09:51:40.657+0800	oplog  4.11MB
2020-11-16T09:51:43.657+0800	oplog  4.71MB
2020-11-16T09:51:46.657+0800	oplog  5.30MB
2020-11-16T09:51:49.657+0800	oplog  5.90MB
2020-11-16T09:51:52.657+0800	oplog  6.46MB
2020-11-16T09:51:55.657+0800	oplog  7.04MB
2020-11-16T09:51:58.657+0800	oplog  7.61MB
2020-11-16T09:52:01.657+0800	oplog  8.20MB
2020-11-16T09:52:04.657+0800	oplog  8.77MB
2020-11-16T09:52:07.657+0800	oplog  9.36MB
2020-11-16T09:52:10.657+0800	oplog  9.96MB
2020-11-16T09:52:13.657+0800	oplog  10.6MB
2020-11-16T09:52:16.656+0800	oplog  11.2MB
2020-11-16T09:52:19.657+0800	oplog  11.8MB
2020-11-16T09:52:22.657+0800	oplog  12.4MB
2020-11-16T09:52:25.657+0800	oplog  13.0MB
2020-11-16T09:52:28.657+0800	oplog  13.6MB
2020-11-16T09:52:31.657+0800	oplog  14.2MB
2020-11-16T09:52:34.657+0800	oplog  14.8MB
2020-11-16T09:52:37.657+0800	oplog  15.4MB
2020-11-16T09:52:40.657+0800	oplog  16.0MB
2020-11-16T09:52:43.657+0800	oplog  16.6MB
2020-11-16T09:52:46.657+0800	oplog  17.2MB
2020-11-16T09:52:49.657+0800	oplog  17.8MB
2020-11-16T09:52:52.433+0800	skipping applying the config.system.sessions namespace in applyOps
2020-11-16T09:52:52.433+0800	applied 100008 oplog entries
2020-11-16T09:52:52.433+0800	oplog  18.4MB
2020-11-16T09:52:52.433+0800	16377 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# 

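  Note: the restore log above shows that 100008 oplog entries were applied and that the 16377 documents from the backup were restored successfully.

  Verify: has testdb.test been restored, and how many documents does it hold?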
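
  If this restore has to be repeated or scripted, the command line can be assembled in code. Below is a small Python sketch under stated assumptions: build_restore_command is a hypothetical helper, and it uses only the mongorestore options that appear in the command above (-h, --oplogReplay, --oplogLimit, --drop).

```python
import shlex


def build_restore_command(host, dump_dir, oplog_limit=None, drop=True):
    """Build the argv list for a mongorestore point-in-time restore."""
    cmd = ["mongorestore", "-h", host, "--oplogReplay"]
    if oplog_limit is not None:
        # Oplog entries at or after this "<seconds>:<ordinal>" are not applied.
        cmd += ["--oplogLimit", oplog_limit]
    if drop:
        # Drop each collection before restoring it from the dump.
        cmd.append("--drop")
    cmd.append(dump_dir)
    return cmd


cmd = build_restore_command("node12:27017", "./alldatabase/", "1605490915:1")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

  Building a list rather than a single string keeps each argument separate, so the value can be handed to subprocess.run without shell quoting problems.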

test_replset:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.010GB
testdb  0.004GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY> 

  Note: testdb.test is back to 100000 documents.

  That wraps up this hands-on look at MongoDB backup and restore.
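
  Why does a partial dump of 16377 documents plus the oplog give back all 100000? The toy Python model below illustrates the reasoning. It is a simplified sketch and not how mongorestore is implemented: the timestamps are made up, documents are reduced to their _id, and the real oplog also held a few unrelated entries (hence 100008 applied in the log rather than 100000). The key points are that replaying an insert for a document already in the dump is harmless, and that replay stops before the first entry at or past the limit.

```python
def restore(dump, oplog, limit=None):
    """Replay `oplog` on top of `dump`, stopping before the first entry at or past `limit`."""
    data = dict(dump)  # _id -> document
    applied = 0
    for ts, op, _id in oplog:
        if limit is not None and ts >= limit:
            break
        if op == "i":
            data[_id] = {"_id": _id}  # re-inserting an existing _id changes nothing
        elif op == "d":
            data.pop(_id, None)
        applied += 1
    return data, applied


TOTAL, DUMPED = 100000, 16377
# The oplog holds every insert, followed by the accidental deletes.
# Timestamps are (seconds, ordinal) tuples with made-up values.
inserts = [((1000, n), "i", n) for n in range(1, TOTAL + 1)]
deletes = [((2000, n), "d", n) for n in range(1, TOTAL + 1)]
oplog = inserts + deletes
# The dump was taken mid-load, so it only holds the first DUMPED documents.
dump = {n: {"_id": n} for n in range(1, DUMPED + 1)}

restored, applied = restore(dump, oplog, limit=(2000, 1))
print(len(restored), applied)  # 100000 100000

everything, _ = restore(dump, oplog)  # no limit: the deletes are replayed too
print(len(everything))  # 0
```

  The second call shows what the earlier note warned about: replaying the whole oplog without a limit repeats the mistake and leaves the collection empty.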
