MongoDB Data File Corruption: The Lifesaving repair Option and Its Fatal Risk

Posted by 清風艾艾 on 2020-12-31

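Recently, a customer running a single-instance MongoDB database with no backups suffered data file corruption after a power failure. Because the business had to come back quickly and the data was not sensitive, the customer asked for the fastest possible recovery, so we ran mongod with its automatic repair option (--repair). The good news is that service was restored quickly. The bad news is that the data stored in the affected data files was lost, which the customer had accepted. This post summarizes the process and explains why the data is gone after repair.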

1. Query the data while MongoDB is healthy

> show dbs;
admin       0.000GB
config      0.000GB
dns_testdb  0.009GB
local       0.000GB
> use dns_testdb
switched to db dns_testdb
> db.test_collection.find();
{ "_id" : ObjectId("5fedd03d9d2569ee04ab62e1"), "name" : "elephant", "user_id" : 0, "boolean" : false, "added_at" : ISODate("2020-12-31T13:21:01.226Z"), "number" : 5129 }
{ "_id" : ObjectId("5fedd03d9d2569ee04ab62e2"), "name" : "dog", "user_id" : 1, "boolean" : false, "added_at" : ISODate("2020-12-31T13:21:01.237Z"), "number" : 9699 }
{ "_id" : ObjectId("5fedd03d9d2569ee04ab62e3"), "name" : "lion", "user_id" : 2, "boolean" : false, "added_at" : ISODate("2020-12-31T13:21:01.238Z"), "number" : 1783 }
Type "it" for more

2. Simulate data file corruption

[mongo@centos7 dns_testdb]$ du -sh *
28M collection-8--6736947369024546614.wt
9.5M index-9--6736947369024546614.wt
[mongo@centos7 dns_testdb]$ pwd
/opt/mongo/data/single/dns_testdb
[mongo@centos7 dns_testdb]$ dd if=/dev/null of=/opt/mongo/data/single/dns_testdb/collection-8--6736947369024546614.wt  bs=1024k count=5
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000132203 s, 0.0 kB/s
[mongo@centos7 dns_testdb]$
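
A note on what the dd command above actually does: /dev/null yields no data, so dd copies nothing (hence "0+0 records") and simply truncates the output file to zero bytes. The bs and count arguments have no effect here. The following is a minimal sketch that reproduces the effect on a scratch file instead of a real data file. The function names truncate_with_dd and file_size, and the scratch file itself, are illustrative assumptions and not part of the original test.

```shell
#!/usr/bin/env bash
# Truncate a file the same way the article does: dd opens the output file
# with truncation, reads zero bytes from /dev/null, and writes nothing.
truncate_with_dd() {
  local target="$1"
  dd if=/dev/null of="$target" bs=1024k count=5 2>/dev/null
}

# Print the size of a file in bytes (wc -c is more portable than stat).
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Demo on a scratch file, never on a live data file.
scratch="$(mktemp)"
printf 'some collection data' > "$scratch"
echo "before: $(file_size "$scratch") bytes"
truncate_with_dd "$scratch"
echo "after: $(file_size "$scratch") bytes"
rm -f "$scratch"
```

Because the file ends up empty rather than partially damaged, this test represents the worst case for any later salvage attempt.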

3. Restart MongoDB

> use admin
switched to db admin
> db.shutdownServer();
[mongo@centos7 data]$ mongod --dbpath /opt/mongo/data/single --port 50001  --oplogSize 512  --fork --bind_ip 0.0.0.0 --logpath /opt/mongo/logs/single.log --logappend --journal --directoryperdb --profile=1
about to fork child process, waiting until server is ready for connections.
forked process: 102882
child process started successfully, parent exiting

4. The mongod process starts, but any data operation on the collection backed by the corrupted file crashes mongod

[mongo@centos7 data]$ mongo --port 50001
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:50001/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("09b6c6aa-059d-4a41-9e0d-e6553966399b") }
MongoDB server version: 4.2.3
Server has startup warnings: 
> show dbs;
admin       0.000GB
config      0.000GB
dns_testdb  0.037GB
local       0.000GB
> use dns_testdb;
switched to db dns_testdb
> db.test_collection.find();
2020-12-31T08:43:45.115-0500 I  NETWORK  [js] DBClientConnection failed to receive message from 127.0.0.1:50001 - HostUnreachable: Connection closed by peer
Error: error doing query: failed: network error while attempting to run command 'find' on host '127.0.0.1:50001' 
2020-12-31T08:43:45.118-0500 I  NETWORK  [js] trying reconnect to 127.0.0.1:50001 failed
2020-12-31T08:43:45.118-0500 I  NETWORK  [js] reconnect 127.0.0.1:50001 failed failed 

5. Check the mongod log: it reports data file corruption and suggests starting with --repair

2020-12-31T08:43:45.103-0500 E  STORAGE  [conn1] WiredTiger error (-31802) [1609422225:103947][102882:0x7f96713b5700], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.open_cursor: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error Raw: [1609422225:103947][102882:0x7f96713b5700], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.open_cursor: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2020-12-31T08:43:45.104-0500 E  STORAGE  [conn1] Failed to open a WiredTiger cursor. Reason: UnknownError: -31802: WT_ERROR: non-specific WiredTiger error, uri: table:dns_testdb/collection-8--6736947369024546614, config: 
2020-12-31T08:43:45.104-0500 E  STORAGE  [conn1] This may be due to data corruption. Please read the documentation for starting MongoDB with --repair here: 
2020-12-31T08:43:45.104-0500 F  -        [conn1] Fatal Assertion 50882 at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 101
2020-12-31T08:43:45.104-0500 F  -        [conn1] 
***aborting after fassert() failure

6. Repair the database as the mongod log suggests

[mongo@centos7 data]$ mongod --dbpath /opt/mongo/data/single --port 50001  --oplogSize 512  --fork --bind_ip 0.0.0.0 --logpath /opt/mongo/logs/single.log --logappend --journal --directoryperdb --profile=1 --repair
about to fork child process, waiting until server is ready for connections.
forked process: 102942
child process started successfully, parent exiting
[mongo@centos7 data]$
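
Since repair can discard whatever it cannot salvage, it is prudent to copy the whole dbpath while mongod is stopped and before running the command above. The following is a minimal sketch under stated assumptions: the function name backup_dbpath and the backup location /opt/mongo/backup are hypothetical, and the target disk is assumed to have enough free space for a full copy.

```shell
#!/usr/bin/env bash
# Copy a stopped mongod's dbpath into a timestamped directory and print
# the destination path. Run this only while mongod is NOT running.
backup_dbpath() {
  local dbpath="$1" backup_root="$2"
  local dest
  dest="$backup_root/$(basename "$dbpath").$(date +%Y%m%d%H%M%S)"
  mkdir -p "$backup_root" || return 1
  # -a preserves ownership, permissions, and timestamps where possible.
  cp -a "$dbpath" "$dest" || return 1
  echo "$dest"
}

# Usage (hypothetical backup location):
#   backup_dbpath /opt/mongo/data/single /opt/mongo/backup
```

With such a copy in hand, a failed or lossy repair can be rolled back and other recovery approaches can still be tried on the original files.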

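7. During repair, the mongod log shows the corrupted collection file being moved aside and the collection and its index being re-created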

2020-12-31T08:44:45.646-0500 I  STORAGE  [initandlisten] repairDatabase dns_testdb
2020-12-31T08:44:45.646-0500 I  STORAGE  [initandlisten] Repairing collection dns_testdb.test_collection
2020-12-31T08:44:45.647-0500 E  STORAGE  [initandlisten] WiredTiger error (-31802) [1609422285:647413][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.verify: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error Raw: [1609422285:647413][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.verify: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2020-12-31T08:44:45.647-0500 I  STORAGE  [initandlisten] Verify failed on uri table:dns_testdb/collection-8--6736947369024546614. Running a salvage operation.
2020-12-31T08:44:45.647-0500 E  STORAGE  [initandlisten] WiredTiger error (-31802) [1609422285:647930][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.salvage: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error Raw: [1609422285:647930][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.salvage: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2020-12-31T08:44:45.648-0500 W  STORAGE  [initandlisten] Salvage failed for uri table:dns_testdb/collection-8--6736947369024546614: Salvage failed: -31802: WT_ERROR: non-specific WiredTiger error. The file will be moved out of the way and a new ident will be created.
2020-12-31T08:44:45.648-0500 W  STORAGE  [initandlisten] Moving data file /opt/mongo/data/single/dns_testdb/collection-8--6736947369024546614.wt to backup as /opt/mongo/data/single/dns_testdb/collection-8--6736947369024546614.wt.corrupt
2020-12-31T08:44:45.648-0500 W  STORAGE  [initandlisten] Rebuilding ident dns_testdb/collection-8--6736947369024546614
2020-12-31T08:44:45.708-0500 I  STORAGE  [initandlisten] Successfully re-created table:dns_testdb/collection-8--6736947369024546614.
2020-12-31T08:44:45.718-0500 I  INDEX    [initandlisten] index build: starting on dns_testdb.test_collection properties: { v: 2, key: { _id: 1 }, name: "_id_", ns: "dns_testdb.test_collection" } using method: Foreground
2020-12-31T08:44:45.718-0500 I  INDEX    [initandlisten] build may temporarily use up to 200 megabytes of RAM
2020-12-31T08:44:45.718-0500 I  STORAGE  [initandlisten] Index build initialized: 2ddee833-ea97-4964-98c0-7137e71a99c9: dns_testdb.test_collection: indexes: 1
2020-12-31T08:44:45.722-0500 I  STORAGE  [initandlisten] Index builds manager starting: 2ddee833-ea97-4964-98c0-7137e71a99c9: dns_testdb.test_collection
2020-12-31T08:44:45.724-0500 I  INDEX    [initandlisten] index build: inserted 0 keys from external sorter into index in 0 seconds
2020-12-31T08:44:45.727-0500 I  INDEX    [initandlisten] index build: done building index _id_ on ns dns_testdb.test_collection
2020-12-31T08:44:45.727-0500 I  STORAGE  [initandlisten] Index builds manager completed successfully: 2ddee833-ea97-4964-98c0-7137e71a99c9: dns_testdb.test_collection. Index specs requested: 1. Indexes in catalog before build: 1. Indexes in catalog after build: 1
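
The log above shows that repair does not delete the file it failed to salvage; it renames it with a .corrupt suffix and creates a new, empty table in its place. The following is a small sketch for locating such leftovers under a dbpath. The function name list_corrupt_files is an illustrative assumption.

```shell
#!/usr/bin/env bash
# List files that a repair run moved aside with a ".corrupt" suffix.
list_corrupt_files() {
  local dbpath="$1"
  find "$dbpath" -type f -name '*.corrupt' | sort
}

# Usage with the dbpath from this article:
#   list_corrupt_files /opt/mongo/data/single
```

In this test the moved file is empty, so nothing can be recovered from it. In a real incident it may be worth keeping these files until it is certain that no further recovery attempt will be made.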

8. Restart the mongod service after the repair

[mongo@centos7 data]$ mongod --dbpath /opt/mongo/data/single --port 50001  --oplogSize 512  --fork --bind_ip 0.0.0.0 --logpath /opt/mongo/logs/single.log --logappend --journal --directoryperdb --profile=1 
about to fork child process, waiting until server is ready for connections.
forked process: 102975
child process started successfully, parent exiting
[mongo@centos7 data]$

9. mongod now starts and answers queries normally, but the data of the collection whose file was corrupted is gone

[mongo@centos7 data]$ mongo --port 50001
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:50001/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("d88894c4-16bf-4013-a993-d29e2493fbdf") }
MongoDB server version: 4.2.3
Server has startup warnings: 
> show dbs;
admin       0.000GB
config      0.000GB
dns_testdb  0.000GB
local       0.000GB
> use dns_testdb;
switched to db dns_testdb
> db.test_collection.find();
>

10. Summary

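When a data file is corrupted and no backup exists, mongod's --repair option can lose every document in the collection backed by that file, even though mongod starts normally afterward. The repair log shows why: verification of the file failed, the salvage operation also failed, so repair moved the corrupted file aside as a .corrupt file, created a new empty table in its place, and rebuilt the collection's index with 0 keys. The collection's data and index files were effectively reinitialized, which is why the data is lost. For this reason, it is best to run MongoDB as a replica set for data redundancy, and, alongside the replica set, to add a delayed member as protection against operator error.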
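
The delayed member mentioned above could be added along the following lines. This is a minimal sketch under several assumptions: the deployment has already been converted to a replica set (the instance in this article is a standalone), the host name mongo-delayed.example.net is hypothetical, and the field name slaveDelay is the one used by MongoDB 4.2 (later versions rename it). The helper name delayed_member_js is illustrative; it only prints the mongo shell statement so that it can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Print a mongo shell statement that adds a hidden, delayed,
# priority-0 member to an existing replica set.
delayed_member_js() {
  local host="$1" delay_secs="$2"
  cat <<EOF
rs.add({
  host: "${host}",
  priority: 0,      // never eligible to become primary
  hidden: true,     // not visible to client applications
  slaveDelay: ${delay_secs}  // seconds behind the primary (4.2 field name)
})
EOF
}

# Usage (hypothetical host; run against the primary):
#   mongo --port 50001 --eval "$(delayed_member_js mongo-delayed.example.net:50001 3600)"
```

A one-hour delay is only an example. The delay should be long enough to notice a mistake and short enough to fit within the oplog window.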




Source: "ITPUB Blog", link: http://blog.itpub.net/29357786/viewspace-2746973/. If you republish this article, please credit the source; otherwise legal liability will be pursued.
