SQL Server 資料頁損壞修復

www88xbocom17176934555x發表於2021-03-17

當我們發現資料庫資料頁損壞了,或者執行 DBCC CHECKDB 發現有損壞的資料頁時,大部分人都執行如下操作進行修復。

ALTER DATABASE db_name SET EMERGENCY;
DBCC CHECKDB (‘db_name’);
ALTER DATABASE db_name SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
–{ REPAIR_ALLOW_DATA_LOSS | REPAIR_FAST | REPAIR_REBUILD }
DBCC CHECKDB (‘db_name’, REPAIR_ALLOW_DATA_LOSS);
ALTER DATABASE db_name SET MULTI_USER;
不過,上面的修復可能不成功,也可能使資料丟失。因為資料頁的損壞也分多鐘情況,如日誌損壞、索引損壞、資料損壞、系統物件損壞等。現在我們考慮的是使用者庫資料損壞的情況修復。

為了能進行資料修復,資料庫須使用完整模式,先進行一次完整備份。

ALTER DATABASE [TestDBSubA] SET RECOVERY FULL WITH NO_WAIT
BACKUP DATABASE [TestDBSubA] TO DISK = N’E:\DatabaseFile\Backup\TestDBSubA.bak

我們先任意找一個資料頁(如頁ID=179)進行寫入破壞。更改的偏移量為100(96頁頭+前4個資料字元),替換了10個字元

–找一個資料頁
DBCC TRACEON(3604,-1)
DBCC IND(TestDBSubA,Test,-1)
DBCC PAGE(‘TestDBSubA’, 1, 179,3)

–破壞該資料頁
DBCC WRITEPAGE(‘TestDBSubA’, 1, 179, 100, 10, 0x65656565656565656565)

–檢查DB
DBCC CHECKDB(‘TestDBSubA’)
DBCC results for ‘TestDBSubA’.

…………(此處省略)

DBCC results for ‘sys.syssoftobjrefs’.

There are 4 rows in 1 pages for object “sys.syssoftobjrefs”.

Msg 8933, Level 16, State 1, Line 1

Table error: Object ID 1019150676, index ID 1, partition ID 72057594051100672, alloc unit ID 72057594059948032 (type In-row data).

The low key value on page (1:179) (level 0) is not >= the key value in the parent (1:431) slot 6.

…………(此處省略)

CHECKDB found 0 allocation errors and 1 consistency errors in database ‘TestDBSubA’.

repair_rebuild is the minimum repair level for the errors found by DBCC CHECKDB (TestDBSubA).

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

此時我們再看看該資料頁儲存的資訊。

DBCC PAGE(‘TestDBSubA’, 1, 179,3)
圖片

頁面內容有10個字元被替換了,十六進位制為 65,轉換十進位制為 101即為ASCII值,對應的字元為字母 e。這10個字元改動的,為該表該行欄位 GUID 的部分值。也就是說,該行的欄位GUID資料丟失了!

現在使用修改後的值操作該行資料,更改時發生錯誤。

–檢視該行記錄,正常
SELECT [GUID],[SID],[NAME],[VALUE]
FROM [TestDBSubA].[dbo].[Test]
WHERE GUID=’65656565-6565-6565-6565-005056c00008’

–第一列[SID]被writepage更改了,所以此時更改[SID]將報錯!
UPDATE T SET [SID]=SUSER_SID()
FROM [TestDBSubA].[dbo].[Test] T
WHERE GUID=’65656565-6565-6565-6565-005056c00008’
Msg 8646, Level 21, State 1, Line 1

Unable to find index entry in index ID 1, of table 1019150676, in database ‘TestDBSubA’.

The indicated index is corrupt or there is a problem with the current update plan.

Run DBCC CHECKDB or DBCC CHECKTABLE. If the problem persists, contact product support.

Msg 0, Level 20, State 0, Line 0

當前命令發生了嚴重錯誤。應放棄任何可能產生的結果。

當發生問題時,我們沒能及時發現和修復,其他表或資料又有新的操作,我們模擬如下。

DELETE TOP(10) FROM [TestDBSubA].[dbo].[Test]
WHERE GUID<>’65656565-6565-6565-6565-005056c00008’
GO
UPDATE T SET VALUE=100 FROM [TestDBSubA].[dbo].[Test] T
GO

那麼該如何修復呢?

修復之前,我們要習慣進行一次日誌備份。

BACKUP LOG [TestDBSubA] TO DISK = N’E:\DatabaseFile\Backup\TestDBSubA_LOG.trn’

剛開始之前,我們有進行過一次完整備份。那時的完整備份的資料還沒有損壞,所有我們可以用最近的完整備份進行某個資料頁的修復。

RESTORE DATABASE [TestDBSubA]
PAGE = ‘1:179’
FROM DISK = ‘E:\DatabaseFile\Backup\TestDBSubA.bak’
WITH NORECOVERY
Processed 1 pages for database ‘TestDBSubA’, file ‘TestPub’ on file 1.

RESTORE DATABASE … FILE= successfully processed 1 pages in 0.072 seconds (0.108 MB/sec).

接下來,我們需要還原剛剛備份的事務日誌,將資料還原到最近狀態。

RESTORE LOG [TestDBSubA]
FROM DISK = ‘E:\DatabaseFile\Backup\TestDBSubA_LOG.trn’
WITH RECOVERY;
Processed 0 pages for database ‘TestDBSubA’, file ‘TestPub’ on file 1.

The roll forward start point is now at log sequence number (LSN) 597000000036800001.

Additional roll forward past LSN 597000000038800001 is required to complete the restore sequence.

RESTORE LOG successfully processed 0 pages in 0.035 seconds (0.000 MB/sec).

此時再進行對該表操作,發現2個語句都報錯了!是不是有些慌?

UPDATE T SET [SID]=SUSER_SID()
FROM [TestDBSubA].[dbo].[Test] T
WHERE GUID=’65656565-6565-6565-6565-005056c00008’
GO
SELECT [GUID],[SID],[NAME],[VALUE]
FROM [TestDBSubA].[dbo].[Test]
GO
Msg 829, Level 21, State 1, Line 1

Database ID 7, Page (1:179) is marked RestorePending, which may indicate disk corruption.

To recover from this state, perform a restore.

檢查資料庫,仍然報錯。

DBCC CHECKDB(‘TestDBSubA’)
Msg 8939, Level 16, State 98, Line 1

Table error: Object ID 1019150676, index ID 1, partition ID 72057594051100672, alloc unit ID 72057594059948032 (type In-row data), page (1:179).

Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 133129 and -6.

Msg 8928, Level 16, State 1, Line 1

Object ID 1019150676, index ID 1, partition ID 72057594051100672, alloc unit ID 72057594059948032 (type In-row data): Page (1:179) could not be processed.

See other errors for details.

Msg 8978, Level 16, State 1, Line 1

Table error: Object ID 1019150676, index ID 1, partition ID 72057594051100672, alloc unit ID 72057594059948032 (type In-row data).

Page (1:419) is missing a reference from previous page (1:179). Possible chain linkage problem.

Msg 8976, Level 16, State 1, Line 1

Table error: Object ID 1019150676, index ID 1, partition ID 72057594051100672, alloc unit ID 72057594059948032 (type In-row data).

Page (1:179) was not seen in the scan although its parent (1:431) and previous (1:420) refer to it. Check any previous errors.

怎麼解決呢?很簡單,再進行一次事務日誌的備份和還原即可!

USE MASTER
GO
BACKUP LOG [TestDBSubA]
TO DISK = N’E:\DatabaseFile\Backup\TestDBSub_LOG.trn’
WITH INIT,FORMAT
GO
RESTORE LOG [TestDBSubA]
FROM DISK = ‘E:\DatabaseFile\Backup\TestDBSub_LOG.trn’
WITH RECOVERY;
GO

本作品採用《CC 協議》,轉載必須註明作者和本文連結

相關文章