Handling error 9004 after a SQL Server mirror server crash

Posted by guocun09 on 2020-05-07

Problem:

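A SQL Server 2008 SP4 mirror server crashed because of a hardware problem. After it was repaired and restarted, it could no longer connect to the principal database and synchronize with it.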


The mirror's error log reports:

Date  2020/5/2 02:15:20 AM
Log  SQL Server (Archive #3 - 2020/5/4 11:55:00 AM)
Source  spid17s
Message
Database mirroring will be suspended. Server instance 'SMESDBSTY' encountered error 9004, state 2, severity 21 when it was acting as a mirroring partner for database 'MESDB'. The database mirroring partners might try to recover automatically from the error and resume the mirroring session. For more information, view the error log for additional error messages.


Date  2020/5/2 02:15:20 AM
Log  SQL Server (Archive #3 - 2020/5/4 11:55:00 AM)

Source  spid17s
Message
An error occurred while processing the log for database 'MESDB'.  If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.


The principal's error log reports:

Date  2020/5/4 02:04:21 PM
Log  SQL Server (Current - 2020/5/4 05:45:00 PM)
Source  spid19s

Message
'TCP://10.209.95.203:5022', the remote mirroring partner for database 'MESDB', encountered error 9004, status 2, severity 21. Database mirroring has been suspended.  Resolve the error on the remote server and resume mirroring, or remove mirroring and re-establish the mirror server instance.


Date  2020/5/4 02:04:21 PM
Log  SQL Server (Current - 2020/5/4 05:45:00 PM)
Source  spid19s
Message
Error: 1453, Severity: 16, State: 1.
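The suspended state reported in these messages can also be confirmed from the sys.database_mirroring catalog view. This is a minimal sketch, assuming it is run in the master database on either partner; the database name MESDB comes from the logs above.

```sql
-- List the mirroring role, state and partner for the mirrored database.
-- Rows with a NULL mirroring_guid are databases that are not mirrored.
SELECT DB_NAME(database_id)   AS database_name,
       mirroring_role_desc,    -- PRINCIPAL or MIRROR
       mirroring_state_desc,   -- e.g. SUSPENDED, SYNCHRONIZING, SYNCHRONIZED
       mirroring_partner_name  -- address of the other partner
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL
  AND database_id = DB_ID('MESDB');
```

With the session in the state described above, mirroring_state_desc would be expected to read SUSPENDED.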


Troubleshooting:

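Restarting the mirror server did not restore synchronization with the principal, so the only option was to analyze the logs. Both the principal and the mirror logs contain error 9004, which is probably the cause. The fix suggested in the message, "Resolve the error on the remote server and resume mirroring, or remove mirroring and re-establish the mirror server instance.", is rather general: either resolve the error and resume mirroring, or remove mirroring and rebuild it.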


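There seemed to be no obvious way to resolve error 9004 directly. Removing the mirror partner relationship and re-establishing it did not work either, and rebuilding the mirror from a fresh backup and restore would be too much work.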


So I went back to error 9004 itself to look for an answer and found an official article describing it:


Symptoms


An operation in SQL Server that needs to read or process the transaction log can fail with an error like the following if the transaction log is damaged:

Error: 9004, Severity: 21, State: 1.
An error occurred while processing the log for database 'mydb'.  If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.

The State number can vary for this error and indicates what type of damage has occurred with the log. See the More Information section about State numbers.

In most cases, this error is just seen in the ERRORLOG or Windows Application Event Log with EventID = 9004 because the operation processing the log is not based on a direct user command (such as recovery running when the SQL Server Engine starts. In these situations this error is often seen with Error 3414). However, some queries such as ALTER DATABASE could require a processing of the log and therefore will see these errors. Since the error is Severity=21, the user session is disconnected.

Cause


Error 9004 is a general error indicating the contents of the transaction log are damaged. The reason for the log to become inconsistent are similar to any database corruption problem detected by the SQL Server Engine or DBCC CHECKDB. To find the cause for the damage of the log you should follow the similar techniques for database corruption including an analysis of possible hardware, filesystem, and/or I/O problems. See the Cause section of the following article for more information: .

Resolution


You should restore from a known good backup to recover from this problem. It is possible that if the transaction log portion of a database backup or the transaction log backup itself has damaged transaction log contents, you can encounter an Error 9004 on RESTORE. In this situation, the transaction log in the backup is damaged.

If you cannot restore from a backup, you may be able to bring the database online by rebuilding the transaction log. You should carefully understand the ramifications of rebuilding the transaction log including the possible loss of transactional consistency in your database. To read about how to rebuild the transaction log, please see the section titled Resolving Database Errors in Emergency Mode in the SQL Server Books Online under the command

More Information


The SQL Server Engine performs logical checks on the consistency of the transaction log contents as it reads and processes it. Not all aspects of the log header, log blocks, and log records are checked. The State number provides more information on what type of failure was encountered when processing the transaction log:

  • State 1 = The log file header of the Virtual Log File (VLF) was damaged.  If a damaged log file header is encountered as part of starting up the database on service startup, you may only see Error 9004 in the ERRORLOG. The log file header is the first portion of each VLF in the log file. This is not the same as the file header or the first 8KB of the log file. If the file header of the log file is damaged you will encounter Msg 5172 as with a database file header page corruption.
  • State 2 and 3 = A log block was invalid when performing recovery during RESTORE
  • State 4 through 12 = These are all various checks on log blocks when processing log records. These include parity, sector, and other logical checks on the consistency of the transaction log

 

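According to the article, the mirror database's transaction log was probably damaged, leaving it inconsistent, and that damage is very likely related to this server's hardware crash. Checking the mirror's startup error log again in detail confirmed that the transaction log was indeed damaged: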

Date  2020/5/2 02:15:20 AM
Log  SQL Server (Archive #3 - 2020/5/4 11:55:00 AM)
Source  spid17s
Message
An error occurred while processing the log for database 'MESDB'.  If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
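Entries like this one can also be searched for from a query window instead of the log viewer. The sketch below uses sp_readerrorlog, which is an undocumented (though widely used) procedure, so treat its parameters as an assumption. The archive number 3 is taken from the log header above and changes as the error logs are cycled.

```sql
-- Search a SQL Server error log for lines containing '9004'.
-- Parameter 1: log number (0 = current, 3 = Archive #3; assumed from the header above)
-- Parameter 2: log type (1 = SQL Server error log)
-- Parameter 3: search string
EXEC sp_readerrorlog 3, 1, '9004';
```

Running the same call with 0 as the first parameter searches the current log.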


How can a damaged transaction log be repaired?

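The principal has a job that backs up the transaction log every 15 minutes (a common practice under the full recovery model, so that the log does not grow too large and its space can be reused). In theory, restoring the principal's log backups on the mirror should therefore be enough.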
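Before restoring anything, it may help to check that the principal's log backups form an unbroken chain that reaches back to the point where the mirror stopped. This is a rough sketch, assuming the backup history is still present in msdb on the principal; the cutoff date is an assumption based on the day of the crash.

```sql
-- Run on the principal: list log backups for MESDB with their LSN ranges.
-- In an unbroken chain, each row's first_lsn equals the previous row's last_lsn.
SELECT a.backup_set_id,
       a.backup_start_date,
       a.first_lsn,
       a.last_lsn,
       b.physical_device_name
FROM msdb.dbo.backupset AS a
JOIN msdb.dbo.backupmediafamily AS b
  ON a.media_set_id = b.media_set_id
WHERE a.database_name = 'MESDB'
  AND a.type = 'L'                        -- transaction log backups only
  AND a.backup_start_date > '2020-05-02'  -- assumed cutoff: day of the crash
ORDER BY a.backup_start_date;

-- Run on the mirror: the LSN from which the restoring database
-- expects roll forward to continue.
SELECT DISTINCT redo_start_lsn
FROM sys.master_files
WHERE database_id = DB_ID('MESDB');
```

The first log backup to restore should be one whose LSN range covers the mirror's redo_start_lsn. If no such backup exists, the chain does not reach back far enough and this approach is unlikely to work.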

Specific steps:

1. Remove the mirror partner relationship between the principal and the mirror. On the mirror, run:

alter database MESDB set partner off

2. Copy the transaction log backups that the principal produced on the day of the failure and afterwards to the mirror.

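3. Run the following query and use its output to restore the transaction log backups in batch (if the generated script reports a syntax error, try removing the go):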

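-- Generate one RESTORE LOG statement per log backup taken after the failure, oldest first.
select 'RESTORE LOG [MESDB] FROM  DISK = N'''
 + b.physical_device_name + ''' WITH  FILE = 1,  NORECOVERY,  NOUNLOAD,  STATS = 10 ' + char(13) + char(10) + 'go',
 a.backup_set_id, a.type, b.physical_device_name
 from msdb.dbo.backupset as a
 join msdb.dbo.backupmediafamily as b on a.media_set_id = b.media_set_id
where a.database_name = 'MESDB'
and a.type = 'L'  -- transaction log backups only
and a.backup_start_date > '2020-05-02 12:00:00'
order by a.backup_start_date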

4. Re-establish the mirror partner relationship. After that, the principal and the mirror synchronize data normally again.
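The post does not show the statements used to re-establish the partnership. A minimal sketch follows, assuming the mirroring endpoints on both servers still exist and that the mirror copy of MESDB is still in the RESTORING state (restored WITH NORECOVERY). The mirror address is the one shown in the principal's error log; the principal address is a hypothetical placeholder.

```sql
-- Step A, run on the MIRROR first: point it at the principal.
-- 'TCP://principal-host.example:5022' is a hypothetical address; replace it with
-- the principal's real mirroring endpoint.
ALTER DATABASE MESDB SET PARTNER = 'TCP://principal-host.example:5022';

-- Step B, run on the PRINCIPAL afterwards: point it at the mirror.
-- This address is taken from the principal's error log in this post.
ALTER DATABASE MESDB SET PARTNER = 'TCP://10.209.95.203:5022';

-- Optional check on either server: the state should move from SYNCHRONIZING
-- to SYNCHRONIZED once the mirror has caught up.
SELECT mirroring_role_desc, mirroring_state_desc
FROM sys.database_mirroring
WHERE database_id = DB_ID('MESDB');
```

The order matters: the partner is set on the mirror before it is set on the principal.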


At this point, the mirror error 9004 has been resolved.

From the "ITPUB blog", link: http://blog.itpub.net/25583515/viewspace-2690442/. If you repost, please credit the source; otherwise legal action will be pursued.