VSSADMIN命令無法列出SQL writer的問題

apgcdsd發表於2011-08-15

我們通常有兩種方式來備份SQL Server的資料庫。一種是通過在SQL Server中執行BACKUP DATABASET-SQL語句來備份資料庫。另一種方法是使用SQL writer service來備份資料庫。SQL writer service提供了一種通過Volume Shadow Copy Service (VSS)框架來備份SQL Server資料庫的途徑。第三方廠商可以開發自己的備份應用程式,通過呼叫SQL writer service來實現SQL資料庫的備份。微軟自己的一些產品,比如Data Protection Manager (DPM)也是使用的SQL writer service來做備份的。

關於SQL Writer serviceVSS的更多內容,可以參考:

http://technet.microsoft.com/en-us/library/cc966520.aspx

http://msdn.microsoft.com/en-us/library/ms175536.aspx

 

當我們遇到SQL writer service無法正常工作的問題時,我們通過會通過以下命令來對SQL writer service做一個快速而簡單的測試:

vssadmin list writers

該命令會列出伺服器上所有使用VSSwriters,如果SQL writer service無法被列出,那麼自然證明SQL writer serviceVSS之間的協同工作有問題。那麼SQL writer service無法正常工作也在情理之中了。

要進一步研究為什麼SQL writer service無法被列出,一個很有效的方法是使用SQL writertrace。開啟trace步驟如下:

(1). 以下步驟都在SQL Server上執行 
(2).
備份登錄檔鍵值“HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\SQLWriter\”. 
(3). 
向登錄檔中新增如下鍵值:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\SQLWriter\Debug\Tracing]

"TraceEnterExit"=dword:00000001

"TraceFile"="c:\\trace.txt"

"TraceFileLineInfo"=dword:00000001

"TraceForceFlush"=dword:00000001

"TraceLevel"=dword:ffffffff

"TraceTimestamp"=dword:00000001

"TraceToDebugger"=dword:00000001

"TraceToFile"=dword:00000001


(4). 
重啟SQL Writer Service。此時我們配置的trace就會開始執行。

(5). 開啟命令列,執行:

vssadmin list writers

(6). 執行完成後,開啟C:\trace.txt檔案來檢視是否有任何錯誤。如果需要關閉trace,只要還原之前備份的鍵值,並且重啟SQL writer service即可。

 

根據經驗,大多數SQL writer service無法被列出的問題是由以下兩種問題造成的:

1      VSSADMIN嘗試列出SQL writer service時,他們嘗試去連線伺服器上所有SQL Server例項,一旦有一個例項無法連線上,則SQL writer service就無法被列出。

2      VSSADMIN連線上SQL Server後,他會執行一句查詢語句,如果這句查詢語句失敗,則SQL writer service就無法被列出。

這兩個問題都可以在trace檔案裡發現。以下是一個trace檔案的示例。

---------------------------------------------------------------------------------

[0082394203,0x0008b0:0x1c10:0xfc9d47b3] sqlwriter.yukon\sqllib\sqlconnect.cpp(0589): SqlConnection::Connect: Connecting to server A000S-ITSQL05...

[0082394234,0x0008b0:0x1c10:0xfc9d47b3] sqlwriter.yukon\sqllib\sqlconnect.cpp(0630): SqlConnection::Connect: Connected

[0082394250,0x0008b0:0x1c10:0xfc9d47b3] sqlwriter.yukon\sqllib\sqlconnect.cpp(0652): SqlConnection::Connect: Version: 09.00.3042

[0082394265,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::Connect}: hr: 0x00000000

[0082394281,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::SetCommand}:

[0082394296,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::ReleaseRowset}:

[0082394312,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::ReleaseRowset}: hr: 0x00000000

[0082394328,0x0008b0:0x1c10:0xfc9d47b3] sqlwriter.yukon\sqllib\sqlconnect.cpp(0916): SqlConnection::SetCommand: SetCommand (select rtrim(physical_name),rtrim(type_desc),rtrim(state_desc), is_name_reserved from sys.master_files where DB_ID(N'RFTHSOWMS ') = database_id)

[0082394359,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::SetCommand}: hr: 0x00000000

[0082394375,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::ExecCommand}:

[0082394390,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::ReleaseRowset}:

[0082394406,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::ReleaseRowset}: hr: 0x00000000

[0082394453,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::ExecCommand}: hr: 0x00000000

[0082394468,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::FetchFirst}:

[0082394484,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::FetchFirst}: hr: 0x00000000

[0082394500,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlEnumerator::FirstFile}: hr: 0x00000000

[0082394531,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlEnumerator::NextFile}:

[0082394546,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::FetchNext}:

[0082394578,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::FetchNext}: hr: 0x00000000

[0082394593,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlEnumerator::NextFile}: hr: 0x00000000

[0082394640,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlEnumerator::NextFile}:

[0082394656,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::FetchNext}:

[0082394687,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::FetchNext}: hr: 0x00040ec6

[0082394718,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlEnumerator::NextFile}: hr: 0x00040ec6

[0082394750,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {CSqlWriter::CheckFilesBeforeAddComponent}: hr: 0x00040ec6

[0082394765,0x0008b0:0x1c10:0xfc9d47b3] ENTER {CSqlWriter::AddComponentAndFiles}:

[0082397921,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlEnumerator::FirstFile}:

[0082397937,0x0008b0:0x1c10:0xfc9d47b3] ENTER {SqlConnection::FetchFirst}:

[0082397968,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlConnection::FetchFirst}: hr: 0x00000000

[0082397984,0x0008b0:0x1c10:0xfc9d47b3] EXIT  {SqlEnumerator::FirstFile}: hr: 0x00000000

[0082401375,0x0008b0:0x1c10:0xfc9d47b3] sqlwriter.yukon\sqlwriter\sqlwriter.cpp(2233): CSqlWriter::AddComponentAndFiles: IVssCreateWriterMetadata::AddFilesToFileGroup failed.  hr = 0x80042308

---------------------------------------------------------------------------------

 

首先我們可以看到SQL Server例項已經被成功連線到了(SqlConnection::Connect: Connected)。如果這裡顯示連線失敗的話,我們可以到SQL errorlog裡檢視是否在對應時間裡有login failed的錯誤(error 18456)被記錄。如果有的話,我們就可以做進一步工作來檢查為什麼會有這樣的連線失敗產生。

連線成功後,VSSADMIN就開始執行語句:

SqlConnection::SetCommand: SetCommand (select rtrim(physical_name),rtrim(type_desc),rtrim(state_desc), is_name_reserved from sys.master_files where DB_ID(N'RFTHSOWMS ') = database_id

 

在上面我們給出的例項中,這句語句的執行失敗了。於是導致了VSSADMIN無法列出SQL writer service

為什麼查詢會失敗呢?該查詢語句中,資料庫名“RFTHSOWMS ”在末尾處有一個空格。就是這個空格導致了查詢的時候。事實上,只要資料庫名的開頭或者末尾有空格,又或是資料庫名包含列印不出的字元的話,就會導致查詢失敗。關於這個問題,在微軟的知識庫文章中已經有講到過,大家有興趣的話可以去看看這篇文章:

http://support.microsoft.com/default.aspx?scid=kb;en-US;2014054

 

SQL writer servicetrace是非常有用的。它不但可以發現vssadmin list writers時產生的問題,對於一些在使用SQL writer或者MSDE writer備份資料庫的過程中發生的問題也能提供非常有用的資訊。(MSDE writerSQL Server 2000的時候SQL使用VSS框架的writer)

 

我以前遇到過一個使用MSDE writer備份資料庫報錯的問題。我使用了trace來排查這個問題。開啟MSDE writertrace的方法和SQL writer service類似,只是登錄檔的鍵值不同。對於MSDE writer,我們要新增的鍵值是:

 [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\VSS\Debug\Tracing]
"TraceFile"="c:\\trace.txt"
"TraceLevel"=dword:ffffffff
"TraceEnterExit"=dword:00000001
"TraceToFile"=dword:00000001
"TraceToDebugger"=dword:00000000
"TraceFileLineInfo"=dword:00000001
"TraceForceFlush"=dword:00000000

 

新增完後,重啟MSDE writer service使trace生效。然後執行backup軟體來重現問題。然後檢視Trace檔案,發現:

 ---------------------------------------------------------------------------------

[3012184656,0x0011f8:0x1290:0x5d93de09] modules\sqllib\sqlconnect.cpp(0198): DumpErrorInfo: Error state: 2

Severity:              14

[3012184656,0x0011f8:0x1568:0x5d93de09] modules\sqllib\sqlconnect.cpp(0239): DumpErrorInfo: Error message:                Database 'ISALOG_20081021_FWS_000' cannot be opened due to inaccessible files or insufficient memory or disk space.  See the SQL Server errorlog for details.

[3012184671,0x0011f8:0x131c:0x5d93de09] modules\sqllib\sqlconnect.cpp(0605): SqlConnection::SetCommand: SetCommand (BACKUP DATABASE [ISALOG_20100512_FWS_000] TO VIRTUAL_DEVICE='{D431F72B-C2B1-49C0-9B80-01D7B2095BF9}9' WITH SNAPSHOT,BUFFERCOUNT=1,BLOCKSIZE=1024)

[3012184671,0x0011f8:0x15a8:0x5d93de09] ENTER {SqlConnection::ExecCommand}:

[3012184671,0x0011f8:0x0cc0:0x5d93de09] EXIT  {SqlConnection::ReleaseRowset}: hr: 0x00000000

[3012184671,0x0011f8:0x0a9c:0x5d93de09] modules\sqllib\sqlconnect.cpp(0151): DumpErrorInfo: HRESULT:     0X80040E14

[3012184687,0x0011f8:0x0b7c:0x5d93de09] modules\sqllib\sqlconnect.cpp(0198): DumpErrorInfo: Error state: 1

Severity:              16

[3012184687,0x0011f8:0x164c:0x5d93de09] modules\sqllib\sqlconnect.cpp(0239): DumpErrorInfo: Error message:                BACKUP DATABASE is terminating abnormally.

[3012184687,0x0011f8:0x0b98:0x5d93de09] modules\sqllib\sqlconnect.cpp(0158): DumpErrorInfo: SQLSTATE:  42000

---------------------------------------------------------------------------------

 

從錯誤上看問題應該是發生在SQL Server內部。於是看了SQL ServererrorlogErrorlog裡在相應的時間裡記載瞭如下資訊:

2010-08-05 18:30:33.51 spid10    This SQL Server has been optimized for 8 concurrent queries. This limit has been exceeded by 10 queries and performance may be adversely affected.

2010-08-05 18:30:34.65 spid63    BackupMedium::ReportIoError: write failure on backup device '{D431F72B-C2B1-49C0-9B80-01D7B2095BF9}9'. Operating system error 995(error not found).

2010-08-05 18:30:34.65 spid63    Internal I/O request 0x43333C28: Op: Write, pBuffer: 0x04430400, Size: 1024, Position: 0, UMS: Internal: 0x0, InternalHigh: 0x0, Offset: 0x10001, OffsetHigh: 0x0, m_buf: 0x00030001, m_len: 0, m_actualBytes: 0, m_errcode: 995, BackupFile: {D431F72B-C2B1-49C0-9B80-01D7B2095BF9}9

2010-08-05 18:30:34.65 backup    BACKUP failed to complete the command BACKUP DATABASE [ISALOG_20100512_FWS_000] TO VIRTUAL_DEVICE='{D431F72B-C2B1-49C0-9B80-01D7B2095BF9}9' WITH SNAPSHOT,BUFFERCOUNT=1,BLOCKSIZE=1024

原來客戶使用的SQL ServerMSDE的版本。MSDE版本的SQL Server限制最大工作執行緒數是8個。但是客戶使用了備份工具同時備份了多個資料庫。每個資料庫的備份都佔用了工作執行緒,最後導致有部分資料庫備份的請求由於得不到工作執行緒而失敗。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/25175503/viewspace-704867/,如需轉載,請註明出處,否則將追究法律責任。

相關文章