SQL Server database mail問題診斷一例

宅慕思_發表於2019-10-31

產品環境sql server database的mail發不出郵件,影響客戶的業務,在資料庫中進行診斷

診斷sql:

EXEC msdb.dbo.sp_send_dbmail

@profile_name = 'DB Messaging',

@recipients = 'jamesz@gmail.com',

@body = 'James_test_2019',

@body_format = 'HTML',

@subject = 'James_test_2019'

執行過後並沒有收到郵件,查詢event log表,也沒有任何記錄:

SELECT * FROM msdb.dbo.sysmail_event_log

ORDER BY log_id DESC


檢查郵件的狀態,都是unsent:

SELECT top 100 * FROM msdb.dbo.sysmail_allitems 

ORDER BY send_request_date DESC


根據以下連結的建議進行診斷:


所有診斷項均正常。


同時檢查sql server errorlog, windows event log和windows cluster log,發現週末出現的storage異常和mail功能失效關聯端倪

The brief analysis for mail can't sent problem:

sql errorlog顯示sql instance的恢復啟動時間是 2019-10-25 11:04:50.050 CDT

但是在 windows cluster log,  clussvc.exe 在時間段 2019/10/25-07:54:02.903 to 2019/10/25-16:04:18.499並沒有記錄

恰好在這段時間,storage 儲存異常

之後sql instance在cluster service有問題的情況下啟動工作

並且獲取了本地local的資訊而不是cluster service的資訊.

windows event log顯示,database mail透過連線. 到db而出現錯誤證明上述猜測

Server Name: ., Database Name: msdb


解決方案就是重啟備用節點後,做一個乾淨狀態的failover,讓sql instance在正常的cluster service基礎之上。


 SQL Server Errorlog:

2019-10-25 11:04:50.050 Server Microsoft SQL Server 2014 (SP3-CU2) (KB4482960) - 12.0.6214.1 (X64)    Feb  2 2019 01:10:18    Copyright (c) Microsoft Corporation   Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3 <X64> (Build 9600: )  

2019-10-25 11:04:50.050 Server UTC adjustment: -5:00

2019-10-25 11:04:50.050 Server (c) Microsoft Corporation.

2019-10-25 11:04:50.050 Server All rights reserved.

2019-10-25 11:04:50.050 Server Server process ID is 5700.

2019-10-25 11:04:50.050 Server System Manufacturer: 'Cisco Systems Inc', System Model: 'UCSB-B200-M3'.

2019-10-25 11:04:50.060 Server Authentication mode is MIXED.

2019-10-25 11:04:50.060 Server Logging SQL Server messages in file 'D:\MSSQL12.MSSQLSERVER\MSSQL\Log\ERRORLOG'.

2019-10-25 11:04:50.060 Server The service account is 'TAN\svc.DBAHFDN01VS.sql'. This is an informational message; no user action is required.

2019-10-25 11:04:50.060 Server Registry startup parameters:     -d D:\MSSQL12.MSSQLSERVER\MSSQL\DATA\master.mdf    -e D:\MSSQL12.MSSQLSERVER\MSSQL\Log\ERRORLOG    -l D:\MSSQL12.MSSQLSERVER\MSSQL\DATA\mastlog.ldf    -T 3226

2019-10-25 11:04:50.060 Server Command Line Startup Parameters:    -s "MSSQLSERVER"

2019-10-25 11:04:50.450 Server SQL Server detected 2 sockets with 10 cores per socket and 20 logical processors per socket, 40 total logical processors; using 40 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

2019-10-25 11:04:50.450 Server SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.

2019-10-25 11:04:52.180 spid9s The NETBIOS name of the local node that is running the server is 'DBAHFDN01N1'. This is an informational message only. No user action is required.



 cluster log:

00002750.0000251c::2019/10/25-07:54:02.903 DBG   [RCM] rcm::RcmResource::Control: (QUORUM_NEW, SWITCH_MONITORS)

00002750.00000ed8::2019/10/25-07:54:02.903 DBG   [RCM] QUORUM_NEW: Ignoring reset because resource flag 2 is not set.

00002750.00000ed8::2019/10/25-07:54:02.903 DBG   [RCM] QUORUM_NEW: Ignoring reset because resource flag 32 is not set.

00002750.0000251c::2019/10/25-07:54:02.903 DBG   [RCM] Switch monitor call for QUORUM_NEW when there were pending controls, enqueuing the switch

00002750.0000251c::2019/10/25-07:54:02.903 DBG   [RCM] Pending controls for QUORUM_NEW are empty now but we're not in a state that can switch monitors; removing SWITCH_MONITORS control.

00002750.00000ed8::2019/10/25-07:54:02.903 DBG   [RCM] rcm::RcmGroup::ComputeFailoverThreshold=> (Cluster Group, 0, computed)

00002750.00000ed8::2019/10/25-07:54:02.903 DBG   [RCM] rcm::RcmResource::StmPreOnlineCheck(QUORUM_NEW)

00000b4c.00000b50::2019/10/25-16:04:18.499 INFO  -----------------------------+ LOG BEGIN +-----------------------------

00000b4c.00000b50::2019/10/25-16:04:18.499 INFO  [CS] Starting clussvc as a service

00000b4c.00000b50::2019/10/25-16:04:18.499 INFO  [CS] cluster service logging level is 5

00000b4c.000009cc::2019/10/25-16:04:18.546 INFO  [CS] Creating cluster node <vector len='1'>

00000b4c.000009cc::2019/10/25-16:04:18.546 INFO      <item>ClusSvc</item>

00000b4c.000009cc::2019/10/25-16:04:18.546 INFO  </vector>

00000b4c.000009bc::2019/10/25-16:04:18.546 INFO  [StartupConfig]: Initializing.


















Windwos event log:

10/30/2019 12:05:03 PM



1) Exception Information

===================

Exception Type: Microsoft.SqlServer.Management.SqlIMail.Server.Common.BaseException

Message: There was an error on the connection. Reason: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server), connection parameters: Server Name: ., Database Name: msdb

Data: System.Collections.ListDictionaryInternal

TargetSite: Void OpenConnection(Microsoft.SqlServer.Management.Common.SqlConnectionInfo)

HelpLink: NULL

Source: DatabaseMailEngine


StackTrace Information

===================

   at Microsoft.SqlServer.Management.SqlIMail.Server.DataAccess.ConnectionManager.OpenConnection(SqlConnectionInfo connectionInfo)

   at Microsoft.SqlServer.Management.SqlIMail.Server.DataAccess.DataAccessAdapter.OpenConnection(String dbServerName, String dbName, String userName, String password, String appName, Int32 connectionTimeout)

   at Microsoft.SqlServer.Management.SqlIMail.Server.DataAccess.DataAccessAdapter.OpenConnection(String dbServerName, String dbName, String userName, String password, Int32 connectionTimeout)

   at Microsoft.SqlServer.Management.SqlIMail.IMailProcess.QueueItemProcesser.ProcessQueueItems(String dbName, String dbServerName, Int32 lifetimeMinimumSec, LogLevel loggingLevel, Byte[] encryptionKey, Int32 connectionTimeout)


2) Exception Information

===================

Exception Type: System.Data.SqlClient.SqlException

Errors: System.Data.SqlClient.SqlErrorCollection

Class: 20

LineNumber: 0

Number: 2

Procedure: NULL

Server: 

State: 0

Source: .Net SqlClient Data Provider

ErrorCode: -2146232060

Message: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)

Data: System.Collections.ListDictionaryInternal

TargetSite: System.Data.ProviderBase.DbConnectionInternal GetConnection(System.Data.Common.DbConnection)

HelpLink: NULL


StackTrace Information

===================

   at System.Data.ProviderBase.DbConnectionPool.GetConnection(DbConnection owningObject)

   at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection)

   at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory)

   at System.Data.SqlClient.SqlConnection.Open()

   at Microsoft.SqlServer.Management.SqlIMail.Server.DataAccess.ConnectionManager.OpenConnection(SqlConnectionInfo connectionInfo)


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/69950462/viewspace-2662172/,如需轉載,請註明出處,否則將追究法律責任。

相關文章