Problem Description
The Azure APIM service log showed a java.lang.RuntimeException error. The error logs collected through Application Insights then showed that the underlying request error was: “The remote name could not be resolved 'xxxx.xxx.xx'”.
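Solution
When no custom DNS server is configured, the APIM service uses the Azure platform DNS server (168.63.129.16) for name resolution by default.
The virtual machines that host the Azure APIM service run Windows. When several DNS servers are configured, they are tried as described in the Microsoft documentation quoted below.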
The DNS Client service queries the DNS servers in the following order:
- The DNS Client service sends the name query to the first DNS server on the preferred adapter's list of DNS servers and waits one second for a response.
- If the DNS Client service does not receive a response from the first DNS server within one second, it sends the name query to the first DNS server on all adapters that are still under consideration and waits two seconds for a response.
- If the DNS Client service does not receive a response from any DNS server within two seconds, it sends the query to all DNS servers on all adapters that are still under consideration and waits another two seconds for a response.
- If the DNS Client service still does not receive a response from any DNS server, it sends the name query to all DNS servers on all adapters that are still under consideration and waits four seconds for a response.
- If the DNS Client service again does not receive a response from any DNS server, it sends the query to all DNS servers on all adapters that are still under consideration and waits eight seconds for a response.
Source: https://learn.microsoft.com/zh-cn/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/dd197552(v=ws.10)?redirectedfrom=MSDN
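To make the timing concrete, the wait schedule described above can be added up in a few lines of Python. This is only a sketch of the arithmetic in the quoted documentation, not the Windows DNS client's actual implementation; the names `WAIT_SECONDS` and `cumulative_waits` are made up for this illustration.

```python
from itertools import accumulate

# Wait (in seconds) after each of the five query attempts described above.
WAIT_SECONDS = [1, 2, 2, 4, 8]


def cumulative_waits(waits):
    """Return the total elapsed time at the end of each attempt."""
    return list(accumulate(waits))


if __name__ == "__main__":
    for attempt, elapsed in enumerate(cumulative_waits(WAIT_SECONDS), start=1):
        print(f"attempt {attempt}: {elapsed}s elapsed with no response")
    print(f"worst case before the lookup fails: {sum(WAIT_SECONDS)}s")
```

Under this schedule, a lookup for which no server ever answers would take about 17 seconds to fail, which can matter when comparing against request timeouts.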
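The error message “The remote name could not be resolved” states clearly that the domain name could not be resolved, so troubleshooting can proceed as follows:
- If a custom DNS server is configured, check that server's logs for failed resolution attempts.
- If no custom DNS server is configured, the Azure DNS server logs need to be checked. An RCODE of NXDOMAIN (3) in the Azure DNS resolution log means the Azure DNS server found no such domain name, so it has no A record to return for the queried name.
- In addition, when several DNS servers are configured and the first one does not respond, the query is sent to the other DNS servers with progressively longer waits (1, 2, 2, 4, and 8 seconds). If none of them answers, or they return errors, the APIM log records the name as not resolved.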
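As a quick first check alongside the steps above, the failing name can be resolved by hand. The following is a minimal sketch using Python's standard `socket` module. The function name `check_resolution` and its `resolver` parameter are hypothetical, and the script uses the DNS settings of the machine it runs on, so the result is only comparable to APIM's if it is run from a machine that uses the same DNS servers (an assumption you would need to verify).

```python
import socket


def check_resolution(hostname, resolver=socket.getaddrinfo):
    """Try to resolve hostname; return (ok, detail)."""
    try:
        infos = resolver(hostname, None)
    except socket.gaierror as exc:
        # Raised when the name cannot be resolved (for example, no such domain).
        return False, f"resolution failed: {exc}"
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return True, ", ".join(addresses)


if __name__ == "__main__":
    # 'xxxx.xxx.xx' is the placeholder from the error message; substitute the real name.
    ok, detail = check_resolution("xxxx.xxx.xx")
    print("resolved" if ok else "not resolved", "-", detail)
```

The `resolver` argument exists only so the function can be exercised without network access. In normal use the default `socket.getaddrinfo` is enough.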
References
DNS name caching for backend API services in APIM: https://www.cnblogs.com/lulight/p/13590755.html
DNS Processes and Interactions: https://learn.microsoft.com/zh-cn/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/dd197552(v=ws.10)?redirectedfrom=MSDN