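Problem description
This post explains various client-side errors that can occur when consuming events from Event Hub. (New error messages will be added to this post as they are encountered.)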
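1: Running an Event Hub consumer program on Linux fails with "Too many open files"
Interpretation: This message means that the number of operating-system file handles opened by the Java program has exceeded the operating system's limit. Check whether the file handle limit is still the default of 1024; if it is, raise it.
Use ulimit -a or ulimit -n to check the limit. In the ulimit -a output the relevant line reads: open files (-n) 1024
Configuration file: /etc/security/limits.conf
Add the following lines to that configuration file, then log in again (the limits are applied at session start) and restart the consumer program: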
* soft nofile 65535
* hard nofile 65535
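To confirm from inside the JVM that the new limit is actually in effect, a small check like the one below can help. This is only a sketch: the class name FdUsageCheck, the helper isNearLimit, and the 0.8 threshold are illustrative choices of mine, not part of any Event Hub SDK. It relies on com.sun.management.UnixOperatingSystemMXBean, which is only available on Unix-like JVMs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class FdUsageCheck {
    // True when the open descriptor count has reached the given fraction of the limit.
    // The threshold is an illustrative value, not a recommended setting.
    static boolean isNearLimit(long open, long max, double threshold) {
        return max > 0 && open >= (long) (max * threshold);
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // The descriptor counters are only exposed on Unix-like platforms.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            long open = unix.getOpenFileDescriptorCount();
            long max = unix.getMaxFileDescriptorCount();
            System.out.println("open=" + open + " max=" + max
                    + " nearLimit=" + isNearLimit(open, max, 0.8));
        } else {
            System.out.println("File descriptor counters are not available on this platform.");
        }
    }
}
```

If max still prints 1024 after editing limits.conf, the consumer process was most likely started from a session that predates the change.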
2: New receiver 'nil' with higher epoch of '197' is created hence current receiver 'nil' with epoch '196' is getting disconnected
Error message: java.util.concurrent.CompletionException: com.microsoft.azure.eventhubs.ReceiverDisconnectedException: New receiver 'nil' with higher epoch of '197' is created hence current receiver 'nil' with epoch '196' is getting disconnected. If you are recreating the receiver, make sure a higher epoch is used. TrackingId:xxxxxxxxxxxxxxx, SystemTracker:xxxxxxx:eventhub:xxxxxxx|$default, Timestamp:2020-10-20T15:50:16, errorContext[NS: xxxxxxxxx.servicebus.chinacloudapi.cn, PATH: xxxxxxxxx/ConsumerGroups/$Default/Partitions/3, REFERENCE_ID: xxxxxxxxxx, PREFETCH_COUNT: 300, LINK_CREDIT: 300, PREFETCH_Q_LEN: 0]
java.util.concurrent.ExecutionException: com.microsoft.azure.eventprocessorhost.ExceptionWithAction: java.lang.RuntimeException: Lease lost while updating checkpoint
Interpretation: The consumer program creates a separate consumer thread for each partition, so threads and partitions have a one-to-one relationship. When an additional consumer program consumes the same Event Hub and stores its checkpoints in the same location, the partitions are redistributed among the consumers. Likewise, when one consumer thread fails, the client program tries to recover and takes over all the partitions that the failed thread owned. In both cases the new receiver is created with a higher epoch, which disconnects the previous receiver for that partition (here epoch 197 replaces epoch 196 on partition 3). In most cases this error can be ignored.
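The "higher epoch wins" rule behind this message can be illustrated with a toy model. The sketch below is not the service's implementation and uses no Event Hub SDK classes; EpochModel and tryAttach are hypothetical names, and rejecting an equal epoch is a simplification of mine.

```java
import java.util.HashMap;
import java.util.Map;

final class EpochModel {
    // Highest epoch attached so far, per partition (toy state only).
    private final Map<String, Long> currentEpoch = new HashMap<>();

    // Returns true if a receiver with this epoch may attach to the partition.
    // A strictly higher epoch displaces the current owner; a lower one is refused.
    // Assumption: an equal epoch is also refused here, purely to keep the model simple.
    boolean tryAttach(String partitionId, long epoch) {
        Long owner = currentEpoch.get(partitionId);
        if (owner != null && epoch <= owner) {
            return false;
        }
        currentEpoch.put(partitionId, epoch);
        return true;
    }

    public static void main(String[] args) {
        EpochModel model = new EpochModel();
        System.out.println(model.tryAttach("3", 196)); // first receiver on partition 3
        System.out.println(model.tryAttach("3", 197)); // new owner, epoch 196 is displaced
        System.out.println(model.tryAttach("3", 196)); // the old receiver cannot come back
    }
}
```

In the error above, the receiver holding epoch 196 is the one being displaced, which is why its next checkpoint update fails with "Lease lost".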
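3: com.microsoft.azure.eventprocessorhost.ExceptionWithAction: The client could not finish the operation within specified maximum execution timeout.
Interpretation: After consuming events, the client program hit a network timeout while storing the consumed offset (the checkpoint) in Storage. Check the network conditions on the client side.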
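If the timeouts turn out to be occasional rather than persistent, one possible mitigation is to retry the checkpoint write with a growing delay. The following is a minimal sketch under that assumption: CheckpointRetry and withRetry are hypothetical names, and the Callable stands in for whatever checkpoint call your program makes. It does not replace checking the network.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

final class CheckpointRetry {
    // Runs op up to maxAttempts times, doubling the delay after each failure.
    // Rethrows the last failure if every attempt fails.
    static <T> T withRetry(Callable<T> op, int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated checkpoint write that times out twice before succeeding.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new TimeoutException("simulated storage timeout");
            }
            return "checkpoint stored";
        }, 5, 100L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Keep the attempt count small: if the lease has been taken over by another consumer in the meantime, retrying will not help.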
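4: com.microsoft.azure.eventhubs.EventHubException: The specified partition is invalid for an EventHub partition sender or receiver. It should be between 0 and 1.
Interpretation: The client program specified an invalid partition when consuming Event Hub data. In this example, "It should be between 0 and 1" shows that the Event Hub has two partitions, so the only valid partition IDs are 0 and 1.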
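A simple guard can catch a bad partition ID before it ever reaches the service. This is a sketch only: PartitionIdCheck and isValid are hypothetical names, and it assumes partition IDs are the decimal strings 0 through partitionCount - 1, which matches the range reported in the error message.

```java
final class PartitionIdCheck {
    // Returns true when partitionId parses as an integer in [0, partitionCount).
    // Assumption: IDs are plain decimal strings, as the error message suggests.
    static boolean isValid(String partitionId, int partitionCount) {
        try {
            int id = Integer.parseInt(partitionId);
            return id >= 0 && id < partitionCount;
        } catch (NumberFormatException e) {
            // Covers null, empty, and non-numeric input.
            return false;
        }
    }

    public static void main(String[] args) {
        int partitionCount = 2; // the Event Hub in the error above has partitions 0 and 1
        for (String id : new String[] {"0", "1", "2", "abc"}) {
            System.out.println(id + " valid=" + isValid(id, partitionCount));
        }
    }
}
```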
5: com.microsoft.azure.eventhubs.EventHubException: The supplied offset '0' is invalid. The last offset in the system is '-1'
Interpretation: The client submitted an invalid offset when consuming Event Hub data. The initial offset must not be set to 0.