記一次線上SpringCloud-Feign請求服務超時異常排查

小鄧學長發表於2022-01-13

      由於近期線上單量暴漲,第三方反饋部分工單業務存在查詢處理失敗現象,經排查是當前系統通過FeignClient呼叫下游系統出現部分超時失敗(異常程式碼貼在下方)。

Caused by: feign.RetryableException: Read timed out executing POST http://xxxx
        at feign.FeignException.errorExecuting(FeignException.java:84) ~[feign-core-10.1.0.jar!/:na]
        at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:113) ~[feign-core-10.1.0.jar!/:na]
        at feign.SynchronousMethodHandler.invoke(SynchronousMethodHandler.java:78) ~[feign-core-10.1.0.jar!/:na]
        at feign.ReflectiveFeign$FeignInvocationHandler.invoke(ReflectiveFeign.java:103) ~[feign-core-10.1.0.jar!/:na]
        at com.sun.proxy.$Proxy141.creditReportConvert(Unknown Source) ~[na:na]
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_121]
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[na:1.8.0_121]
        at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[na:1.8.0_121]
        at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_121]
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[na:1.8.0_121]
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[na:1.8.0_121]
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[na:1.8.0_121]
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) ~[na:1.8.0_121]
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) ~[na:1.8.0_121]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569) ~[na:1.8.0_121]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) ~[na:1.8.0_121]
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[na:1.8.0_121]
        at feign.Client$Default.convertResponse(Client.java:143) ~[feign-core-10.1.0.jar!/:na]
        at feign.Client$Default.execute(Client.java:68) ~[feign-core-10.1.0.jar!/:na]
        at org.springframework.cloud.openfeign.ribbon.FeignLoadBalancer.execute(FeignLoadBalancer.java:93) ~[spring-cloud-openfeign-core-2.1.1.RELEASE.jar!/:2.1.1.RELEASE]
        at org.springframework.cloud.openfeign.ribbon.FeignLoadBalancer.execute(FeignLoadBalancer.java:56) ~[spring-cloud-openfeign-core-2.1.1.RELEASE.jar!/:2.1.1.RELEASE]
        at com.netflix.client.AbstractLoadBalancerAwareClient$1.call(AbstractLoadBalancerAwareClient.java:104) ~[ribbon-loadbalancer-2.3.0.jar!/:2.3.0]
        at com.netflix.loadbalancer.reactive.LoadBalancerCommand$3$1.call(LoadBalancerCommand.java:303) ~[ribbon-loadbalancer-2.3.0.jar!/:2.3.0]

通過系統慢請求捕捉攔截,發現當前請求僅耗時1031毫秒,就觸發Read timed out超時錯誤,本專案與下游專案均註冊在Eureka上面,對這個1秒就超時感到很迷惑,於是開始查閱底層原始碼之旅。

通過跟蹤程式碼可以在feign.Request內部類Options建構函式預設配置連線超時10秒,讀超時60秒。然而本次呼叫請求僅耗時1秒左右就被掐斷連線提示超時錯誤,初步判斷預設這個超時配置並未生效導致。 

寫到這,我們們先回顧下Feign鏈路完整呼叫流程步驟。

可以看出Feign呼叫分為Hystrix、Ribbon兩層,一般高版本的Hystrix預設是關閉的(本專案Hystrix預設是關閉的,那麼現在需要去分析Ribbon層呼叫配置資訊)

可以看到RibbonClientConfiguration預設配置的讀超時和連線超時時間1000毫秒=1秒,在沒有配置超時情況下當前這個超時是符合本次呼叫超時錯誤觸發。

我們可以在FeignLoadBalancer的execute方法中看到當IClientConfig為空的時候進行覆蓋超時時間,預設是取Ribbon的超時時間,並不是Feign Options預設的超時時間。

public FeignLoadBalancer(ILoadBalancer lb, IClientConfig clientConfig,
            ServerIntrospector serverIntrospector) {
        super(lb, clientConfig);
        this.setRetryHandler(RetryHandler.DEFAULT);
        this.clientConfig = clientConfig;
        this.ribbon = RibbonProperties.from(clientConfig);
        RibbonProperties ribbon = this.ribbon;
        this.connectTimeout = ribbon.getConnectTimeout();
        this.readTimeout = ribbon.getReadTimeout();
        this.serverIntrospector = serverIntrospector;
    }

如果application.properties檔案中有配置超時時間,則使用配置的超時時間。否則是取Ribbon預設的超時時間,也就是網上所說的Feign呼叫服務的預設時長是1秒鐘,也就是如果超過1秒沒連線上或者超過1秒沒響應,那麼會相應的報錯。

實際業務中服務如果響應時間超過1秒,我們們可以按照實際響應情況給予配置相對應的超時時間,下面分別貼出properties和yml版本的配置

#properties版本

feign.client.config.default.connectTimeout=60000
feign.client.config.default.readTimeout=60000
#yml版本

feign:
  client:
    config:
      default:
        connectTimeout: 60000
        readTimeout: 60000

 

相關文章