nginx upstream errors

Posted by weixin_33785972 on 2017-02-09

Errors

upstream server temporarily disabled while connecting to upstream
no live upstreams while connecting to upstream

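max_fails and fail_timeout

max_fails defaults to 1 and fail_timeout defaults to 10 seconds.

nginx uses max_fails (the number of failed attempts allowed) and fail_timeout to decide when a backend server is marked as failed and for how long. Once a server accumulates max_fails failed attempts within fail_timeout, nginx marks it as unavailable and stops connecting to it for the next fail_timeout period. The server stays unavailable for that whole period unless every server in the group has failed. When the period expires, or when all servers have failed, the server is marked available again and nginx probes it once more.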

upstream backend {
    server backend1.example.com weight=5;
    server 127.0.0.1:8080       max_fails=3 fail_timeout=30s;
    server unix:/tmp/backend3;

    server backup1.example.com  backup;
}
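
As a hedged illustration of the defaults stated above: per the nginx documentation, a server line with no parameters behaves like one with max_fails=1 fail_timeout=10s, and max_fails=0 turns failure accounting off for that server. The upstream name and addresses below are placeholders, not taken from the experiment.

```nginx
upstream backend_defaults {
    # No parameters: same as max_fails=1 fail_timeout=10s.
    server 127.0.0.1:8081;
    # Explicit form of the defaults.
    server 127.0.0.1:8082 max_fails=1 fail_timeout=10s;
    # max_fails=0 disables failure accounting, so this server
    # is never marked unavailable by failed attempts.
    server 127.0.0.1:8083 max_fails=0;
}
```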

What counts as a failure

For example:

connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "POST /demo HTTP/1.1", subrequest: "/capture/getstatus", upstream: "http://192.168.99.100:8080/api/demo/

Another example:

upstream timed out (110: Connection timed out) while reading response header from upstream

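By default nginx judges a backend server as failed only on connection errors (such as connection refused) and timeouts, not on HTTP error statuses. If the backend can return an HTTP status at all, it is still reachable, so nginx treats it as alive. This changes when proxy_next_upstream is configured to pass requests to the next server on responses such as 404, 500, 502, 503 and 504 or on timeout. While moving to the next upstream, nginx increments fails for the server that failed; if the next server also fails, the error is returned directly. A 404 is never counted as a failure, and an HTTP error status that is not listed in proxy_next_upstream is not counted either.

In summary, nginx counts only six conditions as failures: timeout, connection refused, 500, 502, 503 and 504. Timeouts and connection errors are always counted. The four HTTP statuses are counted in fails only when they are listed in proxy_next_upstream. Once fails is greater than or equal to max_fails, the server is marked as unavailable.

Probing mechanism

If every server, including the backup, is marked as failed, nginx resets all of them to available and starts probing again. If a working server is found, its response is returned. If all of them still fail, the request ends with an error status, 502 by default, but nginx keeps probing on subsequent requests until it finds a working server.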
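
A minimal sketch of the configuration this paragraph describes, assuming a location that proxies to the backend upstream group from the earlier example. The location path is hypothetical. Listing http_500, http_502, http_503 and http_504 makes those responses count toward fails; error and timeout are counted whether or not they are listed.

```nginx
location /api/ {
    proxy_pass http://backend;
    # Pass the request to the next server on connection errors, timeouts
    # and the listed 5xx responses; the listed statuses then count as failures.
    proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
}
```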

Experiment log

upstream test_server {
    server 192.168.99.100:80801;
    server 192.168.99.100:80802;
    server 192.168.99.100:80803;
}

location /api/test/demo {
    proxy_pass http://test_server/api/demo;
}
location /api/demo {
    default_type application/json;
    content_by_lua_file conf/lua/demo.lua;
}

lua

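local cjson = require "cjson.safe"

-- Issue a POST subrequest to the location that proxies to the upstream group.
local testres = ngx.location.capture("/api/test/demo", {
    method = ngx.HTTP_POST,
    body = "arg1=xxxx&arg2=xxxxx"
})
-- Logged at error level so the subrequest status shows up in error.log.
ngx.log(ngx.ERR, "status" .. testres.status)
-- cjson.safe returns nil instead of raising an error on invalid JSON.
local testbody = cjson.decode(testres.body)
ngx.log(ngx.ERR, testbody == nil)

A request to 192.168.99.100:8080/api/demo runs the Lua script, which issues a capture subrequest to /api/test/demo.
First request: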

2017/02/09 14:48:57 [error] 5#5: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80801/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [warn] 5#5: *1 upstream server temporarily disabled while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80801/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [error] 5#5: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80802/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [warn] 5#5: *1 upstream server temporarily disabled while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80802/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [error] 5#5: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80803/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [warn] 5#5: *1 upstream server temporarily disabled while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://192.168.99.100:80803/api/demo", host: "192.168.99.100:8080"
2017/02/09 14:48:57 [error] 5#5: *1 [lua] demo.lua:44: status502 while sending to client, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", host: "192.168.99.100:8080"

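nginx tries each server in the upstream group in turn and every attempt fails, so the capture subrequest returns 502. The status code returned to the client is decided by the Lua script.

Second request: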
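
If trying every server is not desirable, the number of attempts per request can be bounded. The sketch below assumes nginx 1.7.5 or later, where proxy_next_upstream_tries and proxy_next_upstream_timeout are available. The values are illustrative only and were not part of the experiment.

```nginx
location /api/test/demo {
    proxy_pass http://test_server/api/demo;
    # Limit the number of tries per request to 2
    # (0, the default, means no limit).
    proxy_next_upstream_tries 2;
    # Stop passing the request to further servers after 5 seconds in total.
    proxy_next_upstream_timeout 5s;
}
```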

2017/02/09 15:09:34 [error] 6#6: *11 no live upstreams while connecting to upstream, client: 192.168.99.1, server: , request: "POST /api/demo HTTP/1.1", subrequest: "/api/test/demo", upstream: "http://test_server/api/demo", host: "192.168.99.100:8080"

When every server in the upstream group is down, nginx logs no live upstreams while connecting to upstream.

