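Streaming an HTTP request body in Go

Background

While building a feature recently, I needed to upload a large volume of log data over HTTP. The HTTP client API in the Go standard library, however, looks like this: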

http.NewRequest(method, url string, body io.Reader)

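The body is passed in as an io.Reader; the API does not expose an io.Writer to write into. First, here is how a body is normally sent:

buf := bytes.NewBuffer([]byte("hello"))
http.Post("http://localhost:8099/report", "text/plain", buf)

The data has to be placed in a Buffer first, which holds all of it in memory. I need to send a large amount of data, and keeping everything in memory risks an OOM. The HTTP client offers no explicit streaming-write method, so a plain Buffer is not an option at this volume. After some googling I found a solution.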

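Using io.Pipe

Calling io.Pipe() returns a connected reader and writer (*io.PipeReader and *io.PipeWriter): whatever is written to the writer can be read from the reader. The pipe is synchronous and has no internal buffer, so each Write blocks until the data has been read, and memory use stays bounded. This makes streaming writes possible: start a goroutine that writes, pass the reader to the HTTP call, and the HTTP client body is streamed.

Code example:

pr, pw := io.Pipe()
// Write a large amount of data from a separate goroutine
go func() {
    for i := 0; i < 100000; i++ {
        fmt.Fprintf(pw, "line:%d\r\n", i)
    }
    // Closing the writer signals EOF to the reader
    pw.Close()
}()
// Pass the reader as the request body
http.Post("http://localhost:8099/report", "text/plain", pr)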
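To try the pipe approach without a separate server process, here is a minimal self-contained sketch. It assumes a throwaway server from net/http/httptest in place of the real log endpoint; the helper name streamLines and the line-counting handler are made up for illustration. It also checks the write error, so the goroutine stops if the request is aborted and the reader side is closed.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// streamLines (hypothetical helper) posts n lines through an io.Pipe and
// returns how many lines the test server counted.
func streamLines(n int) (int, error) {
	counts := make(chan int, 1)
	// Throwaway server standing in for the real report endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		c := 0
		sc := bufio.NewScanner(r.Body)
		for sc.Scan() {
			c++
		}
		counts <- c
	}))
	defer srv.Close()

	pr, pw := io.Pipe()
	go func() {
		for i := 0; i < n; i++ {
			if _, err := fmt.Fprintf(pw, "line:%d\r\n", i); err != nil {
				// The reader side was closed, for example because the request failed.
				pw.CloseWithError(err)
				return
			}
		}
		pw.Close() // signals EOF to the HTTP client
	}()

	resp, err := http.Post(srv.URL+"/report", "text/plain", pr)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return <-counts, nil
}

func main() {
	n, err := streamLines(100000)
	if err != nil {
		fmt.Println("post failed:", err)
		return
	}
	fmt.Println("lines received by server:", n)
}
```

Because the pipe has no buffer, the writer goroutine only advances as fast as the client sends data, so the 100000 lines are never held in memory at once.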

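Reading the source

Goal

Understand how Go's HTTP client handles the transfer of the body.

Getting started

When the Request is built, the type of the body argument is checked with a type switch. When it is *bytes.Buffer, *bytes.Reader, or *strings.Reader, the length can be read directly with Len() and is used for the Content-Length request header. Relevant code, net/http/request.go#L872-L914: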

if body != nil {
    switch v := body.(type) {
    case *bytes.Buffer:
        req.ContentLength = int64(v.Len())
        buf := v.Bytes()
        req.GetBody = func() (io.ReadCloser, error) {
            r := bytes.NewReader(buf)
            return ioutil.NopCloser(r), nil
        }
    case *bytes.Reader:
        req.ContentLength = int64(v.Len())
        snapshot := *v
        req.GetBody = func() (io.ReadCloser, error) {
            r := snapshot
            return ioutil.NopCloser(&r), nil
        }
    case *strings.Reader:
        req.ContentLength = int64(v.Len())
        snapshot := *v
        req.GetBody = func() (io.ReadCloser, error) {
            r := snapshot
            return ioutil.NopCloser(&r), nil
        }
    default:
    }
    if req.GetBody != nil && req.ContentLength == 0 {
        req.Body = NoBody
        req.GetBody = func() (io.ReadCloser, error) { return NoBody, nil }
    }
}
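The effect of this type switch can be observed from the outside. The sketch below builds a request for each body type and reports ContentLength and whether GetBody was set; the result type and the inspect helper are names invented for this example, and the URL is the one used elsewhere in this article.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// result records what http.NewRequest derived from one kind of body.
type result struct {
	Type       string // dynamic type of the body
	Length     int64  // req.ContentLength
	Replayable bool   // whether req.GetBody was populated
}

// inspect (hypothetical helper) builds a POST request for each body type.
func inspect() []result {
	pr, pw := io.Pipe()
	defer pw.Close()
	bodies := []io.Reader{
		bytes.NewBufferString("hello"),
		bytes.NewReader([]byte("hello")),
		strings.NewReader("hello"),
		pr, // length cannot be known in advance
	}
	var out []result
	for _, b := range bodies {
		req, err := http.NewRequest("POST", "http://localhost:8099/report", b)
		if err != nil {
			panic(err)
		}
		out = append(out, result{fmt.Sprintf("%T", b), req.ContentLength, req.GetBody != nil})
	}
	return out
}

func main() {
	for _, r := range inspect() {
		fmt.Printf("%-16s ContentLength=%d GetBody=%v\n", r.Type, r.Length, r.Replayable)
	}
}
```

The three in-memory types report a length of 5 and a usable GetBody, while the pipe reader falls through to the default case and keeps ContentLength at 0.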

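When the request is about to be written to the connection, body and the ContentLength obtained in the previous step are checked: if body != nil and ContentLength == 0, chunked encoding may be used for the transfer. Relevant code, net/http/transfer.go#L82-L96: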

case *Request:
    if rr.ContentLength != 0 && rr.Body == nil {
        return nil, fmt.Errorf("http: Request.ContentLength=%d with nil Body", rr.ContentLength)
    }
    t.Method = valueOrDefault(rr.Method, "GET")
    t.Close = rr.Close
    t.TransferEncoding = rr.TransferEncoding
    t.Header = rr.Header
    t.Trailer = rr.Trailer
    t.Body = rr.Body
    t.BodyCloser = rr.Body
    // When body is non-nil (and not NoBody) and ContentLength == 0, this returns -1
    t.ContentLength = rr.outgoingLength()
    // If TransferEncoding was not set manually and the method is PUT, POST, or PATCH, chunked encoding is enabled
    if t.ContentLength < 0 && len(t.TransferEncoding) == 0 && t.shouldSendChunkedRequestBody() {
        t.TransferEncoding = []string{"chunked"}
    }
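The same decision can be seen without sending anything: Request.Write serializes a request in wire format and goes through this transfer-writer logic. The following is a sketch under that assumption; wireFormat and isChunked are helper names made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// wireFormat (hypothetical helper) serializes a POST request whose body
// comes from an io.Pipe and returns the bytes that would go on the wire.
func wireFormat(lines int) (string, error) {
	pr, pw := io.Pipe()
	go func() {
		for i := 0; i < lines; i++ {
			fmt.Fprintf(pw, "line:%d\r\n", i)
		}
		pw.Close()
	}()
	req, err := http.NewRequest("POST", "http://localhost:8099/report", pr)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	// Request.Write emits the request line, headers, and the framed body.
	if err := req.Write(&buf); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// isChunked reports whether the serialized request declares chunked
// encoding and ends with the terminating zero-length chunk.
func isChunked(wire string) bool {
	return strings.Contains(wire, "Transfer-Encoding: chunked\r\n") &&
		strings.HasSuffix(wire, "0\r\n\r\n")
}

func main() {
	out, err := wireFormat(3)
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Printf("%q\n", out)
	fmt.Println("chunked:", isChunked(out))
}
```

Printing with %q keeps the \r\n sequences visible, which makes the chunk-size lines and the final zero-length chunk easy to spot.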

Verification (1)

Based on this reading of the source, a streaming transfer through io.Pipe() should use chunked encoding. The following code verifies it:

Server

func main(){
    http.HandleFunc("/report", func(writer http.ResponseWriter, request *http.Request) {

    })
    http.ListenAndServe(":8099", nil)
}

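Client

func main() {
    pr, pw := io.Pipe()
    go func() {
        for i := 0; i < 100; i++ {
            fmt.Fprintf(pw, "line:%d\r\n", i)
        }
        pw.Close()
    }()
    http.Post("http://localhost:8099/report", "text/plain", pr)
}

Start the server first, then run the client, and capture the traffic with Wireshark. The capture shows the following:

The result matches the expectation: the request body is sent with chunked encoding.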
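If a packet capture is not at hand, the server side offers a cross-check: the handler can look at r.TransferEncoding and r.ContentLength. The sketch below assumes an httptest server instead of the real one on port 8099; the framing type and the observe helper are invented for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// framing is what the server saw about how the request body was framed.
type framing struct {
	TransferEncoding []string
	ContentLength    int64
}

// observe (hypothetical helper) posts either a pipe-backed body or an
// in-memory body and returns the framing seen by the handler.
func observe(streaming bool) (framing, error) {
	seen := make(chan framing, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body)
		seen <- framing{r.TransferEncoding, r.ContentLength}
	}))
	defer srv.Close()

	var body io.Reader = strings.NewReader("hello")
	if streaming {
		pr, pw := io.Pipe()
		go func() {
			for i := 0; i < 100; i++ {
				fmt.Fprintf(pw, "line:%d\r\n", i)
			}
			pw.Close()
		}()
		body = pr
	}

	resp, err := http.Post(srv.URL+"/report", "text/plain", body)
	if err != nil {
		return framing{}, err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	for _, streaming := range []bool{true, false} {
		f, err := observe(streaming)
		if err != nil {
			fmt.Println("post failed:", err)
			return
		}
		fmt.Printf("streaming=%v TransferEncoding=%v ContentLength=%d\n",
			streaming, f.TransferEncoding, f.ContentLength)
	}
}
```

For the pipe-backed body the handler should report a chunked transfer encoding with a ContentLength of -1 (unknown), while the in-memory body arrives with a plain Content-Length of 5.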

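Verification (2)

With large payloads, chunked encoding adds overhead, both for encoding and decoding and for the extra bytes on the wire. Can we stream without chunked encoding? The source shows that chunked encoding is only chosen when ContentLength is 0, so if the body size can be computed in advance and ContentLength is set to it, chunked encoding should be avoided. That is the idea; next comes code to verify it:

Server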

func main(){
    http.HandleFunc("/report", func(writer http.ResponseWriter, request *http.Request) {

    })
    http.ListenAndServe(":8099", nil)
}

Client

count := 100
line := []byte("line\r\n")
pr, pw := io.Pipe()
go func() {
    for i := 0; i < count; i++ {
        pw.Write(line)
    }
    pw.Close()
}()
// Build the request object
request, err := http.NewRequest("POST", "http://localhost:8099/report", pr)
if err != nil {
    log.Fatal(err)
}
// Compute ContentLength ahead of time
request.ContentLength = int64(len(line) * count)
// Send the request
http.DefaultClient.Do(request)

Capture result:

The request is indeed sent with a Content-Length header, and no chunked encoding is applied.
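The same check can be made in code by serializing the request with Request.Write, and it also shows what happens when the declared length is wrong. This is a sketch, not the article's original test: serialize and hasHeader are helper names made up here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// serialize (hypothetical helper) streams count copies of "line\r\n"
// through a pipe and writes the request in wire format. declared is the
// value assigned to request.ContentLength.
func serialize(count int, declared int64) (string, error) {
	line := []byte("line\r\n")
	pr, pw := io.Pipe()
	go func() {
		for i := 0; i < count; i++ {
			if _, err := pw.Write(line); err != nil {
				return // reader side was closed
			}
		}
		pw.Close()
	}()
	req, err := http.NewRequest("POST", "http://localhost:8099/report", pr)
	if err != nil {
		return "", err
	}
	req.ContentLength = declared
	var buf bytes.Buffer
	err = req.Write(&buf)
	return buf.String(), err
}

// hasHeader reports whether the serialized request contains the header line.
func hasHeader(wire, header string) bool {
	return strings.Contains(wire, header+"\r\n")
}

func main() {
	out, err := serialize(100, 600) // 100 lines of 6 bytes each
	fmt.Println("error:", err)
	fmt.Println("Content-Length header:", hasHeader(out, "Content-Length: 600"))
	fmt.Println("chunked:", hasHeader(out, "Transfer-Encoding: chunked"))

	// Declaring a length that does not match the streamed body is reported as an error.
	_, err = serialize(100, 10)
	fmt.Println("mismatch error:", err)
}
```

The design trade-off is that the declared length must be exact: as far as I can tell, net/http returns an error when the number of bytes read from the body differs from ContentLength, so this optimization only fits data whose size is known before streaming starts.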

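Summary

This article records how to stream a request body with Go's HTTP client and, by reading the source, how the client handles writing the body internally. The two verifications show that when ContentLength can be computed in advance and performance requirements are strict, setting ContentLength manually avoids chunked encoding and its overhead.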
