Go Concurrency Patterns: Pipelines and Explicit Cancellation

Posted by Codefor on 2014-04-25

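Introduction

Go's concurrency primitives make it easy to build streaming data pipelines that make efficient use of I/O and multiple CPU cores. This article walks through several pipeline examples, highlights the subtleties that arise when operations fail, and introduces techniques for handling failures cleanly.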

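What is a pipeline?

Go has no formal definition of a pipeline; it is just one of many kinds of concurrent program. Informally, a pipeline is a series of stages connected by channels, where each stage is a group of goroutines running the same function. In each stage, the goroutines

• receive values from upstream via inbound channels
• perform some operation on that data, usually producing new values
• send values downstream via outbound channels

Each stage can have any number of inbound and outbound channels, except the first and last stages, which have only outbound or only inbound channels, respectively. The first stage is usually called the source or producer; the last stage is called the sink or consumer.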

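We'll begin with a simple example pipeline to explain these ideas and techniques, and present a more realistic example later.

Squaring numbers

Consider a pipeline with three stages.

The first stage, gen, is a function that converts a list of integers into a channel that emits them. gen starts a goroutine that sends the integers on the channel and closes the channel once all the values have been sent: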

func gen(nums ...int) <-chan int {
    out := make(chan int)
    go func() {
        for _, n := range nums {
            out <- n
        }
        close(out)
    }()
    return out
}

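The second stage, sq, receives integers from a channel and returns a channel that emits the square of each received integer. After the inbound channel is closed and this stage has sent all of its values downstream, it closes the outbound channel: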

func sq(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            out <- n * n
        }
        close(out)
    }()
    return out
}

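The main function sets up the pipeline and runs the final stage: it receives values from the second stage and prints each one until the channel is closed: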

func main() {
    // Set up the pipeline.
    c := gen(2, 3)
    out := sq(c)

    // Consume the output.
    fmt.Println(<-out) // 4
    fmt.Println(<-out) // 9
}

Since sq has the same type for its inbound and outbound channels, we can compose it any number of times. We can also rewrite main as a range loop, like the other stages:

func main() {
    // Set up the pipeline and consume the output.
    for n := range sq(sq(gen(2, 3))) {
        fmt.Println(n) // 16 then 81
    }
}

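Fan-out, fan-in

Multiple functions can read from the same channel until that channel is closed; this is called fan-out. It provides a way to distribute work among a group of workers to parallelize CPU use and I/O.

A function can read from multiple inputs and proceed until all of them are closed by multiplexing the input channels onto a single channel that is closed only once all the inputs are closed. This is called fan-in.

We can change our pipeline to run two instances of sq, each reading from the same input channel. We introduce a new function, merge, to fan in the results: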

func main() {
    in := gen(2, 3)

    // Distribute the sq work across two goroutines that both read from in.
    c1 := sq(in)
    c2 := sq(in)

    // Consume the merged output from c1 and c2.
    for n := range merge(c1, c2) {
        fmt.Println(n) // 4 then 9, or 9 then 4
    }
}

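The merge function converts a list of channels into a single channel by starting a goroutine for each inbound channel that copies its values to the sole outbound channel. Once all the output goroutines have been started, merge starts one more goroutine to close the outbound channel after all sends on it are done.
Sending on a closed channel panics, so it is important to ensure that all sends are done before calling close. The sync.WaitGroup type provides a simple way to arrange this synchronization: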

func merge(cs ...<-chan int) <-chan int {
    var wg sync.WaitGroup
    out := make(chan int)

    // Start an output goroutine for each input channel in cs.  output
    // copies values from c to out until c is closed, then calls wg.Done.
    output := func(c <-chan int) {
        for n := range c {
            out <- n
        }
        wg.Done()
    }
    wg.Add(len(cs))
    for _, c := range cs {
        go output(c)
    }

    // Start a goroutine to close out once all the output goroutines are
    // done.  This must start after the wg.Add call.
    go func() {
        wg.Wait()
        close(out)
    }()
    return out
}
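
The panic mentioned above is easy to observe in isolation. The following is a minimal sketch, not part of the original article; trySend is a hypothetical helper that uses recover only to turn the runtime panic into a boolean for demonstration purposes.

```go
package main

import "fmt"

// trySend (hypothetical helper) sends v on ch and reports whether the send
// panicked. Sending on a closed channel causes a runtime panic; recover is
// used here only to make that observable without crashing the program.
func trySend(ch chan int, v int) (panicked bool) {
    defer func() {
        if r := recover(); r != nil {
            panicked = true
        }
    }()
    ch <- v
    return false
}

func main() {
    open := make(chan int, 1)
    fmt.Println(trySend(open, 1)) // false: the buffered send succeeds

    closed := make(chan int, 1)
    close(closed)
    fmt.Println(trySend(closed, 1)) // true: send on a closed channel panics
}
```

Real pipeline code should not rely on recover for this. The point of the WaitGroup in merge is that close is only reached after every sender has finished.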

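Stopping short

There is a pattern to our pipeline functions:

• stages close their outbound channels when all the send operations are done.
• stages keep receiving values from inbound channels until those channels are closed.

This pattern allows each receiving stage to be written as a range loop and ensures that all goroutines exit once all values have been successfully sent downstream.

But in real pipelines, stages don't always receive all of the inbound values. Sometimes this is by design: the receiver may only need a subset of the values to make progress. More often, a stage exits early because an inbound value represents an error in an earlier stage. In either case the receiver should not have to wait for the remaining values to arrive, and we want earlier stages to stop producing values that later stages don't need.

In our example pipeline, if a stage fails to consume all of the inbound values, the goroutines attempting to send those values will block indefinitely: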

    // Consume the first value from output.
    out := merge(c1, c2)
    fmt.Println(<-out) // 4 or 9
    return
    // Since we didn't receive the second value from out,
    // one of the output goroutines is hung attempting to send it.
}

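This is a resource leak: goroutines consume memory and runtime resources, and heap references held on goroutine stacks keep data from being garbage collected. Goroutines themselves are not garbage collected; they must exit on their own.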
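
One way to see such a leak is to compare runtime.NumGoroutine before and after starting senders that nobody receives from. This is an illustrative sketch only; leakOne and leaked are hypothetical names, and the exact count can vary if the runtime or a test harness starts or stops goroutines of its own.

```go
package main

import (
    "fmt"
    "runtime"
)

// leakOne starts a goroutine that sends on an unbuffered channel nobody
// reads. The goroutine blocks forever on the send and is never reclaimed.
func leakOne() {
    ch := make(chan int)
    go func() {
        ch <- 1 // blocks forever: there is no receiver
    }()
}

// leaked reports how many additional goroutines exist after n calls to
// leakOne.
func leaked(n int) int {
    before := runtime.NumGoroutine()
    for i := 0; i < n; i++ {
        leakOne()
    }
    return runtime.NumGoroutine() - before
}

func main() {
    // Typically prints 10: every blocked sender is still alive.
    fmt.Println("leaked goroutines:", leaked(10))
}
```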

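We need to arrange for the upstream stages of our pipeline to exit even when the downstream stages fail to receive all of the inbound values. One way to do this is to give the outbound channels a buffer. A buffer can hold a fixed number of values; send operations complete immediately if there is room in the buffer: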

c := make(chan int, 2) // buffer size 2
c <- 1  // succeeds immediately
c <- 2  // succeeds immediately
c <- 3  // blocks until another goroutine does <-c and receives 1
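
The same behaviour can be checked without blocking the program by using a select with a default case. This is a small sketch under that assumption; sendNow is a hypothetical helper, not something from the article.

```go
package main

import "fmt"

// sendNow (hypothetical helper) attempts a non-blocking send and reports
// whether the value was accepted.
func sendNow(c chan int, v int) bool {
    select {
    case c <- v:
        return true
    default:
        return false // buffer is full and no receiver is ready
    }
}

func main() {
    c := make(chan int, 2) // buffer size 2

    ok := sendNow(c, 1)
    fmt.Println(ok, len(c), cap(c)) // true 1 2
    ok = sendNow(c, 2)
    fmt.Println(ok, len(c), cap(c)) // true 2 2
    ok = sendNow(c, 3)
    fmt.Println(ok, len(c), cap(c)) // false 2 2: a plain send would block here

    <-c // receives 1 and frees one buffer slot
    ok = sendNow(c, 3)
    fmt.Println(ok, len(c), cap(c)) // true 2 2
}
```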

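When the number of values to be sent is known at channel creation time, a buffer can simplify the code. For example, we can rewrite gen to copy the list of integers into a buffered channel and avoid creating a new goroutine: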

func gen(nums ...int) <-chan int {
    out := make(chan int, len(nums))
    for _, n := range nums {
        out <- n
    }
    close(out)
    return out
}

Returning to the blocked goroutines in our pipeline, we might consider adding a buffer to the outbound channel returned by merge:

func merge(cs ...<-chan int) <-chan int {
    var wg sync.WaitGroup
    out := make(chan int, 1) // enough space for the unread inputs
    // ... the rest is unchanged ...

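While this fixes the blocked goroutine in this program, it is bad code. The choice of a buffer size of 1 depends on knowing how many values merge will receive and how many values downstream stages will consume. This is fragile: if we pass one more value to gen, or if the downstream stage reads fewer values, we will again have blocked goroutines.

Instead, we need to provide a way for downstream stages to tell the senders that they will stop accepting input.

Explicit cancellation

When main decides to exit without receiving all the values from out, it must tell the goroutines in the upstream stages to abandon the values they are trying to send. It does so by sending values on a channel called done. It sends two values because there are potentially two blocked senders: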

func main() {
    in := gen(2, 3)

    // Distribute the sq work across two goroutines that both read from in.
    c1 := sq(in)
    c2 := sq(in)

    // Consume the first value from output.
    done := make(chan struct{}, 2)
    out := merge(done, c1, c2)
    fmt.Println(<-out) // 4 or 9

    // Tell the remaining senders we're leaving.
    done <- struct{}{}
    done <- struct{}{}
}

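The sending goroutines replace their send operation with a select statement that proceeds either when the send on out happens or when they receive a value from done. The value type of done is the empty struct because the value doesn't matter: it is the receive event that indicates the send on out should be abandoned. The output goroutines continue looping on their inbound channel, c, so the upstream stages are not blocked. (We'll discuss in a moment how to let this loop return early.)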

func merge(done <-chan struct{}, cs ...<-chan int) <-chan int {
    var wg sync.WaitGroup
    out := make(chan int)

    // Start an output goroutine for each input channel in cs.  output
    // copies values from c to out until c is closed or it receives a value
    // from done, then output calls wg.Done.
    output := func(c <-chan int) {
        for n := range c {
            select {
            case out <- n:
            case <-done:
            }
        }
        wg.Done()
    }
    // ... the rest is unchanged ...

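This approach has a problem: each downstream receiver needs to know the number of potentially blocked upstream senders and arrange to signal those senders on early return. Keeping track of these counts is tedious and error-prone.

We need a way to tell an unknown and unbounded number of goroutines to stop sending their values downstream. In Go, we can do this by closing a channel, because a receive operation on a closed channel always proceeds immediately, yielding the element type's zero value.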
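
A short sketch of that property, assuming nothing beyond the standard library: waitAll is a hypothetical helper that parks n goroutines on a done channel and shows that a single close releases all of them.

```go
package main

import (
    "fmt"
    "sync"
)

// waitAll (hypothetical helper) starts n goroutines that block receiving
// from done, closes done once, and returns how many goroutines were released.
func waitAll(n int) int {
    done := make(chan struct{})
    var wg sync.WaitGroup
    var mu sync.Mutex
    released := 0

    wg.Add(n)
    for i := 0; i < n; i++ {
        go func() {
            defer wg.Done()
            <-done // proceeds for every receiver once done is closed
            mu.Lock()
            released++
            mu.Unlock()
        }()
    }

    close(done) // a single close unblocks all n receivers
    wg.Wait()
    return released
}

func main() {
    ch := make(chan int)
    close(ch)
    v, ok := <-ch
    fmt.Println(v, ok) // 0 false: zero value, and ok reports the channel is closed

    fmt.Println(waitAll(5)) // 5
}
```

Sending one value per receiver would require knowing n in advance; closing the channel does not.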

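This means that main can unblock all the senders simply by closing the done channel. The close is effectively a broadcast signal to the senders. We extend each of our pipeline functions to accept done as a parameter and arrange for the close to happen via a defer statement, so that every return path from main signals the pipeline stages to exit: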

func main() {
    // Set up a done channel that's shared by the whole pipeline,
    // and close that channel when this pipeline exits, as a signal
    // for all the goroutines we started to exit.
    done := make(chan struct{})
    defer close(done)

    in := gen(done, 2, 3)

    // Distribute the sq work across two goroutines that both read from in.
    c1 := sq(done, in)
    c2 := sq(done, in)

    // Consume the first value from output.
    out := merge(done, c1, c2)
    fmt.Println(<-out) // 4 or 9

    // done will be closed by the deferred call.
}

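Each of our pipeline stages is now free to return as soon as done is closed. The output routine in merge can return without draining its inbound channel, since it knows the upstream sender, sq, will stop attempting to send when done is closed. output ensures wg.Done is called on all return paths via a defer statement: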

func merge(done <-chan struct{}, cs ...<-chan int) <-chan int {
    var wg sync.WaitGroup
    out := make(chan int)

    // Start an output goroutine for each input channel in cs.  output
    // copies values from c to out until c or done is closed, then calls
    // wg.Done.
    output := func(c <-chan int) {
        defer wg.Done()
        for n := range c {
            select {
            case out <- n:
            case <-done:
                return
            }
        }
    }
    // ... the rest is unchanged ...

Similarly, sq can return as soon as done is closed. sq ensures its out channel is closed on all return paths via a defer statement:

func sq(done <-chan struct{}, in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for n := range in {
            select {
            case out <- n * n:
            case <-done:
                return
            }
        }
    }()
    return out
}

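Here are the guidelines for pipeline construction:

• stages close their outbound channels when all the send operations are done.
• stages keep receiving values from inbound channels until those channels are closed or the senders are unblocked (translator's note: that is, the producers have exited).

Pipelines unblock senders either by ensuring there is enough buffer for all the values that are sent or by explicitly signalling senders when the receiver may abandon the channel.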
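
Putting the cancellable stages together gives a program along the following lines. This is a sketch, not the article's exact source: sq and merge follow the versions shown above, but the body of gen taking a done parameter is not shown in the article, so the version here is an assumption, and firstSquare is a hypothetical helper added so the early-exit path is easy to call.

```go
package main

import (
    "fmt"
    "sync"
)

// gen is an ASSUMED cancellable source stage; the article calls
// gen(done, 2, 3) but does not show this body.
func gen(done <-chan struct{}, nums ...int) <-chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for _, n := range nums {
            select {
            case out <- n:
            case <-done:
                return
            }
        }
    }()
    return out
}

// sq squares each inbound value until in is drained or done is closed.
func sq(done <-chan struct{}, in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for n := range in {
            select {
            case out <- n * n:
            case <-done:
                return
            }
        }
    }()
    return out
}

// merge fans in cs onto one channel and closes it once every copier exits.
func merge(done <-chan struct{}, cs ...<-chan int) <-chan int {
    var wg sync.WaitGroup
    out := make(chan int)
    output := func(c <-chan int) {
        defer wg.Done()
        for n := range c {
            select {
            case out <- n:
            case <-done:
                return
            }
        }
    }
    wg.Add(len(cs))
    for _, c := range cs {
        go output(c)
    }
    go func() {
        wg.Wait()
        close(out)
    }()
    return out
}

// firstSquare (hypothetical helper) takes one result and cancels the rest.
func firstSquare(nums ...int) int {
    done := make(chan struct{})
    defer close(done) // broadcast cancellation on every return path
    in := gen(done, nums...)
    return <-merge(done, sq(done, in), sq(done, in))
}

func main() {
    fmt.Println(firstSquare(2, 3)) // 4 or 9
}
```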

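Digesting a tree

Let's consider a more realistic pipeline.

MD5 is a message-digest algorithm that is useful as a file checksum. The command-line utility md5sum prints digest values for a list of files: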

% md5sum *.go
d47c2bbc28298ca9befdfbc5d3aa4e65  bounded.go
ee869afd31f83cbb2d10ee81b2b831dc  parallel.go
b88175e65fdcbc01ac08aaf1fd9b5e96  serial.go

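Our example program is like md5sum, but instead takes a single directory as an argument and prints the digest values for each regular file under that directory, sorted by path name: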

% go run serial.go .
d47c2bbc28298ca9befdfbc5d3aa4e65  bounded.go
ee869afd31f83cbb2d10ee81b2b831dc  parallel.go
b88175e65fdcbc01ac08aaf1fd9b5e96  serial.go

The main function of our program invokes a helper function MD5All, which returns a map from path name to digest value, then sorts and prints the results:

func main() {
    // Calculate the MD5 sum of all files under the specified directory,
    // then print the results sorted by path name.
    m, err := MD5All(os.Args[1])
    if err != nil {
        fmt.Println(err)
        return
    }
    var paths []string
    for path := range m {
        paths = append(paths, path)
    }
    sort.Strings(paths)
    for _, path := range paths {
        fmt.Printf("%x  %s\n", m[path], path)
    }
}

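The MD5All function is the focus of our discussion. In serial.go, the implementation uses no concurrency and simply reads and sums each file as it walks the tree: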

// MD5All reads all the files in the file tree rooted at root and returns a map
// from file path to the MD5 sum of the file's contents.  If the directory walk
// fails or any read operation fails, MD5All returns an error.
func MD5All(root string) (map[string][md5.Size]byte, error) {
    m := make(map[string][md5.Size]byte)
    err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }
        if info.IsDir() {
            return nil
        }
        data, err := ioutil.ReadFile(path)
        if err != nil {
            return err
        }
        m[path] = md5.Sum(data)
        return nil
    })
    if err != nil {
        return nil, err
    }
    return m, nil
}

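Parallel digestion

In parallel.go, we split MD5All into a two-stage pipeline. The first stage, sumFiles, walks the tree, digests each file in a new goroutine, and sends the results on a channel with value type result: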

type result struct {
    path string
    sum  [md5.Size]byte
    err  error
}

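sumFiles returns two channels: one for the results and another for the error returned by filepath.Walk. The walk function starts a new goroutine to process each regular file, then checks done. If done is closed, the walk stops immediately: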

func sumFiles(done <-chan struct{}, root string) (<-chan result, <-chan error) {
    // For each regular file, start a goroutine that sums the file and sends
    // the result on c.  Send the result of the walk on errc.
    c := make(chan result)
    errc := make(chan error, 1)
    go func() {
        var wg sync.WaitGroup
        err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
            if err != nil {
                return err
            }
            if info.IsDir() {
                return nil
            }
            wg.Add(1)
            go func() {
                data, err := ioutil.ReadFile(path)
                select {
                case c <- result{path, md5.Sum(data), err}:
                case <-done:
                }
                wg.Done()
            }()
            // Abort the walk if done is closed.
            select {
            case <-done:
                return errors.New("walk canceled")
            default:
                return nil
            }
        })
        // Walk has returned, so all calls to wg.Add are done.  Start a
        // goroutine to close c once all the sends are done.
        go func() {
            wg.Wait()
            close(c)
        }()
        // No select needed here, since errc is buffered.
        errc <- err
    }()
    return c, errc
}

MD5All receives the digest values from c. MD5All returns early on error, closing done via a defer:

func MD5All(root string) (map[string][md5.Size]byte, error) {
    // MD5All closes the done channel when it returns; it may do so before
    // receiving all the values from c and errc.
    done := make(chan struct{})
    defer close(done)

    c, errc := sumFiles(done, root)

    m := make(map[string][md5.Size]byte)
    for r := range c {
        if r.err != nil {
            return nil, r.err
        }
        m[r.path] = r.sum
    }
    if err := <-errc; err != nil {
        return nil, err
    }
    return m, nil
}

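Bounded parallelism

The MD5All implementation in parallel.go starts a new goroutine for each file. In a directory with many large files, this may allocate more memory than is available on the machine (translator's note: that is, an out-of-memory condition).

We can limit these allocations by bounding the number of files read in parallel. In bounded.go, we do this by creating a fixed number of goroutines for reading files. Our pipeline now has three stages: walk the tree, read and digest the files, and collect the digests.

The first stage, walkFiles, emits the paths of regular files in the tree: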

func walkFiles(done <-chan struct{}, root string) (<-chan string, <-chan error) {
    paths := make(chan string)
    errc := make(chan error, 1)
    go func() {
        // Close the paths channel after Walk returns.
        defer close(paths)
        // No select needed for this send, since errc is buffered.
        errc <- filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
            if err != nil {
                return err
            }
            if info.IsDir() {
                return nil
            }
            select {
            case paths <- path:
            case <-done:
                return errors.New("walk canceled")
            }
            return nil
        })
    }()
    return paths, errc
}

The middle stage starts a fixed number of digester goroutines that receive file names from paths and send results on channel c:

func digester(done <-chan struct{}, paths <-chan string, c chan<- result) {
    for path := range paths {
        data, err := ioutil.ReadFile(path)
        select {
        case c <- result{path, md5.Sum(data), err}:
        case <-done:
            return
        }
    }
}

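Unlike our previous examples, digester does not close its output channel, since multiple goroutines send on a shared channel. Instead, code in MD5All arranges for the channel to be closed when all the digesters are done: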

    // Start a fixed number of goroutines to read and digest files.
    c := make(chan result)
    var wg sync.WaitGroup
    const numDigesters = 20
    wg.Add(numDigesters)
    for i := 0; i < numDigesters; i++ {
        go func() {
            digester(done, paths, c)
            wg.Done()
        }()
    }
    go func() {
        wg.Wait()
        close(c)
    }()

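We could instead have each digester create and return its own output channel, but then we would need additional goroutines to fan in the results.
The final stage receives all the results from c and then checks the error from errc. This check cannot happen any earlier, since before this point walkFiles may block sending values downstream: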

    m := make(map[string][md5.Size]byte)
    for r := range c {
        if r.err != nil {
            return nil, r.err
        }
        m[r.path] = r.sum
    }
    // Check whether the Walk failed.
    if err := <-errc; err != nil {
        return nil, err
    }
    return m, nil
}

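Conclusion

This article has presented techniques for constructing streaming data pipelines in Go. Dealing with failures in such pipelines is tricky, since each stage in the pipeline may block attempting to send values downstream, and the downstream stages may no longer care about the incoming data. We showed how closing a channel can broadcast a "done" signal to all the goroutines started by a pipeline, and defined guidelines for constructing pipelines correctly.

Further reading:
• Go Concurrency Patterns (video) presents the basics of Go's concurrency primitives and several ways to apply them.
• Advanced Go Concurrency Patterns (video) covers more complex uses of Go's primitives, especially select.
• Douglas McIlroy's paper Squinting at Power Series shows how Go-like concurrency provides elegant support for complex calculations.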
