5. Go Programming Patterns: Map-Reduce

Posted by zhaocrazy on 2022-02-06

Original URL: https://learnku.com/articles/64715

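In this article we look at three very important operations from functional programming: Map, Reduce, and Filter. They make data processing convenient and flexible. Most of what our programs do is shuffle data around, and for business scenarios that involve statistics in particular, Map/Reduce/Filter is a widely applicable approach. Let's start with a few examples:

Basic Examples

Map Example

In the code below we write two Map functions. Each takes two parameters:

A slice of strings, []string, which is the data to process
A function, either func(s string) string or func(s string) int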

func MapStrToStr(arr []string, fn func(s string) string) []string {
    var newArray = []string{}
    for _, it := range arr {
        newArray = append(newArray, fn(it))
    }
    return newArray
}

func MapStrToInt(arr []string, fn func(s string) int) []int {
    var newArray = []int{}
    for _, it := range arr {
        newArray = append(newArray, fn(it))
    }
    return newArray
}

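The two Map functions follow the same logic: the body iterates over the slice passed as the first argument, calls the function passed as the second argument on each element, and collects the results into a new slice that is returned.

We can then use the two functions like this: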

var list = []string{"Hao", "Chen", "MegaEase"}

x := MapStrToStr(list, func(s string) string {
    return strings.ToUpper(s)
})
fmt.Printf("%v\n", x)
// [HAO CHEN MEGAEASE]

y := MapStrToInt(list, func(s string) int {
    return len(s)
})
fmt.Printf("%v\n", y)
// [3 4 8]

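As we can see, the function passed to MapStrToStr() converts to upper case, so the resulting slice is all upper case; the function passed to MapStrToInt() computes the length, so the resulting slice holds the length of each string.

Now let's look at what the Reduce and Filter functions look like.

Reduce Example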

func Reduce(arr []string, fn func(s string) int) int {
    sum := 0
    for _, it := range arr {
        sum += fn(it)
    }
    return sum
}

var list = []string{"Hao", "Chen", "MegaEase"}

x := Reduce(list, func(s string) int {
    return len(s)
})
fmt.Printf("%v\n", x)
// 15

Filter Example

func Filter(arr []int, fn func(n int) bool) []int {
    var newArray = []int{}
    for _, it := range arr {
        if fn(it) {
            newArray = append(newArray, it)
        }
    }
    return newArray
}

var intset = []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
out := Filter(intset, func(n int) bool {
    return n%2 == 1
})
fmt.Printf("%v\n", out)
// [1 3 5 7 9]

out = Filter(intset, func(n int) bool {
    return n > 5
})
fmt.Printf("%v\n", out)
// [6 7 8 9 10]

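The original post illustrates this with a figure: a metaphor that vividly conveys the business semantics of Map-Reduce, which is very useful in data processing.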

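Business Example

From the examples above you may have started to see that Map/Reduce/Filter are only control logic; the real business logic is defined by the data and the function passed to them. Yes, this is a classic programming pattern that decouples "business logic" from "control logic". Next, let's look at code with real business meaning to reinforce what separating "control logic" from "business logic" means.

Employee Information

First, we have an employee type and some data: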

type Employee struct {
    Name     string
    Age      int
    Vacation int
    Salary   int
}

var list = []Employee{
    {"Hao", 44, 0, 8000},
    {"Bob", 34, 10, 5000},
    {"Alice", 23, 5, 9000},
    {"Jack", 26, 0, 4000},
    {"Tom", 48, 9, 7500},
    {"Marry", 29, 0, 6000},
    {"Mike", 32, 8, 4000},
}
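
To connect the employee data with the separation of control logic and business logic described above, here is a minimal sketch. The helper names EmployeeCountIf, EmployeeFilterIf and EmployeeSumIf are assumptions chosen for this illustration. The loops (control logic) live in the helpers, while the business rules are the small functions passed in.

```go
package main

import "fmt"

type Employee struct {
    Name     string
    Age      int
    Vacation int
    Salary   int
}

// EmployeeCountIf counts the employees for which fn returns true.
// (Hypothetical helper, introduced for illustration.)
func EmployeeCountIf(list []Employee, fn func(e *Employee) bool) int {
    count := 0
    for i := range list {
        if fn(&list[i]) {
            count++
        }
    }
    return count
}

// EmployeeFilterIf returns the employees for which fn returns true.
func EmployeeFilterIf(list []Employee, fn func(e *Employee) bool) []Employee {
    var out []Employee
    for i := range list {
        if fn(&list[i]) {
            out = append(out, list[i])
        }
    }
    return out
}

// EmployeeSumIf adds up the number that fn extracts from each employee.
func EmployeeSumIf(list []Employee, fn func(e *Employee) int) int {
    sum := 0
    for i := range list {
        sum += fn(&list[i])
    }
    return sum
}

func main() {
    list := []Employee{
        {"Hao", 44, 0, 8000},
        {"Bob", 34, 10, 5000},
        {"Alice", 23, 5, 9000},
        {"Jack", 26, 0, 4000},
        {"Tom", 48, 9, 7500},
        {"Marry", 29, 0, 6000},
        {"Mike", 32, 8, 4000},
    }

    // Business rule: how many employees are older than 40?
    fmt.Println(EmployeeCountIf(list, func(e *Employee) bool { return e.Age > 40 })) // 2

    // Business rule: who has taken no vacation?
    noVacation := EmployeeFilterIf(list, func(e *Employee) bool { return e.Vacation == 0 })
    fmt.Println(len(noVacation)) // 3

    // Business rule: total salary of all employees.
    fmt.Println(EmployeeSumIf(list, func(e *Employee) int { return e.Salary })) // 43500
}
```

Changing a business rule only means passing a different function; the loops stay untouched.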

Generic Map-Reduce

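As we can see, the Map-Reduce functions above need a separate version for each data type they process, even though the code looks very similar. This is where generic programming comes in. At the time this article was written, Go did not yet support generics. (Note: in a mail to golang-dev on November 21, 2020, Russ Cox, the technical lead of the Go team, confirmed that Go generics (type parameters) would land in Go 1.18, then expected in February 2022.)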

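Simple Generic Map

So, at the time of writing, generics in Go can only be approximated with interface{} + reflect. interface{} can be thought of as void* in C or Object in Java, and reflect is Go's reflection package, used to inspect types at run time.

Let's look at how to write a very simple generic Map function that does no type checking at all.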

func Map(data interface{}, fn interface{}) []interface{} {
    vfn := reflect.ValueOf(fn)
    vdata := reflect.ValueOf(data)
    result := make([]interface{}, vdata.Len())

    for i := 0; i < vdata.Len(); i++ {
        result[i] = vfn.Call([]reflect.Value{vdata.Index(i)})[0].Interface()
    }
    return result
}

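In the code above:

reflect.ValueOf() obtains the value held in each interface{}: one is the data, vdata, and the other is the function, vfn.
vfn.Call() then invokes the function, with []reflect.Value{vdata.Index(i)} supplying the current element as its argument.
Go's reflection syntax is a bit puzzling, but a quick look at the documentation is enough to follow it. This article does not cover reflection, so please look up a tutorial for the basics.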
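
As a small illustration of the reflection calls just described, the sketch below performs a single step of the Map loop. The helper name applyAt is hypothetical and exists only for this example.

```go
package main

import (
    "fmt"
    "reflect"
)

// applyAt calls fn on the i-th element of data using reflection only.
// Both arguments arrive as interface{}, as in the Map function above.
func applyAt(data interface{}, fn interface{}, i int) interface{} {
    vdata := reflect.ValueOf(data) // reflect.Value wrapping the slice
    vfn := reflect.ValueOf(fn)     // reflect.Value wrapping the function
    arg := vdata.Index(i)          // i-th element, still a reflect.Value

    // Call takes one reflect.Value per argument and
    // returns one reflect.Value per return value.
    out := vfn.Call([]reflect.Value{arg})

    // Interface() unwraps the first result back to interface{}.
    return out[0].Interface()
}

func main() {
    nums := []int{1, 2, 3, 4}
    double := func(x int) int { return x * 2 }

    fmt.Println(reflect.ValueOf(nums).Kind())   // slice
    fmt.Println(reflect.ValueOf(double).Kind()) // func
    fmt.Println(applyAt(nums, double, 2))       // 6
}
```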

With that, we can write the following code, where different data types share the same Map() logic.

square := func(x int) int {
  return x * x
}
nums := []int{1, 2, 3, 4}

squared_arr := Map(nums,square)
fmt.Println(squared_arr)
//[1 4 9 16]



upcase := func(s string) string {
  return strings.ToUpper(s)
}
strs := []string{"Hao", "Chen", "MegaEase"}
upstrs := Map(strs, upcase)
fmt.Println(upstrs)
//[HAO CHEN MEGAEASE]

However, because reflection happens at run time, anything wrong with the types shows up as a run-time error. For example:

x := Map(5, 5)
fmt.Println(x)

The code above compiles without complaint, but it fails at run time, with a panic:

panic: reflect: call of reflect.Value.Len on int Value

goroutine 1 [running]:
reflect.Value.Len(0x10b5240, 0x10eeb58, 0x82, 0x10716bc)
        /usr/local/Cellar/go/1.15.3/libexec/src/reflect/value.go:1162 +0x185
main.Map(0x10b5240, 0x10eeb58, 0x10b5240, 0x10eeb60, 0x1, 0x14, 0x0)
        /Users/chenhao/.../map.go:12 +0x16b
main.main()
        /Users/chenhao/.../map.go:42 +0x465
exit status 2
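
One way to keep such a failure from crashing the whole program is to recover from the panic and report it as an error. This is a different technique from the explicit type checking the article uses, and the wrapper name tryMap is hypothetical; the sketch only illustrates that the failure happens at run time rather than at compile time.

```go
package main

import (
    "fmt"
    "reflect"
)

// Map is the unchecked reflective Map from the article.
func Map(data interface{}, fn interface{}) []interface{} {
    vfn := reflect.ValueOf(fn)
    vdata := reflect.ValueOf(data)
    result := make([]interface{}, vdata.Len())

    for i := 0; i < vdata.Len(); i++ {
        result[i] = vfn.Call([]reflect.Value{vdata.Index(i)})[0].Interface()
    }
    return result
}

// tryMap runs Map and converts a reflection panic into an error.
// (Hypothetical wrapper, introduced for illustration.)
func tryMap(data, fn interface{}) (result []interface{}, err error) {
    defer func() {
        if r := recover(); r != nil {
            err = fmt.Errorf("map failed: %v", r)
        }
    }()
    return Map(data, fn), nil
}

func main() {
    // Compiles fine, fails only when executed.
    _, err := tryMap(5, 5)
    fmt.Println(err != nil) // true

    res, err := tryMap([]int{1, 2, 3}, func(x int) int { return x + 1 })
    fmt.Println(res, err) // [2 3 4] <nil>
}
```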

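Robust Generic Map

So, to write a robust program with this kind of "over-generic" interface{} code, we have to do the type checking ourselves. Below is a Map implementation with type checking: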

func Transform(slice, function interface{}) interface{} {
  return transform(slice, function, false)
}

func TransformInPlace(slice, function interface{}) interface{} {
  return transform(slice, function, true)
}

func transform(slice, function interface{}, inPlace bool) interface{} {

  // check that slice is of kind Slice
  sliceInType := reflect.ValueOf(slice)
  if sliceInType.Kind() != reflect.Slice {
    panic("transform: not slice")
  }

  //check the function signature
  fn := reflect.ValueOf(function)
  elemType := sliceInType.Type().Elem()
  if !verifyFuncSignature(fn, elemType, nil) {
    panic("transform: function must be of type func(" + sliceInType.Type().Elem().String() + ") outputElemType")
  }

  sliceOutType := sliceInType
  if !inPlace {
    sliceOutType = reflect.MakeSlice(reflect.SliceOf(fn.Type().Out(0)), sliceInType.Len(), sliceInType.Len())
  }
  for i := 0; i < sliceInType.Len(); i++ {
    sliceOutType.Index(i).Set(fn.Call([]reflect.Value{sliceInType.Index(i)})[0])
  }
  return sliceOutType.Interface()

}

func verifyFuncSignature(fn reflect.Value, types ...reflect.Type) bool {

  // Check that it is a function
  if fn.Kind() != reflect.Func {
    return false
  }
  // NumIn() - returns a function type's input parameter count.
  // NumOut() - returns a function type's output parameter count.
  if (fn.Type().NumIn() != len(types)-1) || (fn.Type().NumOut() != 1) {
    return false
  }
  // In() - returns the type of a function type's i'th input parameter.
  for i := 0; i < len(types)-1; i++ {
    if fn.Type().In(i) != types[i] {
      return false
    }
  }
  // Out() - returns the type of a function type's i'th output parameter.
  outType := types[len(types)-1]
  if outType != nil && fn.Type().Out(0) != outType {
    return false
  }
  return true
}

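The code suddenly became a lot more complex; as you can see, the complexity of code tends to sit in the parts that handle errors. I am not going to walk through all of it. There is a lot of code, but it is readable. Here are the key points:

The code does not use the name Map, because it would clash in meaning with the map data structure and keyword; it uses Transform instead, a name that comes from the C++ STL.
There are two versions of the function: Transform() returns a brand-new slice, and TransformInPlace() does the work in place.
Kind() is used to check that the data is a Slice (in transform) and that the function is a Func (in verifyFuncSignature).
The function's parameter and return types are checked by verifyFuncSignature(), in which:
- NumIn() gives the number of input parameters of the function
- NumOut() gives the number of return values of the function
If a new slice has to be created, reflect.MakeSlice() is used to do it.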
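
To make the signature inspection in the points above more concrete, the sketch below prints what NumIn(), In(), NumOut() and Out() report for a few values. The helper name describe is hypothetical.

```go
package main

import (
    "fmt"
    "reflect"
)

// describe reports a function's parameter and result types via reflection.
func describe(fn interface{}) string {
    t := reflect.TypeOf(fn)
    if t == nil || t.Kind() != reflect.Func {
        return "not a function"
    }
    s := "in:"
    for i := 0; i < t.NumIn(); i++ {
        s += " " + t.In(i).String() // type of the i-th parameter
    }
    s += " out:"
    for i := 0; i < t.NumOut(); i++ {
        s += " " + t.Out(i).String() // type of the i-th return value
    }
    return s
}

func main() {
    fmt.Println(describe(func(s string) int { return len(s) })) // in: string out: int
    fmt.Println(describe(func(a, b int) bool { return a < b })) // in: int int out: bool
    fmt.Println(describe(42))                                   // not a function
}
```

verifyFuncSignature() compares these reported types against the types it expects instead of printing them.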

With the code above in place, we can happily use it:

It works on a slice of strings:

list := []string{"1", "2", "3", "4", "5", "6"}
result := Transform(list, func(a string) string {
    return a + a + a
})
//{"111","222","333","444","555","666"}

It works on a slice of integers:

list := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
TransformInPlace(list, func(a int) int {
  return a * 3
})
//{3, 6, 9, 12, 15, 18, 21, 24, 27}

It works on structs:

var list = []Employee{
    {"Hao", 44, 0, 8000},
    {"Bob", 34, 10, 5000},
    {"Alice", 23, 5, 9000},
    {"Jack", 26, 0, 4000},
    {"Tom", 48, 9, 7500},
}

result := TransformInPlace(list, func(e Employee) Employee {
    e.Salary += 1000
    e.Age += 1
    return e
})
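
Because Transform and TransformInPlace return interface{}, the caller has to assert the result back to a concrete slice type before using it as a slice. The sketch below shows that step with a trimmed-down stand-in named transformSketch (hypothetical, and without the signature checks shown earlier).

```go
package main

import (
    "fmt"
    "reflect"
)

// transformSketch is a trimmed stand-in for Transform: it builds a new slice
// whose element type is the function's return type.
// It deliberately omits the function signature checks.
func transformSketch(slice, function interface{}) interface{} {
    in := reflect.ValueOf(slice)
    if in.Kind() != reflect.Slice {
        panic("transformSketch: not slice")
    }
    fn := reflect.ValueOf(function)
    out := reflect.MakeSlice(reflect.SliceOf(fn.Type().Out(0)), in.Len(), in.Len())
    for i := 0; i < in.Len(); i++ {
        out.Index(i).Set(fn.Call([]reflect.Value{in.Index(i)})[0])
    }
    return out.Interface()
}

func main() {
    list := []string{"1", "2", "3"}
    result := transformSketch(list, func(a string) int { return len(a + a) })

    // The static type of result is interface{}; assert it to get a []int.
    lengths, ok := result.([]int)
    fmt.Println(lengths, ok) // [2 2 2] true

    // With the two-value form, a wrong assertion reports false instead of panicking.
    _, ok = result.([]string)
    fmt.Println(ok) // false
}
```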

Robust Generic Reduce

Similarly, the generic version of Reduce looks like this:

func Reduce(slice, pairFunc, zero interface{}) interface{} {
  sliceInType := reflect.ValueOf(slice)
  if sliceInType.Kind() != reflect.Slice {
    panic("reduce: wrong type, not slice")
  }

  len := sliceInType.Len()
  if len == 0 {
    return zero
  } else if len == 1 {
    return sliceInType.Index(0).Interface()
  }

  elemType := sliceInType.Type().Elem()
  fn := reflect.ValueOf(pairFunc)
  if !verifyFuncSignature(fn, elemType, elemType, elemType) {
    t := elemType.String()
    panic("reduce: function must be of type func(" + t + ", " + t + ") " + t)
  }

  var ins [2]reflect.Value
  ins[0] = sliceInType.Index(0)
  ins[1] = sliceInType.Index(1)
  out := fn.Call(ins[:])[0]

  for i := 2; i < len; i++ {
    ins[0] = out
    ins[1] = sliceInType.Index(i)
    out = fn.Call(ins[:])[0]
  }
  return out.Interface()
}
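
A usage sketch may help here. To stay self-contained it uses a compact stand-in named reduceSketch (hypothetical) that follows the same fold as Reduce above but leaves out the signature verification; the type assertions on the result are needed for the same reason as with Transform.

```go
package main

import (
    "fmt"
    "reflect"
)

// reduceSketch folds slice with pairFunc from left to right.
// It mirrors the control flow of Reduce but skips signature checks.
func reduceSketch(slice, pairFunc, zero interface{}) interface{} {
    in := reflect.ValueOf(slice)
    if in.Kind() != reflect.Slice {
        panic("reduceSketch: not slice")
    }
    if in.Len() == 0 {
        return zero // nothing to fold, return the caller's zero value
    }
    fn := reflect.ValueOf(pairFunc)
    out := in.Index(0)
    for i := 1; i < in.Len(); i++ {
        out = fn.Call([]reflect.Value{out, in.Index(i)})[0]
    }
    return out.Interface()
}

func main() {
    sum := reduceSketch([]int{1, 2, 3, 4}, func(a, b int) int { return a + b }, 0)
    fmt.Println(sum.(int)) // 10

    longest := reduceSketch([]string{"Hao", "Chen", "MegaEase"}, func(a, b string) string {
        if len(b) > len(a) {
            return b
        }
        return a
    }, "")
    fmt.Println(longest.(string)) // MegaEase

    // An empty slice yields the zero value that was passed in.
    fmt.Println(reduceSketch([]int{}, func(a, b int) int { return a + b }, 0)) // 0
}
```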

Robust Generic Filter

Similarly, the generic version of Filter looks like this (again in two versions, depending on whether it works "in place"):

func Filter(slice, function interface{}) interface{} {
  result, _ := filter(slice, function, false)
  return result
}

func FilterInPlace(slicePtr, function interface{}) {
  in := reflect.ValueOf(slicePtr)
  if in.Kind() != reflect.Ptr {
    panic("FilterInPlace: wrong type, " +
      "not a pointer to slice")
  }
  _, n := filter(in.Elem().Interface(), function, true)
  in.Elem().SetLen(n)
}

var boolType = reflect.ValueOf(true).Type()

func filter(slice, function interface{}, inPlace bool) (interface{}, int) {

  sliceInType := reflect.ValueOf(slice)
  if sliceInType.Kind() != reflect.Slice {
    panic("filter: wrong type, not a slice")
  }

  fn := reflect.ValueOf(function)
  elemType := sliceInType.Type().Elem()
  if !verifyFuncSignature(fn, elemType, boolType) {
    panic("filter: function must be of type func(" + elemType.String() + ") bool")
  }

  var which []int
  for i := 0; i < sliceInType.Len(); i++ {
    if fn.Call([]reflect.Value{sliceInType.Index(i)})[0].Bool() {
      which = append(which, i)
    }
  }

  out := sliceInType

  if !inPlace {
    out = reflect.MakeSlice(sliceInType.Type(), len(which), len(which))
  }
  for i := range which {
    out.Index(i).Set(sliceInType.Index(which[i]))
  }

  return out.Interface(), len(which)
}
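
FilterInPlace takes a pointer to the slice because shortening a slice through reflection needs an addressable value. A minimal sketch of just that step follows; the helper name truncate is hypothetical.

```go
package main

import (
    "fmt"
    "reflect"
)

// truncate shortens the slice that slicePtr points to so it has n elements.
func truncate(slicePtr interface{}, n int) {
    p := reflect.ValueOf(slicePtr)
    if p.Kind() != reflect.Ptr || p.Elem().Kind() != reflect.Slice {
        panic("truncate: not a pointer to slice")
    }
    // p.Elem() is addressable, so SetLen changes the caller's slice header.
    p.Elem().SetLen(n)
}

func main() {
    nums := []int{1, 3, 5, 4, 6}
    truncate(&nums, 3)
    fmt.Println(nums, len(nums)) // [1 3 5] 3

    // A slice passed by value is not settable, so SetLen cannot be used on it.
    fmt.Println(reflect.ValueOf(nums).CanSet()) // false
}
```

This is why filter() moves the kept elements to the front of the slice and returns their count, and FilterInPlace() then calls SetLen(n) on the pointed-to slice.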

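Afterword

A few loose ends remain:

1) Doing this with reflection has a cost: the code performs poorly. So the code above should not be used where you need high performance. How to solve this will be discussed in the next article of this series.

2) The code above borrows heavily from Rob Pike's version, which is at github.com/robpike/filter

3) In fact, a great many programmers around the world keep asking when the Go team will support Map/Reduce in the standard library. Rob Pike's answer is, in effect: is this hard to write? Do you really need us to write it for you? I wrote this kind of code years ago, but I have never used it even once; I still prefer a "for loop", and I think you had best use a "for loop" with me.

Personally, I think Map/Reduce is still very useful for data processing. Rob Pike probably does not write much "business logic" code day to day, so he may not appreciate how frequently business requirements change.

Of course, whether it is good or not is for you to judge, but learning more programming patterns is always helpful.

(End of article) This article was not written by me; it is reposted from 左耳朵耗子's blog, originally published on 酷殼 – CoolShell.

This work is licensed under a CC license; reposts must credit the author and link to this article.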

