Akka Streams Documentation Translation: Motivation

Posted by devos on 2015-04-13

 


 

Motivation

The way we consume services from the internet today includes many instances of streaming data, both downloading from a service as well as uploading to it or peer-to-peer data transfers. Regarding data as a stream of elements instead of in its entirety is very useful because it matches the way computers send and receive them (for example via TCP), but it is often also a necessity because data sets frequently become too large to be handled as a whole. We spread computations or analyses over large clusters and call it “big data”, where the whole principle of processing them is by feeding those data sequentially, as a stream, through some CPUs.

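Today, the ways we consume services over the internet involve a great deal of streaming data, such as downloads, uploads, and peer-to-peer (p2p) transfers. Treating data as a stream of elements (translator's note: that is, the parts that make up the whole) rather than as a whole is more useful because it matches how computers actually handle it (for example, over TCP), and it is often also a necessity, because data sets frequently grow too large to be processed as a whole. We distribute computation and analysis across a large cluster and call it "big data"; its processing principle is to feed the data sequentially (as a stream) to a number of CPUs.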

Actors can be seen as dealing with streams as well: they send and receive series of messages in order to transfer knowledge (or data) from one place to another. We have found it tedious and error-prone to implement all the proper measures in order to achieve stable streaming between actors, since in addition to sending and receiving we also need to take care to not overflow any buffers or mailboxes in the process. Another pitfall is that Actor messages can be lost and must be retransmitted in that case lest the stream have holes on the receiving side. When dealing with streams of elements of a fixed given type, Actors also do not currently offer good static guarantees that no wiring errors are made: type-safety could be improved in this case. 

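Actors can also be regarded as dealing with streams: they receive and send messages in order to move knowledge (or data) from one place to another. We found that implementing all the proper measures needed for a stable stream between actors is tedious and error-prone, because in addition to receiving and sending we also have to make sure that no buffer or mailbox (the actor's mailbox) overflows. Another pitfall is that actor messages can be lost and must then be retransmitted, or the stream will have gaps on the receiving side. When a stream's elements are of a fixed, given type, actors also cannot offer good static guarantees (translator's note: meaning type checking by the compiler) that no wiring errors occur (translator's note: presumably this refers to how the message streams are wired together); type safety could be improved here.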

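(Translator's note: this paragraph lays out the motivation for designing Akka Streams:

  1. Ensure that the buffers a stream passes through never overflow (a mailbox can be regarded as an actor's message buffer).

  2. Guarantee reliable message delivery, providing delivery semantics stronger than at-least-once.

  3. Provide type safety when processing streams whose element type is given.

  These are core problems when building an actor system. The first two in particular really do require tedious measures if you build the actor system yourself. For the problems with actor systems, see this article:

  Why I Don't Like Akka Actors

 )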

For these reasons we decided to bundle up a solution to these problems as an Akka Streams API. The purpose is to offer an intuitive and safe way to formulate stream processing setups such that we can then execute them efficiently and with bounded resource usage—no more OutOfMemoryErrors. In order to achieve this our streams need to be able to limit the buffering that they employ, they need to be able to slow down producers if the consumers cannot keep up. This feature is called back-pressure and is at the core of the Reactive Streams initiative of which Akka is a founding member. For you this means that the hard problem of propagating and reacting to back-pressure has been incorporated in the design of Akka Streams already, so you have one less thing to worry about; it also means that Akka Streams interoperate seamlessly with all other Reactive Streams implementations (where Reactive Streams interfaces define the interoperability SPI while implementations like Akka Streams offer a nice user API). 

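For these reasons, we wanted to package the solution into the Akka Streams API. The goal is to provide an intuitive and safe way to design stream processing pipelines so that we can execute them efficiently and with bounded resource consumption: no more OutOfMemoryErrors. To achieve this, our streams need to be able to limit the size of the buffers they use, and when consumers cannot keep up with producers we need to be able to slow the producers down. This feature is called back-pressure, and it is at the core of the Reactive Streams initiative, of which Akka is a founding member. For you, this means that the problem of propagating and reacting to back-pressure is already built into the design of Akka Streams, so you have one less thing to worry about; it also means that Akka Streams interoperates seamlessly with other Reactive Streams implementations (the Reactive Streams interfaces define the interoperability Service Provider Interface, while implementations such as Akka Streams provide a nice user API).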
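To make the back-pressure idea concrete, here is a minimal sketch of demand signalling. It does not use the Akka Streams API. It uses the JDK's `java.util.concurrent.Flow` interfaces (Java 9+), which mirror the Reactive Streams interfaces. The class names `BackPressureSketch`, `RangePublisher`, and `BatchingSubscriber` are hypothetical, and the code is single-threaded and does not implement the full Reactive Streams specification. It only illustrates that a publisher emits no more elements than the subscriber has asked for, so the subscriber never has to hold more than one batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

public class BackPressureSketch {

    /** Hypothetical publisher: emits 1..count, but never more than was requested. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        private final int count;

        RangePublisher(int count) {
            this.count = count;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 1;             // next element to emit
                private long demand = 0;          // requested but not yet emitted
                private boolean draining = false; // guards against unbounded recursion
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) return;
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("request must be positive"));
                        return;
                    }
                    demand += n;
                    if (draining) return; // re-entrant call from onNext; the loop below picks it up
                    draining = true;
                    while (demand > 0 && next <= count && !done) {
                        demand--;
                        subscriber.onNext(next++);
                    }
                    draining = false;
                    if (next > count && !done) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() {
                    done = true;
                }
            });
        }
    }

    /** Hypothetical subscriber: asks for a new batch only after finishing the previous one. */
    static final class BatchingSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final int batchSize;
        int requestCalls = 0;
        boolean completed = false;
        Throwable error = null;
        private Flow.Subscription subscription;
        private int outstanding = 0; // requested but not yet received

        BatchingSubscriber(int batchSize) {
            this.batchSize = batchSize;
        }

        private void requestBatch() {
            outstanding = batchSize;
            requestCalls++;
            subscription.request(batchSize);
        }

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            requestBatch();
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            outstanding--;
            if (outstanding == 0) requestBatch(); // signal more demand only when ready
        }

        @Override
        public void onError(Throwable t) {
            error = t;
        }

        @Override
        public void onComplete() {
            completed = true;
        }
    }

    static BatchingSubscriber run(int count, int batchSize) {
        BatchingSubscriber subscriber = new BatchingSubscriber(batchSize);
        new RangePublisher(count).subscribe(subscriber);
        return subscriber;
    }

    public static void main(String[] args) {
        BatchingSubscriber s = run(10, 3);
        // prints: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] in 4 requests
        System.out.println(s.received + " in " + s.requestCalls + " requests");
    }
}
```

In this sketch the consumer sets the pace: the publisher sits idle whenever outstanding demand is zero, which is the behaviour the paragraph above describes as slowing down producers when consumers cannot keep up. A real implementation such as Akka Streams additionally handles asynchronous boundaries and concurrency, which this sketch leaves out.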
