Design Differences Between Akka and Storm

Posted by 筍乾太鹹 on 2019-02-16


Akka and Storm are both important tools for low-latency, high-throughput computation, but they are not direct competitors. If Akka were the Linux kernel, Storm would be more like a distribution such as Ubuntu. Storm is not actually built on Akka, though, so a better analogy might be Akka as BSD and Storm as Ubuntu.

Differences in Functionality

Akka provides an API and an execution engine.
Storm, besides an API and an execution engine, also ships with monitoring metrics, a web UI, cluster management, and a message-delivery guarantee mechanism.
This article focuses on where Akka and Storm overlap: the similarities and differences between their APIs and execution engines.

API Differences

Let's look at Storm's two main APIs:

public interface ISpout extends Serializable {
    /**
     * Called when a task for this component is initialized within a worker on the cluster.
     * It provides the spout with the environment in which the spout executes.
     *
     * <p>This includes the:</p>
     *
     * @param conf The Storm configuration for this spout. This is the configuration provided to the topology merged in with cluster configuration on this machine.
     * @param context This object can be used to get information about this task's place within the topology, including the task id and component id of this task, input and output information, etc.
     * @param collector The collector is used to emit tuples from this spout. Tuples can be emitted at any time, including the open and close methods. The collector is thread-safe and should be saved as an instance variable of this spout object.
     */
    void open(Map conf, TopologyContext context, SpoutOutputCollector collector);

    /**
     * Called when an ISpout is going to be shutdown. There is no guarantee that close
     * will be called, because the supervisor kill -9's worker processes on the cluster.
     *
     * <p>The one context where close is guaranteed to be called is when a topology is
     * killed while running Storm in local mode.</p>
     */
    void close();

    /**
     * Called when a spout has been activated out of a deactivated mode.
     * nextTuple will be called on this spout soon. A spout can become activated
     * after having been deactivated when the topology is manipulated using the
     * `storm` client.
     */
    void activate();

    /**
     * Called when a spout has been deactivated. nextTuple will not be called while
     * a spout is deactivated. The spout may or may not be reactivated in the future.
     */
    void deactivate();

    /**
     * When this method is called, Storm is requesting that the Spout emit tuples to the
     * output collector. This method should be non-blocking, so if the Spout has no tuples
     * to emit, this method should return. nextTuple, ack, and fail are all called in a tight
     * loop in a single thread in the spout task. When there are no tuples to emit, it is courteous
     * to have nextTuple sleep for a short amount of time (like a single millisecond)
     * so as not to waste too much CPU.
     */
    void nextTuple();

    /**
     * Storm has determined that the tuple emitted by this spout with the msgId identifier
     * has been fully processed. Typically, an implementation of this method will take that
     * message off the queue and prevent it from being replayed.
     */
    void ack(Object msgId);

    /**
     * The tuple emitted by this spout with the msgId identifier has failed to be
     * fully processed. Typically, an implementation of this method will put that
     * message back on the queue to be replayed at a later time.
     */
    void fail(Object msgId);
}

and:

public interface IBasicBolt extends IComponent {
    void prepare(Map stormConf, TopologyContext context);
    /**
     * Process the input tuple and optionally emit new tuples based on the input tuple.
     *
     * All acking is managed for you. Throw a FailedException if you want to fail the tuple.
     */
    void execute(Tuple input, BasicOutputCollector collector);
    void cleanup();
}

and the Actor API in Akka:

trait Actor {

  import Actor._

  // to make type Receive known in subclasses without import
  type Receive = Actor.Receive

  /**
   * Stores the context for this actor, including self, and sender.
   * It is implicit to support operations such as `forward`.
   *
   * WARNING: Only valid within the Actor itself, so do not close over it and
   * publish it to other threads!
   *
   * [[akka.actor.ActorContext]] is the Scala API. `getContext` returns a
   * [[akka.actor.UntypedActorContext]], which is the Java API of the actor
   * context.
   */
  implicit val context: ActorContext = {
    val contextStack = ActorCell.contextStack.get
    if ((contextStack.isEmpty) || (contextStack.head eq null))
      throw ActorInitializationException(
        s"You cannot create an instance of [${getClass.getName}] explicitly using the constructor (new). " +
          "You have to use one of the `actorOf` factory methods to create a new actor. See the documentation.")
    val c = contextStack.head
    ActorCell.contextStack.set(null :: contextStack)
    c
  }

  /**
   * The `self` field holds the ActorRef for this actor.
   * <p/>
   * Can be used to send messages to itself:
   * <pre>
   * self ! message
   * </pre>
   */
  implicit final val self = context.self //MUST BE A VAL, TRUST ME

  /**
   * The reference sender Actor of the last received message.
   * Is defined if the message was sent from another Actor,
   * else `deadLetters` in [[akka.actor.ActorSystem]].
   *
   * WARNING: Only valid within the Actor itself, so do not close over it and
   * publish it to other threads!
   */
  final def sender(): ActorRef = context.sender()

  /**
   * This defines the initial actor behavior, it must return a partial function
   * with the actor logic.
   */
  //#receive
  def receive: Actor.Receive
  //#receive

  /**
   * INTERNAL API.
   *
   * Can be overridden to intercept calls to this actor's current behavior.
   *
   * @param receive current behavior.
   * @param msg current message.
   */
  protected[akka] def aroundReceive(receive: Actor.Receive, msg: Any): Unit = receive.applyOrElse(msg, unhandled)

  /**
   * Can be overridden to intercept calls to `preStart`. Calls `preStart` by default.
   */
  protected[akka] def aroundPreStart(): Unit = preStart()

  /**
   * Can be overridden to intercept calls to `postStop`. Calls `postStop` by default.
   */
  protected[akka] def aroundPostStop(): Unit = postStop()

  /**
   * Can be overridden to intercept calls to `preRestart`. Calls `preRestart` by default.
   */
  protected[akka] def aroundPreRestart(reason: Throwable, message: Option[Any]): Unit = preRestart(reason, message)

  /**
   * Can be overridden to intercept calls to `postRestart`. Calls `postRestart` by default.
   */
  protected[akka] def aroundPostRestart(reason: Throwable): Unit = postRestart(reason)

  /**
   * User overridable definition the strategy to use for supervising
   * child actors.
   */
  def supervisorStrategy: SupervisorStrategy = SupervisorStrategy.defaultStrategy

  /**
   * User overridable callback.
   * <p/>
   * Is called when an Actor is started.
   * Actors are automatically started asynchronously when created.
   * Empty default implementation.
   */
  @throws(classOf[Exception]) // when changing this you MUST also change UntypedActorDocTest
  //#lifecycle-hooks
  def preStart(): Unit = ()

  //#lifecycle-hooks

  /**
   * User overridable callback.
   * <p/>
   * Is called asynchronously after `actor.stop()` is invoked.
   * Empty default implementation.
   */
  @throws(classOf[Exception]) // when changing this you MUST also change UntypedActorDocTest
  //#lifecycle-hooks
  def postStop(): Unit = ()

  //#lifecycle-hooks

  /**
   * User overridable callback: By default it disposes of all children and then calls `postStop()`.
   * @param reason the Throwable that caused the restart to happen
   * @param message optionally the current message the actor processed when failing, if applicable
   * <p/>
   * Is called on a crashed Actor right BEFORE it is restarted to allow clean
   * up of resources before Actor is terminated.
   */
  @throws(classOf[Exception]) // when changing this you MUST also change UntypedActorDocTest
  //#lifecycle-hooks
  def preRestart(reason: Throwable, message: Option[Any]): Unit = {
    context.children foreach { child ⇒
      context.unwatch(child)
      context.stop(child)
    }
    postStop()
  }

  //#lifecycle-hooks

  /**
   * User overridable callback: By default it calls `preStart()`.
   * @param reason the Throwable that caused the restart to happen
   * <p/>
   * Is called right AFTER restart on the newly created Actor to allow reinitialization after an Actor crash.
   */
  @throws(classOf[Exception]) // when changing this you MUST also change UntypedActorDocTest
  //#lifecycle-hooks
  def postRestart(reason: Throwable): Unit = {
    preStart()
  }
  //#lifecycle-hooks

  /**
   * User overridable callback.
   * <p/>
   * Is called when a message isn't handled by the current behavior of the actor
   * by default it fails with either a [[akka.actor.DeathPactException]] (in
   * case of an unhandled [[akka.actor.Terminated]] message) or publishes an [[akka.actor.UnhandledMessage]]
   * to the actor's system's [[akka.event.EventStream]]
   */
  def unhandled(message: Any): Unit = {
    message match {
      case Terminated(dead) ⇒ throw new DeathPactException(dead)
      case _                ⇒ context.system.eventStream.publish(UnhandledMessage(message, sender(), self))
    }
  }
}

Storm's main APIs are strikingly similar to the Actor API. Looking at the timeline, though, Storm and Akka began development at roughly the same time, so Storm is unlikely to have copied Akka; more likely its author was inspired by Erlang's Actor implementation. Judging by how Storm turned out, the author probably set out to write a "plain" Actor implementation in Clojure, and since that plain implementation already met Storm's design goals, it was never developed into a full Actor implementation on Clojure.

So, looking purely at the APIs, how do Spout/Bolt and Actor differ?

What the Storm API has that Actor lacks

At the API level, Storm adds two methods that Actor does not have: ack and fail. They exist mainly because Storm targets a much narrower application domain than Akka (essentially stream statistics), so it ships with a built-in fault-tolerance mechanism that works out of the box for users in that niche.

In addition, Storm's Tuple class carries some context information, again packaged to serve the target use cases.
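
To make the ack/fail contract concrete, here is a plain-Scala sketch of the bookkeeping the ISpout javadoc above describes. It uses no real Storm classes and the names are made up: a tuple is emitted with a msgId, ack(msgId) drops it from the pending set, and fail(msgId) puts it back on the queue for replay.

import scala.collection.mutable
import scala.util.Random

// Hypothetical stand-in for a spout's ack/fail bookkeeping -- not Storm code.
object SpoutAckSketch extends App {
  private val source  = mutable.Queue("a", "b", "c")      // stand-in for the real data source
  private val pending = mutable.Map.empty[Long, String]   // msgId -> payload awaiting ack/fail

  def nextTuple(): Unit =
    if (source.nonEmpty) {
      val payload = source.dequeue()
      val msgId   = Random.nextLong()
      pending(msgId) = payload                            // remember until acked or failed
      println(s"emit($payload, msgId=$msgId)")
    }

  def ack(msgId: Long): Unit  = pending.remove(msgId)     // fully processed: forget it
  def fail(msgId: Long): Unit =                           // failed: put it back for replay
    pending.remove(msgId).foreach(p => source.enqueue(p))

  nextTuple(); nextTuple()
  pending.keys.headOption.foreach(fail)                   // simulate one failed tuple
  println(s"pending=${pending.size}, queued for replay=${source.size}")
}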

What the Actor API has that Storm lacks

context: a Spout's open method also receives a context, but in an Actor the context can be accessed at any time, which shows that Actors encourage using the context far more than Spouts do; the data inside the context is also kept up to date dynamically.

self: the Actor's reference to itself. It reflects how the Actor model readily supports a downstream component sending data back upstream, and even an actor sending messages to itself. In Storm, data flow is assumed to be one-way: a downstream component gives no feedback to the one upstream (apart from the system-defined ack and fail).

postRestart: distinguishes an Actor's first start from a restart, which is genuinely useful. Storm probably lacks it because the authors either didn't bother or didn't think of it at first, and later didn't want to change the core API.

unhandled: handles messages the actor did not expect to receive; by default they are published to a system event stream. An Actor is inherently open: any external application that knows its address can send it messages. Storm components only receive the messages you designed for them, so they have no such need. (A short sketch below ties these hooks together.)
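
Here is a minimal sketch of those hooks, using made-up actor and message names and the classic Actor API quoted above: context is available at any time, self and sender() let data flow back upstream or to the actor itself, and anything receive does not match falls through to unhandled.

import akka.actor.{Actor, ActorSystem, Props}

class Echo extends Actor {
  override def preStart(): Unit =
    println(s"started at ${self.path}")                 // self is usable in every hook

  def receive: Receive = {
    case "ping" =>
      sender() ! "pong"                                 // downstream replies upstream
      self ! "again"                                    // an actor may message itself
    case "again" =>
      println(s"children so far: ${context.children.size}")  // context is always at hand
    // any other message falls through to unhandled(), which publishes
    // an UnhandledMessage to the system's EventStream
  }
}

object EchoDemo extends App {
  val system = ActorSystem("demo")
  val echo   = system.actorOf(Props[Echo], "echo")
  echo ! "ping"   // sent from outside an actor, so the "pong" reply goes to deadLetters
  echo ! 42       // unmatched -> unhandled -> EventStream
}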

Runtime Differences

This covers how Actors compare with Storm's Components, the differences in thread scheduling models, hot code deployment, and the constraints Storm's ack mechanism places on asynchronous code.

Actor vs. Component

Component is the collective term for Spouts and Bolts; it is the basic unit that runs user code in Storm. Components and Actors have a lot in common: both react to messages, both can hold state, and only one thread enters them at a time unless you explicitly spawn more threads yourself. The main difference is weight: an Actor is extremely lightweight, so creating tens of thousands of Actors in a single process, or wrapping every ten lines of code in its own Actor, is perfectly fine. With a Storm Component the picture changes completely: you should use only a handful of Components, reserved for the top-level abstractions of your topology.
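
The weight difference is easy to see in a sketch (the names are made up): spinning up tens of thousands of actors in one JVM is routine for Akka, whereas each Storm Component maps to much heavier runtime machinery.

import akka.actor.{Actor, ActorSystem, Props}

object ManyActors extends App {
  class Counter extends Actor {
    private var n = 0
    def receive: Receive = { case i: Int => n += i }    // one thread at a time, so state is safe
  }

  val system = ActorSystem("many")
  // 50,000 actors in a single JVM is unremarkable for Akka.
  val counters = (1 to 50000).map(i => system.actorOf(Props(new Counter), s"counter-$i"))
  counters.foreach(_ ! 1)
  println(s"created ${counters.size} actors")
  system.terminate()
}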

Thread Scheduling Model

If the APIs are so similar, why can Actors be created freely while Components should be kept to a minimum? The secret lies in Akka's Dispatchers. All asynchronous code in an Akka program, including Actors, Futures, Runnables, and even ParIterable, can be handed to a Dispatcher; apart from the main thread that starts the ActorSystem, every other thread can be managed by Dispatchers. Dispatchers are configurable, and the default uses a "fork-join-executor", which, compared with an ordinary thread pool, is particularly well suited to the Actor model and delivers excellent performance.

By comparison, Storm's thread scheduling model is far more rudimentary: each Component gets its own thread, or several Components take turns sharing one, which is exactly why you cannot create too many of them.
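
For concreteness, here is a hedged sketch of that configurability ("my-dispatcher" and the parallelism numbers are made up; the keys follow Akka's documented dispatcher configuration): a fork-join dispatcher is declared in the config and an actor is pinned to it.

import akka.actor.{Actor, ActorSystem, Props}
import com.typesafe.config.ConfigFactory

object DispatcherDemo extends App {
  val config = ConfigFactory.parseString(
    """
    my-dispatcher {
      type = Dispatcher
      executor = "fork-join-executor"
      fork-join-executor {
        parallelism-min = 2
        parallelism-factor = 2.0
        parallelism-max = 16
      }
      throughput = 100   # messages an actor handles before yielding its thread
    }
    """)

  class Worker extends Actor {
    def receive: Receive = { case msg => println(s"$msg on ${Thread.currentThread().getName}") }
  }

  val system = ActorSystem("demo", config.withFallback(ConfigFactory.load()))
  // Pin an actor to the dispatcher; Futures can share it too via
  // system.dispatchers.lookup("my-dispatcher"), which is an ExecutionContext.
  val worker = system.actorOf(Props(new Worker).withDispatcher("my-dispatcher"), "worker")
  worker ! "hello"
}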

Hot Code Deployment

In real-time computation, the typical hot-deployment need is something like tweaking a ranking algorithm: swap out one algorithm module while everything else keeps running unchanged.

Because Storm supports programming in any language via Thrift, if your algorithm is written in a scripting language such as Python, replacing it without a restart is just a matter of swapping the corresponding .py file in the right place on every machine. The downside is that this locks the program into being implemented in that kind of language.

On the Akka side, because the Actor model uses the same communication interface within a process and across processes, the Actors responsible for the algorithm can run as a separate process; when the code changes, you simply restart that process. One process restarts, but the system as a whole never stops running.
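
A hedged sketch of that setup (the system names, host, port and the "ranker" path are all made up, and the akka-remote module must be configured for the path to resolve): the pipeline only holds a path to the algorithm actor, so the algorithm process can be restarted independently.

import akka.actor.ActorSystem

object PipelineSide extends App {
  val system = ActorSystem("pipeline")
  // The algorithm actors live in a separate JVM/ActorSystem that can be restarted
  // on its own; only messages sent while it is down are affected.
  val ranker = system.actorSelection("akka.tcp://algo@10.0.0.5:2552/user/ranker")
  ranker ! "score this item"
}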

The Ack Mechanism in Storm

Storm's message-delivery guarantee is genuinely original: using bitwise XOR, it can track the success or failure of in-flight tuples with very little memory and high performance. By default, all user code has to do is supply a MessageId and the ack mechanism happily takes care of the rest, so users normally never think about it. The problem with the default interface is that as soon as asynchronous code enters the picture, the ack mechanism breaks down: scheduled tasks, submitted Runnables and the like are invisible to it, so if the asynchronous logic fails, the acker never finds out. How to make Storm's ack mechanism coexist gracefully with asynchronous code remains an open question.
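
The XOR idea itself fits in a few lines of plain Scala (a conceptual sketch, not Storm's actual acker code): every edge in a tuple tree gets a random 64-bit id that is XORed into a running value once when the edge is created and once when it is acked, so the value returns to zero exactly when everything emitted has been acked.

import scala.util.Random

object AckerSketch extends App {
  var ackVal = 0L
  def edgeCreated(id: Long): Unit = ackVal ^= id   // a tuple emitted somewhere in the tree
  def edgeAcked(id: Long): Unit   = ackVal ^= id   // that tuple fully processed

  val edges = Seq.fill(5)(Random.nextLong())
  edges.foreach(edgeCreated)
  edges.foreach(edgeAcked)
  // One Long per spout tuple is enough to know whether the whole tree succeeded.
  println(s"fully processed: ${ackVal == 0L}")     // true
}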

Summary

In my view Storm's API is excellent, and its reliability has been proven over years of production use, yet its core execution machinery is so plain that it gives the feeling of a hero past his prime. Twitter, Storm's original user, announced Heron not long ago, a new solution compatible with the Storm interface, though it has not been open-sourced. An open-source project that "reimplements" Storm on top of Akka would be something to look forward to; so far I have found gearpump to be one such project.
