Storm Source Code Analysis - Topology Submit - Task
mk-task is fairly simple, because a task is only a conceptual structure; unlike a worker or an executor, it does not need a process or thread of its own.
Its core is really just mk-task-data, which does three things:
1. Create the TopologyContext object, which essentially combines the earlier topology object with worker-data so that the task can look up the topology information it needs at run time.
2. Create the task object, a spout-object or bolt-object, which encapsulates the corresponding logic such as nextTuple or execute.
3. Generate tasks-fn. The name is poorly chosen and suggests it executes the task; in fact it only does the preparation work around an emit, the most important part being the call to the grouper to produce the target tasks, plus some metrics and hook invocations.
In short, mk-task does not do much.
```clojure
(defn mk-task [executor-data task-id]
  (let [task-data (mk-task-data executor-data task-id) ;; 1. mk-task-data
        storm-conf (:storm-conf executor-data)]
    (doseq [klass (storm-conf TOPOLOGY-AUTO-TASK-HOOKS)] ;; register the preconfigured hooks
      (.addTaskHook ^TopologyContext (:user-context task-data)
                    (-> klass Class/forName .newInstance)))
    ;; when this is called, the threads for the executor haven't been started yet,
    ;; so we won't be risking trampling on the single-threaded claim strategy disruptor queue
    (send-unanchored task-data SYSTEM-STREAM-ID ["startup"]) ;; send a startup notice to SYSTEM-STREAM; who consumes SYSTEM-STREAM?
    task-data))
```
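The TOPOLOGY-AUTO-TASK-HOOKS loop above instantiates hook classes from their configured names via Class/forName and .newInstance. Purely as an illustration of that load-by-name pattern (this is not Storm code, and the names below are stand-ins):

```python
# Illustration (not Storm code) of the Class/forName + .newInstance
# pattern mk-task uses for TOPOLOGY-AUTO-TASK-HOOKS: instantiate
# classes from fully qualified names found in the config.
import importlib

def load_hooks(class_names):
    hooks = []
    for name in class_names:
        module_name, _, cls_name = name.rpartition(".")
        cls = getattr(importlib.import_module(module_name), cls_name)
        hooks.append(cls())  # zero-arg construction, like .newInstance
    return hooks

# using a stdlib class purely as a stand-in for a hook class
hooks = load_hooks(["collections.OrderedDict"])
```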
1 mk-task-data
```clojure
(defn mk-task-data [executor-data task-id]
  (recursive-map
    :executor-data executor-data
    :task-id task-id
    :system-context (system-topology-context (:worker executor-data) executor-data task-id)
    :user-context (user-topology-context (:worker executor-data) executor-data task-id)
    :builtin-metrics (builtin-metrics/make-data (:type executor-data))
    :tasks-fn (mk-tasks-fn <>)
    :object (get-task-object (.getRawTopology ^TopologyContext (:system-context <>))
                             (:component-id executor-data))))
```
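recursive-map is a Storm utility macro in which later entries can reference the map built so far through `<>` (which is how :tasks-fn and :object can read :system-context above). A minimal Python sketch of that idea, with purely illustrative keys and values:

```python
# Sketch of Storm's recursive-map idea: a map where later entries can
# read values that earlier entries already produced (written <> in the
# Clojure source). Names and values here are illustrative only.

def recursive_map(*entries):
    """entries: (key, fn) pairs; each fn receives the map built so far."""
    result = {}
    for key, fn in entries:
        result[key] = fn(result)
    return result

task_data = recursive_map(
    ("task-id", lambda m: 7),
    # a later entry reads the partially built map, like <> in Clojure
    ("system-context", lambda m: {"topology": "system", "task": m["task-id"]}),
    ("object", lambda m: "spout-object-for-task-%d" % m["task-id"]),
)
```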
1.1 TopologyContext
See: Storm Source Code Analysis - Topology Submit - Task - TopologyContext
:system-context and :user-context differ only in the topology object inside the context: the system one uses system-topology!
1.2 builtin-metrics/make-data
The builtin-metrics here are used to record the runtime metrics of the spout or bolt.
1.3 mk-tasks-fn
Returns tasks-fn. This function mainly does the preparation work before an emit and returns the list of target tasks:
1. Call the grouper to produce the target tasks.
2. Run the emit hooks.
3. When the sampler condition is met, update stats and builtin-metrics.
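For step 3, mk-stats-sampler builds a closure that fires roughly once every 1/TOPOLOGY-STATS-SAMPLE-RATE calls, so stats are only updated on a sample of emits. A hedged Python sketch of that sampling behavior (an approximation, not the exact Storm implementation):

```python
import random

def mk_stats_sampler(sample_rate):
    """Return a zero-arg closure that is True once every
    1/sample_rate calls, starting from a random offset
    (a sketch of Storm's mk-stats-sampler, which reads the
    rate from TOPOLOGY-STATS-SAMPLE-RATE in the config)."""
    freq = int(1 / sample_rate)
    state = {"i": random.randint(0, freq - 1)}  # random start offset
    def sampler():
        state["i"] += 1
        if state["i"] >= freq:
            state["i"] = 0
            return True
        return False
    return sampler

sampler = mk_stats_sampler(0.05)  # sample 5% of emits
fired = sum(1 for _ in range(1000) if sampler())  # fires 50 times over these 1000 calls
```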
tasks-fn comes in two arities:
[^String stream ^List values]: the easier one to understand; it collects the target tasks of every component subscribed to the stream (a stream may have several downstream components, so one piece of data may need to be sent to multiple bolts for processing).
[^Integer out-task-id ^String stream ^List values]: with out-task-id specified, i.e. direct grouping.
Here out-task-id is validated:
out-task-id (if grouping out-task-id): the lookup out-task-id -> component -> grouper must be non-nil (and :direct), i.e. it verifies that the stream really does reach the component that out-task-id belongs to.
If the validation fails, out-task-id is set to nil.
```clojure
(defn mk-tasks-fn [task-data]
  (let [task-id (:task-id task-data)
        executor-data (:executor-data task-data)
        component-id (:component-id executor-data)
        ^WorkerTopologyContext worker-context (:worker-context executor-data)
        storm-conf (:storm-conf executor-data)
        emit-sampler (mk-stats-sampler storm-conf)
        stream->component->grouper (:stream->component->grouper executor-data) ;; see: Storm Source Code Analysis - Stream Grouping
        user-context (:user-context task-data)
        executor-stats (:stats executor-data)
        debug? (= true (storm-conf TOPOLOGY-DEBUG))]
    (fn
      ([^Integer out-task-id ^String stream ^List values]
        (when debug?
          (log-message "Emitting direct: " out-task-id "; " component-id " " stream " " values))
        (let [target-component (.getComponentId worker-context out-task-id)
              component->grouping (get stream->component->grouper stream)
              grouping (get component->grouping target-component)
              out-task-id (if grouping out-task-id)]
          (when (and (not-nil? grouping) (not= :direct grouping))
            (throw (IllegalArgumentException. "Cannot emitDirect to a task expecting a regular grouping")))
          (apply-hooks user-context .emit (EmitInfo. values stream task-id [out-task-id]))
          (when (emit-sampler)
            (builtin-metrics/emitted-tuple! (:builtin-metrics task-data) executor-stats stream)
            (stats/emitted-tuple! executor-stats stream)
            (if out-task-id
              (stats/transferred-tuples! executor-stats stream 1)
              (builtin-metrics/transferred-tuple! (:builtin-metrics task-data) executor-stats stream 1)))
          (if out-task-id [out-task-id])))
      ([^String stream ^List values]
        (when debug?
          (log-message "Emitting: " component-id " " stream " " values))
        (let [out-tasks (ArrayList.)]
          (fast-map-iter [[out-component grouper] (get stream->component->grouper stream)]
            (when (= :direct grouper)
              ;; TODO: this is wrong, need to check how the stream was declared
              (throw (IllegalArgumentException. "Cannot do regular emit to direct stream")))
            (let [comp-tasks (grouper task-id values)] ;; run the grouper to produce the target tasks
              (if (or (sequential? comp-tasks) (instance? Collection comp-tasks))
                (.addAll out-tasks comp-tasks)
                (.add out-tasks comp-tasks))))
          (apply-hooks user-context .emit (EmitInfo. values stream task-id out-tasks)) ;; run the registered emit hooks
          (when (emit-sampler) ;; when the sampler fires, update the emitted and transferred metrics in stats and builtin-metrics
            (stats/emitted-tuple! executor-stats stream)
            (builtin-metrics/emitted-tuple! (:builtin-metrics task-data) executor-stats stream)
            (stats/transferred-tuples! executor-stats stream (count out-tasks))
            (builtin-metrics/transferred-tuple! (:builtin-metrics task-data) executor-stats stream (count out-tasks)))
          out-tasks)))))
```
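Stripped of metrics and hooks, the two arities above reduce to the following target-task selection logic, sketched here in Python with hypothetical stand-ins for stream->component->grouper and the task-to-component mapping:

```python
# Sketch of the two tasks-fn arities, using plain dicts in place of
# Storm's stream->component->grouper structure. All data is hypothetical.

stream_to_component_to_grouper = {
    "s-regular": {
        "bolt-a": lambda task_id, values: [3, 4],  # regular grouper -> target tasks
        "bolt-b": lambda task_id, values: [5],
    },
    "s-direct": {"bolt-c": ":direct"},             # declared as a direct stream
}
out_task_to_component = {3: "bolt-a", 4: "bolt-a", 5: "bolt-b", 9: "bolt-c"}

def emit_direct(out_task_id, stream, values):
    """Direct arity: validate out_task_id, return [out_task_id] or []."""
    target_component = out_task_to_component.get(out_task_id)
    grouping = stream_to_component_to_grouper.get(stream, {}).get(target_component)
    if grouping is not None and grouping != ":direct":
        raise ValueError("Cannot emitDirect to a task expecting a regular grouping")
    # out-task-id survives only if the stream really reaches its component
    return [out_task_id] if grouping else []

def emit(stream, values, task_id=7):
    """Regular arity: run every grouper for the stream, collect target tasks."""
    out_tasks = []
    for component, grouper in stream_to_component_to_grouper.get(stream, {}).items():
        if grouper == ":direct":
            raise ValueError("Cannot do regular emit to direct stream")
        out_tasks.extend(grouper(task_id, values))
    return out_tasks
```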
1.4 get-task-object
Fetches the component's object.
For a spout, for example, it extracts the ComponentObject spout_object from the SpoutSpec; this contains the spout's logic, e.g. nextTuple().
```clojure
(defn- get-task-object [^TopologyContext topology component-id]
  (let [spouts (.get_spouts topology)
        bolts (.get_bolts topology)
        state-spouts (.get_state_spouts topology)
        obj (Utils/getSetComponentObject
              (cond
                (contains? spouts component-id) (.get_spout_object ^SpoutSpec (get spouts component-id))
                (contains? bolts component-id) (.get_bolt_object ^Bolt (get bolts component-id))
                (contains? state-spouts component-id) (.get_state_spout_object ^StateSpoutSpec (get state-spouts component-id))
                true (throw-runtime "Could not find " component-id " in " topology)))
        obj (if (instance? ShellComponent obj)
              (if (contains? spouts component-id)
                (ShellSpout. obj)
                (ShellBolt. obj))
              obj)
        obj (if (instance? JavaObject obj)
              (thrift/instantiate-java-object obj)
              obj)]
    obj))
```
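The cond in get-task-object is just an ordered lookup across the three component maps, failing loudly on a miss. A simplified Python sketch (hypothetical topology structure, omitting the ShellComponent/JavaObject wrapping):

```python
# Simplified sketch of get-task-object's cond: an ordered lookup over
# the spout, bolt, and state-spout maps, raising if nothing matches.
# The topology structure below is hypothetical.

topology = {
    "spouts": {"word-spout": "spout-object"},
    "bolts": {"count-bolt": "bolt-object"},
    "state_spouts": {},
}

def get_task_object(topology, component_id):
    for section in ("spouts", "bolts", "state_spouts"):
        if component_id in topology[section]:
            return topology[section][component_id]
    raise RuntimeError("Could not find %s in topology" % component_id)
```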
This article is excerpted from the cnblogs blog; originally published 2013-07-31.