Spark Stage
Concept
A stage is a set of parallel tasks ① all computing the same function that need to run as part of a Spark job, where all the tasks have the same shuffle dependencies. Each DAG of tasks run by the scheduler is split up into stages at the boundaries where shuffle occurs, and then the DAGScheduler runs these stages in topological order.
Each Stage can ② either be a shuffle map stage, in which case its tasks' results are input for other stage(s), or a result stage, in which case its tasks directly compute a Spark action (e.g. count(), save(), etc.) by running a function on an RDD. ③For shuffle map stages, we also track the nodes that each output partition is on.
Each Stage also has a firstJobId, identifying the job that first submitted the stage. When FIFO scheduling is used, this ④ allows Stages from earlier jobs to be computed first or recovered faster on failure. Finally, a single stage can be re-executed in multiple attempts due to fault recovery. In that case, the Stage object will track multiple StageInfo objects to pass to listeners or the web UI. The latest one will be accessible through latestInfo.
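As a hedged illustration of the stage split (a minimal spark-shell style sketch; the input path is hypothetical), the word count below compiles into two stages: reduceByKey introduces a shuffle dependency, so textFile, flatMap, and map are pipelined into one shuffle map stage, and count() runs in the final result stage.

import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch: one shuffle => two stages.
val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("stage-demo"))

val counts = sc.textFile("README.md")   // hypothetical input path
  .flatMap(_.split("\\s+"))             // narrow dependency: pipelined into stage 0
  .map(word => (word, 1))               // narrow dependency: still stage 0
  .reduceByKey(_ + _)                   // shuffle dependency: stage boundary

counts.count()                          // action: submits the job; stage 1 is the ResultStage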
Code Walkthrough
[DAGScheduler]->private[scheduler] def handleJobSubmitted
{
  var finalStage: ResultStage = null
  try {
    /** ② There are only two kinds of stage: a shuffle map stage and a
     * result stage, and the result stage is always the stage containing the
     * RDD on which the action was called. Parameter meaning: func is the
     * operation applied to each partition and varies with the action; for
     * count(), func computes the size of each partition, and the final
     * result is collected by the JobWaiter (touched on in the submitJob
     * method), which sums func's per-partition results.
     **/
    finalStage = newResultStage(finalRDD, func, partitions, jobId, callSite)
  } catch {
    case e: Exception =>
      logWarning("Creating new stage failed due to exception - job: " + jobId, e)
      listener.jobFailed(e)
      return
  }
  ...
  /** [1] The first stage submitted is always finalStage (the ResultStage).
   * submitStage then recursively walks the stage's dependencies until it
   * finds a stage with no missing dependencies, and only then generates a
   * TaskSet and submits it.
   **/
  submitStage(finalStage)
  /** [2] While recursively walking the dependency chain, any stage that
   * still has missing dependencies is put into the waiting queue for later
   * scheduling.
   **/
  submitWaitingStages()
}
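To make the comment about func concrete: count() is essentially a call to SparkContext.runJob where the per-partition function measures the size of that partition's iterator, and the driver sums the collected results. A minimal sketch under that reading (myCount is our own name, not Spark's):

import org.apache.spark.rdd.RDD

// Sketch: reimplementing count() via the public SparkContext.runJob API.
// Each task computes the size of its partition's iterator (the "func" in the
// comment above); the driver sums the per-partition results.
def myCount[T](rdd: RDD[T]): Long = {
  val partitionSizes: Array[Long] = rdd.sparkContext.runJob(
    rdd,
    (iter: Iterator[T]) => {
      var n = 0L
      while (iter.hasNext) { iter.next(); n += 1 }
      n
    })
  partitionSizes.sum
}

// Usage: myCount(sc.parallelize(1 to 1000)) == 1000L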
[DAGScheduler]->private def submitStage(stage: Stage)
{
  ...
  //[1]
  val missing = getMissingParentStages(stage).sortBy(_.id)
  logDebug("missing: " + missing)
  if (missing.isEmpty) {
    logInfo("Submitting " + stage + " (" + stage.rdd + "), which has no missing parents")
    //[1]
    submitMissingTasks(stage, jobId.get)
  } else {
    for (parent <- missing) {
      submitStage(parent)
    }
    //[2]
    waitingStages += stage
  }
  ...
}
[DAGScheduler]->getMissingParentStages(stage: Stage): List[Stage]
Stages are split according to whether a dependency is a ShuffleDependency (wide) or a NarrowDependency (narrow).
{
  ...
  for (dep <- rdd.dependencies) {
    dep match {
      case shufDep: ShuffleDependency[_, _, _] =>
        val mapStage = getShuffleMapStage(shufDep, stage.firstJobId)
        if (!mapStage.isAvailable) {
          missing += mapStage
        }
      case narrowDep: NarrowDependency[_] =>
        //[2]
        waitingForVisit.push(narrowDep.rdd)
    }
  }
  ...
}
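The recursion above can be boiled down to a self-contained sketch (toy Dep and Node types of our own, not Spark's classes): follow narrow dependencies within the current stage, and record a parent stage at every shuffle dependency.

// Simplified, self-contained model of stage splitting (toy types, not Spark's).
sealed trait Dep { def parent: Node }
case class Narrow(parent: Node) extends Dep   // e.g. map, filter
case class Shuffle(parent: Node) extends Dep  // e.g. reduceByKey

case class Node(name: String, deps: List[Dep])

// Mirror of getMissingParentStages: follow narrow dependencies within the
// current stage, and start a new parent stage at every shuffle dependency.
def parentStageRoots(root: Node): List[Node] = {
  val waitingForVisit = scala.collection.mutable.Stack(root)
  val visited = scala.collection.mutable.Set[Node]()
  val missing = scala.collection.mutable.ListBuffer[Node]()
  while (waitingForVisit.nonEmpty) {
    val node = waitingForVisit.pop()
    if (visited.add(node)) {
      node.deps.foreach {
        case Shuffle(parent) => missing += parent            // stage boundary
        case Narrow(parent)  => waitingForVisit.push(parent) // same stage, keep walking
      }
    }
  }
  missing.toList
}

// Example lineage: words --narrow--> pairs --shuffle--> counts
val words  = Node("words", Nil)
val pairs  = Node("pairs", List(Narrow(words)))
val counts = Node("counts", List(Shuffle(pairs)))
parentStageRoots(counts)  // List(pairs): the root of the parent shuffle map stage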
ShuffleMapStage
Concept
ShuffleMapStages are intermediate stages in the execution DAG that produce data for a shuffle. ⑤They occur right before each shuffle operation, and might contain multiple pipelined operations before that (e.g. map and filter). When executed, ⑥they save map output files that can later be fetched by reduce tasks. ⑦The shuffleDep field describes the shuffle each stage is part of, and ⑧variables like outputLocs and numAvailableOutputs track how many map outputs are ready.

ShuffleMapStages can also be submitted independently as jobs with DAGScheduler.submitMapStage. For such stages, the ActiveJobs that submitted them are tracked in mapStageJobs. ⑨Note that there can be multiple ActiveJobs trying to compute the same shuffle map stage.
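A quick way to observe the saved map output files being reused (a spark-shell sketch): run two actions on the same shuffled RDD, and the second job skips the shuffle map stage in the web UI because the map outputs already exist.

// Sketch (spark-shell): two jobs over the same shuffled RDD. Job 0 runs the
// shuffle map stage and its result stage; job 1 reuses the saved map output
// files, so the web UI shows the map stage as "skipped".
val shuffled = sc.parallelize(1 to 100).map(i => (i % 10, i)).reduceByKey(_ + _)
shuffled.count()    // job 0: shuffle map stage + result stage
shuffled.collect()  // job 1: only a new result stage; map outputs are reused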
Code Walkthrough
⑤ When splitting stages, a shuffle map stage contains all the non-shuffle operations that follow the previous shuffle, such as map and filter.
⑥ Output information is maintained for each partition:
/**
 * List of [[MapStatus]] for each partition. The index of the array is the
 * map partition id, and each value in the array is the list of possible
 * [[MapStatus]] for a partition (a single task might run multiple times).
 * ③⑧ Records the location and status of the current RDD's output: which
 * executor each partition will run on and produce its output. The
 * DAGScheduler uses this information when scheduling tasks.
 **/
private[this] val outputLocs = Array.fill[List[MapStatus]](numPartitions)(Nil)
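As a self-contained sketch of what this bookkeeping amounts to (our own toy MapStatus and OutputTracker, not Spark's classes): one slot per map partition, each holding the statuses of successful attempts, with numAvailableOutputs counting the partitions that have at least one.

// Toy model of the outputLocs bookkeeping (our own MapStatus stand-in,
// not Spark's org.apache.spark.scheduler.MapStatus).
case class MapStatus(executorHost: String)

class OutputTracker(numPartitions: Int) {
  // Index = map partition id; value = statuses of successful attempts
  // (a single task might run multiple times under fault recovery).
  private val outputLocs = Array.fill[List[MapStatus]](numPartitions)(Nil)

  def addOutputLoc(partition: Int, status: MapStatus): Unit =
    outputLocs(partition) = status :: outputLocs(partition)

  // Number of partitions with at least one registered map output.
  def numAvailableOutputs: Int = outputLocs.count(_.nonEmpty)

  // A shuffle map stage is "available" (isAvailable in the walkthrough
  // above) once every partition has produced an output.
  def isAvailable: Boolean = numAvailableOutputs == numPartitions
}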
[DAGScheduler]->submitMissingTasks(stage: Stage, jobId: Int)
...
val partitionsToCompute: Seq[Int] = stage.findMissingPartitions()
...
val taskIdToLocations: Map[Int, Seq[TaskLocation]] = try {
  stage match {
    case s: ShuffleMapStage =>
      partitionsToCompute.map { id => (id, getPreferredLocs(stage.rdd, id)) }.toMap
    case s: ResultStage =>
      val job = s.activeJob.get
      partitionsToCompute.map { id =>
        val p = s.partitions(id)
        (id, getPreferredLocs(stage.rdd, p))
      }.toMap
  }
}
...
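The preferred locations above ultimately come from RDD.preferredLocations; for an HDFS-backed RDD these are the hosts holding each block, which the scheduler uses for data locality. A quick inspection sketch (the HDFS path is hypothetical):

// Sketch (spark-shell): inspect the preferred locations the scheduler would
// use when placing tasks for each partition.
val logs = sc.textFile("hdfs:///data/events.log")  // hypothetical path
logs.partitions.foreach { p =>
  println(s"partition ${p.index} -> ${logs.preferredLocations(p).mkString(", ")}")
}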
⑦ shuffleDep describes the whole shuffle: each stage's shuffleDep field identifies which shuffle that stage belongs to and what operations it should perform. This information is needed when the stage is submitted for execution.
class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag](
    @transient private val _rdd: RDD[_ <: Product2[K, V]],
    val partitioner: Partitioner,
    val serializer: Serializer = SparkEnv.get.serializer,
    val keyOrdering: Option[Ordering[K]] = None,
    val aggregator: Option[Aggregator[K, V, C]] = None,
    val mapSideCombine: Boolean = false) {
  ...
}
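To see these fields populated in practice, one can inspect the dependency that reduceByKey creates (a spark-shell sketch): mapSideCombine is true and aggregator is defined, because reduceByKey combines values on the map side before shuffling.

import org.apache.spark.ShuffleDependency

// Sketch (spark-shell): reduceByKey produces a ShuffleDependency with
// map-side combine enabled and an aggregator defined.
val pairs   = sc.parallelize(1 to 100).map(i => (i % 10, i))
val reduced = pairs.reduceByKey(_ + _)

reduced.dependencies.head match {
  case dep: ShuffleDependency[_, _, _] =>
    println(s"partitioner     = ${dep.partitioner}")
    println(s"mapSideCombine  = ${dep.mapSideCombine}")  // true for reduceByKey
    println(s"aggregator set? = ${dep.aggregator.isDefined}")
  case other =>
    println(s"not a shuffle dependency: $other")
}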