Spark on YARN
client
cluster
The essential difference between the two modes is where the driver runs (see the launch sketch after the process lists below).
1) Which processes are involved in each mode?
2) What is each process responsible for?
--master yarn --deploy-mode cluster
CoarseGrainedExecutorBackend (two executors by default)
CoarseGrainedExecutorBackend
SparkSubmit
ApplicationMaster
--master yarn --deploy-mode client
CoarseGrainedExecutorBackend (two executors by default)
CoarseGrainedExecutorBackend
SparkSubmit
ExecutorLauncher
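A minimal launch sketch using the org.apache.spark.launcher.SparkLauncher API: flipping setDeployMode between "cluster" and "client" is exactly what decides whether the driver runs inside the YARN ApplicationMaster or inside the local SparkSubmit JVM. The jar path and main class below are placeholders, not values from this article.

    import org.apache.spark.launcher.SparkLauncher

    object SubmitModes {
      def main(args: Array[String]): Unit = {
        val app = new SparkLauncher()
          .setAppResource("/path/to/app.jar")  // placeholder application jar
          .setMainClass("com.example.Main")    // placeholder user class
          .setMaster("yarn")
          .setDeployMode("cluster")            // switch to "client" to keep the driver local
          .launch()
        app.waitFor()                          // launch() returns a java.lang.Process
      }
    }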
Spark source code flow
SparkSubmit.main {
  // calls doSubmit
  val submit = new SparkSubmit()
  submit.doSubmit(args) {
    val appArgs = parseArguments(args) {
      mergeDefaultSparkProperties()  // merge in properties from spark-defaults
      ignoreNonSparkProperties()
      loadEnvironmentArguments() {
        action = Option(action).getOrElse(SUBMIT)
      }
    }
    submit(appArgs, uninitLog) {
      val (childArgs, childClasspath, sparkConf, childMainClass) = prepareSubmitEnvironment(args) {
        if (deployMode == CLIENT) {        // client mode
          childMainClass = args.mainClass  // the user-specified class
        }
        if (deployMode == CLUSTER) {       // cluster mode
          childMainClass = "org.apache.spark.deploy.yarn.YarnClusterApplication"  // a fixed class
        }
      }
      val mainClass = Utils.classForName(childMainClass)  // load the chosen main class
      val app: SparkApplication
      app.start(childArgs.toArray, sparkConf) {
        new Client(new ClientArguments(args), conf).run() {
          this.appId = submitApplication()
          // submits the application; the log line
          // "Requesting a new application from cluster with 1 NodeManagers"
          // ===> confirms the connection to YARN
          val containerContext = createContainerLaunchContext(newAppResponse)  // Client.scala
          {
            val amClass =
              if (isClusterMode) {
                Utils.classForName("org.apache.spark.deploy.yarn.ApplicationMaster").getName  // cluster mode: ApplicationMaster
              } else {
                Utils.classForName("org.apache.spark.deploy.yarn.ExecutorLauncher").getName   // client mode: ExecutorLauncher
              }
          }
          val appContext = createApplicationSubmissionContext(newApp, containerContext)  // Client.scala
        }
      }
    }
  }
}
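The childMainClass handoff above relies on reflection. Here is a minimal, self-contained sketch of that pattern, assuming a simplified SparkApplication trait (the real trait's start also takes a SparkConf) and a hypothetical Demo user class: if the loaded class implements the trait, instantiate it; otherwise wrap its static main(), the way Spark's JavaMainApplication wrapper does.

    trait SparkApplication {
      def start(args: Array[String]): Unit  // simplified; the real one also takes a SparkConf
    }

    // Adapter for classes that only expose a static main(),
    // mirroring Spark's JavaMainApplication wrapper.
    class MainAdapter(klass: Class[_]) extends SparkApplication {
      override def start(args: Array[String]): Unit = {
        val mainMethod = klass.getMethod("main", classOf[Array[String]])
        mainMethod.invoke(null, args)  // static method, so the receiver is null
      }
    }

    object Demo {  // stands in for a user class with a plain main()
      def main(args: Array[String]): Unit = println("user main ran: " + args.mkString(" "))
    }

    object RunMainSketch {
      def run(childMainClass: String, childArgs: Array[String]): Unit = {
        val mainClass = Class.forName(childMainClass)
        val app: SparkApplication =
          if (classOf[SparkApplication].isAssignableFrom(mainClass)) {
            mainClass.getConstructor().newInstance().asInstanceOf[SparkApplication]
          } else {
            new MainAdapter(mainClass)  // wrap a plain main()
          }
        app.start(childArgs)
      }

      def main(args: Array[String]): Unit = run("Demo", Array("--verbose"))
    }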
SparkSubmitArguments ==> the wrapper around the arguments parsed from the spark-submit command line
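A toy parser (not Spark's actual implementation) showing the shape of that wrapping, including the Option(action).getOrElse(SUBMIT) defaulting seen in loadEnvironmentArguments:

    object ArgsSketch {
      sealed trait Action
      case object SUBMIT extends Action
      case object KILL extends Action

      final case class Parsed(master: String = "", deployMode: String = "", action: Action = null)

      // Fold the flags into fields, then default the action to SUBMIT
      // when none was given.
      def parse(args: List[String], acc: Parsed = Parsed()): Parsed = args match {
        case "--master" :: v :: rest      => parse(rest, acc.copy(master = v))
        case "--deploy-mode" :: v :: rest => parse(rest, acc.copy(deployMode = v))
        case "--kill" :: rest             => parse(rest, acc.copy(action = KILL))
        case Nil                          => acc.copy(action = Option(acc.action).getOrElse(SUBMIT))
        case _ :: rest                    => parse(rest, acc)  // skip flags this sketch ignores
      }

      def main(args: Array[String]): Unit =
        println(parse(List("--master", "yarn", "--deploy-mode", "cluster")))
    }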
ApplicationMaster:
ApplicationMaster.main {
  master = new ApplicationMaster(amArgs)
  master.run()
  runImpl() {
    if (isClusterMode) {
      runDriver() {
        userClassThread = startUserApplication() {
          val mainMethod = userClassLoader.loadClass(args.userClass).getMethod("main", classOf[Array[String]])
          mainMethod.invoke(null, userArgs.toArray)
        }
        val sc = ThreadUtils.awaitResult(sparkContextPromise.future)  // wait for the user thread to create the SparkContext
        registerAM(host, port, userConf, sc.ui.map(_.webUrl))
        createAllocator(driverRef, userConf) {
          allocator.allocateResources() {
            handleAllocatedContainers(allocatedContainers.asScala) {
              runAllocatedContainers(containersToUse) {
                // launch the containers
                for (container <- containersToUse) {  // two containers are launched
                  launcherPool.execute(runnable) {    // run()
                    val commands = prepareCommand() {
                      // the command starts org.apache.spark.executor.CoarseGrainedExecutorBackend
                    }
                    nmClient.startContainer(container.get, ctx)
                  }
                }
              }
            }
          }
        }
      }
    } else {
      runExecutorLauncher()
    }
  }
}
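The runDriver/startUserApplication handoff above hinges on a Promise: the user-class thread completes it once the SparkContext is up, and the AM's main thread blocks on the matching Future before registering with the ResourceManager and allocating containers. A minimal standalone sketch of that pattern (a String stands in for the real SparkContext):

    import scala.concurrent.{Await, Promise}
    import scala.concurrent.duration._

    object DriverHandoffSketch {
      def main(args: Array[String]): Unit = {
        val sparkContextPromise = Promise[String]()

        // Mirrors startUserApplication: run the user's main() in a thread
        // named "Driver"; here the body just simulates SparkContext startup.
        val userThread = new Thread(() => {
          Thread.sleep(500)  // pretend to initialize a SparkContext
          sparkContextPromise.success("SparkContext ready")
        }, "Driver")
        userThread.start()

        // Mirrors ThreadUtils.awaitResult(sparkContextPromise.future): block
        // until the driver thread reports the context, then proceed to
        // registerAM / createAllocator.
        val sc = Await.result(sparkContextPromise.future, 10.seconds)
        println(sc)
        userThread.join()
      }
    }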
Recommended reading:
https://blog.csdn.net/weixin_43866666/article/details/121743559