Remote Debugging Spark 1.5.0

Posted by 五柳-先生 on 2015-11-15

Author: 搖擺少年夢
WeChat ID: zhouzhihubeyond

Prerequisites

  1. A Spark cluster is already installed; this example uses spark-1.5.0. For installation, see: http://blog.csdn.net/lovehuangjiaju/article/details/48494737
  2. Intellij IDEA is already installed; this example uses Intellij IDEA 14.1.4. For installation, see: http://blog.csdn.net/lovehuangjiaju/article/details/48577281

Remote debugging walkthrough

  1. Open Intellij IDEA and choose File->New->Project.

  2. Select Scala, then click Next.

  3. Configure the JDK and Scala versions, enter a project name, then click Finish.

  4. Import spark-assembly-1.5.0-hadoop2.4.0.jar
    Go to File->Project Structure->Libraries.
    Click the "+" button and select Java.
    Locate the spark-1.5.0 installation directory and select spark-assembly-1.5.0-hadoop2.4.0.jar. On my machine the jar is at
    /hadoopLearning/spark-1.5.0-bin-hadoop2.4/lib/spark-assembly-1.5.0-hadoop2.4.0.jar. Then click Finish.
    Finally, click "OK" to complete the import.

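  5. Attach the spark-1.5.0 source code
    Under External Libraries, expand spark-assembly-1.5.0-hadoop2.4.0.jar.
    Navigate to org->apache->spark.
    Open any file in the packages below it; on my machine I chose "SparkContext.class". By default Intellij IDEA decompiles the .class file, but the decompiled code has no comments. Click "Attach Sources" in the upper-right corner.

    Select the source directory (on my machine the source is unpacked in /hadoopLearning/spark-1.5.0), then click "OK".
    IDEA then lists the source root directories it found.
    Select all of them and click "OK". The editor now shows the attached source instead of the decompiled code, and you will notice many more comments.

The source-reading environment is now set up.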

  6. Start the spark-1.5.0 cluster
    root@sparkmaster:/hadoopLearning/spark-1.5.0-bin-hadoop2.4/sbin# ./start-all.sh

  7. Modify the spark-class script
    On this machine the spark-class script is in the /hadoopLearning/spark-1.5.0-bin-hadoop2.4/bin directory.
    Change the following line in the script

<code>done < <("$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@")</code>

to

<code>done < <("$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main $JAVA_OPTS "$@")</code>

    Then run the following statement on the command line:
    export JAVA_OPTS="$JAVA_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005"
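A caveat on the export above: because JAVA_OPTS is exported, later commands in the same shell that go through the modified spark-class would presumably pick up the same options, and with suspend=y each such JVM waits for a debugger. The following is a minimal sketch that scopes the options to a single command instead. The names `with_debug` and `DEBUG_OPTS` are hypothetical and not part of Spark.

```shell
# Debug options from this walkthrough, kept in an ordinary (unexported) variable.
DEBUG_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005"

# Hypothetical helper: run one command with the JDWP options appended to
# JAVA_OPTS, without exporting them to the rest of the shell session.
with_debug() {
  # The assignment prefix applies only to the environment of the command in "$@".
  JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }$DEBUG_OPTS" "$@"
}

# Example: show the value that the child process sees.
with_debug sh -c 'echo "$JAVA_OPTS"'
```

With this in place, something like `with_debug ./spark-submit ...` would apply the options to that invocation only. Alternatively, running `unset JAVA_OPTS` after the debugging session has the same effect for later commands.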

  8. Create a Spark application for testing
    Select the src folder in the project, then right-click and choose New->Scala Class.
    Then select Object.
    Name it SparkWordCount, click OK, and enter the following code:
<code>import org.apache.spark.SparkContext._
import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {
  def main(args: Array[String]): Unit = {
    // Both paths are required, matching the usage message
    if (args.length < 2) {
      System.err.println("Usage: SparkWordCount <inputfile> <outputfile>")
      System.exit(1)
    }

    val conf = new SparkConf().setAppName("SparkWordCount")
    val sc = new SparkContext(conf)

    // Read the input file, split each line on spaces, and count each word
    val file = sc.textFile(args(0))
    val counts = file.flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    // Write the (word, count) pairs to the output path
    counts.saveAsTextFile(args(1))

    sc.stop()
  }
}</code>
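For a small local copy of the input, the counts can be cross-checked without Spark. The sketch below approximates the job with standard Unix tools. `local_word_count` is a hypothetical helper, and its results can differ from Spark's in edge cases such as trailing spaces, because Java's `split` drops trailing empty strings.

```shell
# Hypothetical helper: count space-separated words in a local file and
# print them in the (word,count) form that saveAsTextFile writes for tuples.
local_word_count() {
  tr ' ' '\n' < "$1" | sort | uniq -c | awk '{ printf "(%s,%s)\n", $2, $1 }'
}

# Example with a throwaway sample file
sample=$(mktemp)
printf 'a b a\nb c\n' > "$sample"
local_word_count "$sample"
rm -f "$sample"
```

Spark does not sort its output, so sort the contents of the job's part files before comparing them with this listing.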

  9. Package the Spark application
    Select the project, then choose File->Project Structure.
    Select Artifacts.
    Click the "+" button, then choose "Jar"->"From modules with dependencies".

    Select SparkWordCount as the Main Class.

    At run time a Spark application automatically loads spark-assembly-1.5.0-hadoop2.4.0.jar and related jars. To keep the resulting jar small, you can remove spark-assembly-1.5.0-hadoop2.4.0.jar and the other jars from the artifact so that they are not packaged.
    When done, click "OK".

    Next choose "Build"->"Build Artifacts".
    In the Action menu, select "Build".

    After the build, the generated jar file appears in the corresponding directory. On this machine that directory is:
    /root/IdeaProjects/SparkRemoteDebugPeoject/out/artifacts/SparkRemoteDebugPeoject_jar
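spark-submit expects the path of the application jar itself, so it helps to confirm which jar file the build placed in the artifacts directory. The following is a minimal sketch; `list_artifact_jars` is a hypothetical helper, and the directory is the one from this walkthrough.

```shell
# Hypothetical helper: print every .jar file found directly in a directory.
list_artifact_jars() {
  dir=$1
  for f in "$dir"/*.jar; do
    # If nothing matches, the glob stays literal, so skip non-existent paths.
    [ -e "$f" ] && printf '%s\n' "$f"
  done
  return 0
}

# Directory from this walkthrough; prints nothing if it does not exist.
list_artifact_jars /root/IdeaProjects/SparkRemoteDebugPeoject/out/artifacts/SparkRemoteDebugPeoject_jar
```

To check that the class was packaged, the JDK's `jar tf` command lists the contents of a jar, and its output can be searched for SparkWordCount.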

  10. Submit the code to the cluster with spark-submit

<code>root@sparkmaster:/hadoopLearning/spark-1.5.0-bin-hadoop2.4/bin# ./spark-submit --master spark://sparkmaster:7077 --class SparkWordCount --executor-memory 1g /root/IdeaProjects/SparkRemoteDebugPeoject/out/artifacts/SparkRemoteDebugPeoject_jar hdfs://ns1/README.md hdfs://ns1/SparkWordCountResult
// Note this line
Listening for transport dt_socket at address: 5005
</code>
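The "Listening for transport dt_socket" line signals that the JVM has started and is waiting for a debugger. If the submit step is scripted, that line can be detected before attaching from the IDE. The following is a minimal sketch, assuming the spark-submit output is piped in or read from a log file; `jdwp_port_from_log` is a hypothetical helper.

```shell
# Hypothetical helper: read log text on stdin and print the port from the
# first "Listening for transport dt_socket at address: N" line, if any.
jdwp_port_from_log() {
  sed -n 's/^Listening for transport dt_socket at address: \([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Example with canned output instead of a real spark-submit run
printf '%s\n' 'some earlier output' \
  'Listening for transport dt_socket at address: 5005' | jdwp_port_from_log
```

An empty result means the line has not appeared yet, so a calling script could retry before prompting the user to attach.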

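  11. Configure remote debugging in Intellij IDEA
    Go to Run->Edit Configurations.
    Find Remote.
    Click the "+" button and name the configuration Spark_Remote_Debug. Leave the other settings at the defaults that Intellij IDEA provides.
    When done, click OK.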

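  12. Start remote debugging
    Set a breakpoint in the source code; this example sets one in SparkSubmit.scala.

    Then press F9 and select Spark_Remote_Debug.
    The console then shows: Connected to the target VM, address: 'localhost:5005', transport: 'socket'.
    In the Debugger view you can see that the program has stopped at the breakpoint set in the SparkSubmit source.

Remote debugging has now begun. Enjoy exploring the Spark source code.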

Finally, a note on the debugging parameters.
See: http://www.thebigdata.cn/QiTa/12370.html

<code>-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005</code>

Parameter descriptions:
  -Xdebug: enables debugging features.
  -Xrunjdwp: enables the JDWP implementation and takes several sub-options:
  transport=dt_socket: the transport used between the JPDA front-end and back-end. dt_socket means socket transport.
  address=5005: the JVM listens for requests on port 5005. Any port that does not conflict with another service will do.
  server=y: y means the launched JVM is the debuggee and listens for a debugger to attach. With n, the JVM instead connects out to a debugger that is already listening at the given address.
  suspend=y: y means the launched JVM pauses until a debugger connects and only then continues. With suspend=n the JVM does not pause.
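On Java 5 and later, the same settings can also be written in the `-agentlib:jdwp` form, which the JDK documentation prefers over `-Xdebug -Xrunjdwp`. The following small sketch builds that option string for a given port and suspend mode. `jdwp_opts` is a hypothetical helper, and its defaults mirror the values used in this walkthrough.

```shell
# Hypothetical helper: print a JDWP agent option string.
#   $1 = port (default 5005), $2 = suspend mode, y or n (default y)
jdwp_opts() {
  port=${1:-5005}
  suspend=${2:-y}
  printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=$suspend,address=$port"
}

# Same settings as the walkthrough
jdwp_opts
# A variant that does not block JVM startup and listens on another port
jdwp_opts 8000 n
```

The suspend=n variant is convenient for long-running processes where the debugger is attached only occasionally. suspend=y, as used above, is what makes it possible to hit breakpoints early in SparkSubmit.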

Reposted from: http://blog.csdn.net/lovehuangjiaju/article/details/49227919
