Spark 1.5.0 Remote Debugging
Author: 搖擺少年夢
WeChat ID: zhouzhihubeyond
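Prerequisites
- A Spark cluster is already installed; this example uses spark-1.5.0. For installation instructions, see: http://blog.csdn.net/lovehuangjiaju/article/details/48494737
- IntelliJ IDEA is already installed; this example uses IntelliJ IDEA 14.1.4. For installation instructions, see: http://blog.csdn.net/lovehuangjiaju/article/details/48577281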
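Remote debugging procedure
1. Open IntelliJ IDEA and choose File -> New -> Project
2. Select Scala, then click Next
3. Configure the JDK and Scala versions, enter a project name, then click Finish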
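4. Import spark-assembly-1.5.0-hadoop2.4.0.jar
Go to File -> Project Structure -> Libraries
Click the "+" button and select Java
Locate the spark-1.5.0 installation directory and select spark-assembly-1.5.0-hadoop2.4.0.jar. On my machine the jar is located at
/hadoopLearning/spark-1.5.0-bin-hadoop2.4/lib/spark-assembly-1.5.0-hadoop2.4.0.jar; then click Finish
Finally, click "OK" to complete the import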
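5. Attach the spark-1.5.0 source code
Under External Libraries, expand spark-assembly-1.5.0-hadoop2.4.0.jar
Navigate to org -> apache -> spark
Open any class file in that package; on my machine I chose "SparkContext.class". By default IntelliJ IDEA decompiles the .class file, but the decompiled code has no comments. To see the real source, click "Attach Sources" in the upper-right corner
Select the source directory; on my machine the source is unpacked in /hadoopLearning/spark-1.5.0. Then click "OK"
IDEA then lists the source root directories it found
Select all of them and click "OK". The editor now shows the attached source rather than the decompiled code, and you will notice that it contains many more comments
The source-reading environment is now set up.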
6. Start the spark-1.5.0 cluster
root@sparkmaster:/hadoopLearning/spark-1.5.0-bin-hadoop2.4/sbin# ./start-all.sh
7. Modify the spark-class script
On this machine the spark-class script is located in /hadoopLearning/spark-1.5.0-bin-hadoop2.4/bin
Change the following line in the script
```bash
done < <("$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@")
```
to
```bash
done < <("$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main $JAVA_OPTS "$@")
```
Then run the following statement on the command line:
export JAVA_OPTS="$JAVA_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005"
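The following is a minimal sketch of how the debug options from this step could be assembled and inspected before anything is launched. The helper names `jdwp_opts` and `count_args` are hypothetical and exist only for illustration; only the option string itself comes from the step above. The second helper shows that an unquoted expansion, as in the modified spark-class line, is split by the shell into two separate arguments.

```shell
# Hypothetical helper: build the JDWP option string used in this step.
# Arguments: port (default 5005), suspend flag (default y).
jdwp_opts() {
  local port="${1:-5005}" suspend="${2:-y}"
  printf '%s' "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=${suspend},address=${port}"
}

# Hypothetical helper: count how many arguments the shell produces
# when the given string is expanded without quotes.
count_args() {
  set -- $1   # intentionally unquoted so that word splitting applies
  echo "$#"
}

JAVA_OPTS="$(jdwp_opts 5005 y)"
echo "JAVA_OPTS=$JAVA_OPTS"
echo "arguments when expanded unquoted: $(count_args "$JAVA_OPTS")"
```

Changing the first argument of `jdwp_opts` is a simple way to pick a different port if 5005 is already in use.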
8. Create a Spark application for testing
Select the src folder in the project, then right-click and choose New -> Scala Class
Then select Object
Name it SparkWordCount, click OK, and enter the following code
```scala
import org.apache.spark.SparkContext._
import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {
  def main(args: Array[String]): Unit = {
    // Input and output paths come from the command line, as the usage message states
    if (args.length < 2) {
      System.err.println("Usage: SparkWordCount <inputfile> <outputfile>")
      System.exit(1)
    }

    val conf = new SparkConf().setAppName("SparkWordCount")
    val sc = new SparkContext(conf)

    // Split each line on spaces, pair every word with 1, then sum the counts per word
    val file = sc.textFile(args(0))
    val counts = file.flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.saveAsTextFile(args(1))

    sc.stop()
  }
}
```
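To sanity-check the job on a small input without a cluster, the same split-on-space counting can be approximated with standard Unix tools. This is only a rough local stand-in, not the Spark job itself: `local_word_count` is a hypothetical helper, and unlike `line.split(" ")` it drops the empty tokens that consecutive spaces would produce.

```shell
# Hypothetical helper: count space-separated words read from stdin
# and print "word count" pairs sorted by word.
local_word_count() {
  tr -s ' ' '\n' |
    awk 'NF { counts[$0]++ } END { for (w in counts) print w, counts[w] }' |
    LC_ALL=C sort
}

# Small in-line sample instead of a real README.md
printf 'a b a\nc b a\n' | local_word_count
```

Comparing this output with the part files written by `saveAsTextFile` for the same small input is a quick way to confirm that the job ran as expected.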
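9. Package the Spark application
Select the project, then go to File -> Project Structure
Select Artifacts
Click the "+" button, then choose "JAR" -> "From modules with dependencies"
Select SparkWordCount as the Main Class
At runtime, a Spark application automatically loads spark-assembly-1.5.0-hadoop2.4.0.jar and related jars. To keep the resulting jar small, you can remove spark-assembly-1.5.0-hadoop2.4.0.jar and similar jars from the artifact so that they are not bundled.
When finished, click "OK"
Next, choose "Build" -> "Build Artifacts"
In the Action menu, choose "Build"
After the build, the generated jar file appears in the output directory. On this machine the directory is:
/root/IdeaProjects/SparkRemoteDebugPeoject/out/artifacts/SparkRemoteDebugPeoject_jar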
10. Submit the code to the cluster with spark-submit
```
root@sparkmaster:/hadoopLearning/spark-1.5.0-bin-hadoop2.4/bin# ./spark-submit --master spark://sparkmaster:7077 --class SparkWordCount --executor-memory 1g /root/IdeaProjects/SparkRemoteDebugPeoject/out/artifacts/SparkRemoteDebugPeoject_jar hdfs://ns1/README.md hdfs://ns1/SparkWordCountResult
// Note this line of output
Listening for transport dt_socket at address: 5005
```
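The "Listening for transport dt_socket" line is the signal that the JVM is suspended and waiting for a debugger. As a small sketch, assuming the launcher output has been captured or piped, the port can be extracted from that line so that a script can confirm which port to attach to. The helper name `jdwp_port_from_log` is hypothetical.

```shell
# Hypothetical helper: read launcher output on stdin and print the port
# from the "Listening for transport dt_socket at address: <port>" line.
jdwp_port_from_log() {
  sed -n 's/^Listening for transport dt_socket at address: \([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Example with the line shown above
echo 'Listening for transport dt_socket at address: 5005' | jdwp_port_from_log
```

If the helper prints nothing, the JVM was most likely started without the debug options, for example because `JAVA_OPTS` was not exported in the same shell.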
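11. Configure remote debugging in IntelliJ IDEA
Go to Run -> Edit Configurations
Find Remote
Click the "+" button and name the configuration Spark_Remote_Debug. Leave the other settings at their defaults, which IntelliJ IDEA has already filled in
When finished, click OK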
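12. Start remote debugging
Set a breakpoint in the source code; in this example the breakpoint is set in SparkSubmit.scala
Then press F9
Select Spark_Remote_Debug
The console then shows: Connected to the target VM, address: 'localhost:5005', transport: 'socket'
The Debugger tab shows that execution has stopped at the breakpoint set in the SparkSubmit source
Remote debugging is now under way. Enjoy exploring the Spark source code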
Finally, a note on the debug parameters:
See: http://www.thebigdata.cn/QiTa/12370.html
```
-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005
Parameter descriptions:
-Xdebug               Enables debugging features.
-Xrunjdwp             Enables the JDWP implementation; it takes several sub-options:
transport=dt_socket   The transport used between the JPDA front-end and back-end. dt_socket means socket transport.
address=5005          The JVM listens for requests on port 5005; any port that does not conflict with another service will do.
server=y              y means the launched JVM is the debuggee and listens for a debugger to attach. With n, the JVM instead connects out to a debugger already listening at the given address.
suspend=y             y means the launched JVM pauses and waits until a debugger connects before it continues. With suspend=n, the JVM does not wait.
```
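As a small illustration of how the `-Xrunjdwp` argument is structured, the sketch below splits it into its comma-separated sub-options and rewrites it in the `-agentlib:jdwp=` form that Java 5 and later JVMs accept with the same sub-options. Both helper names are hypothetical and are not part of Spark or the JDK.

```shell
# Hypothetical helper: print each sub-option of a -Xrunjdwp:... argument on its own line.
jdwp_suboptions() {
  printf '%s\n' "${1#-Xrunjdwp:}" | tr ',' '\n'
}

# Hypothetical helper: rewrite a -Xrunjdwp:... argument in -agentlib:jdwp=... form.
to_agentlib() {
  printf '%s' "-agentlib:jdwp=${1#-Xrunjdwp:}"
}

arg='-Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005'
jdwp_suboptions "$arg"
to_agentlib "$arg"; echo
```

Switching `suspend=y` to `suspend=n` in the same string lets the job start immediately, which suits attaching to an already running process rather than debugging startup code such as SparkSubmit.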
Reposted from: http://blog.csdn.net/lovehuangjiaju/article/details/49227919