IDEA WordCount: uploading the jar to Spark, debugging and troubleshooting
Based on:
macOS
Spark 2.4.3
(Spark running in standalone mode; reference blog: http://blog.itpub.net/69908925/viewspace-2644303/ )
scala 2.12.8
IDEA 2019
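1 In IDEA: File > Project Structure > Libraries > Scala SDK
Select version 2.11.12.
The version chosen here must match the Scala version that Spark runs on. The default is the locally installed Scala 2.12.8, which causes a main class error when the job runs on Spark.
2 Create a new project and add the dependencies to pom.xml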
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.ny.service</groupId>
  <artifactId>scala517</artifactId>
  <version>1.0</version>
  <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
  <dependencies>
    <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
    <!-- Change the dependencies below to match your own Scala, Spark, and Hadoop versions -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.11.12</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.4.3</version>
    </dependency>
  </dependencies>
  <build>
    <!-- Main source directory; change it to your own path. If you have test files, also add a testSourceDirectory -->
    <sourceDirectory>src/main/scala</sourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <version>2.15.2</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.3</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <filters>
                <filter>
                  <artifact>*:*</artifact>
                  <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                  </excludes>
                </filter>
              </filters>
              <!--<transformers>-->
              <!--<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">-->
              <!--<mainClass></mainClass>-->
              <!--</transformer>-->
              <!--</transformers>-->
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <addClasspath>true</addClasspath>
              <useUniqueVersions>false</useUniqueVersions>
              <classpathPrefix>lib/</classpathPrefix>
              <!-- Change to your own package.ClassName (right-click the class, then Copy Reference) -->
              <mainClass>com.ny.service.WordCount</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
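For scala-library, choose the Scala version that ships with Spark, 2.11.12, which is also the latest version this Spark build supports.
For the org.apache.spark dependency, likewise choose the 2.11 build (spark-core_2.11).
Otherwise a main class error occurs: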
19/05/16 10:52:03 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:60010 (size: 22.9 KB, free: 366.3 MB)
19/05/16 10:52:03 INFO SparkContext: Created broadcast 0 from textFile at WordCount.scala:18
Exception in thread "main" java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction2$mcIII$sp
at com.nyc.WordCount$.main(WordCount.scala:24)
at com.nyc.WordCount.main(WordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
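An error like this usually points to code compiled against one Scala binary version (here 2.12) running on a different one (2.11). The matching rule can be sketched as a small check: the major.minor part of the scala-library version should equal the suffix of the Spark artifact. The helper name scala_matches_artifact is made up for illustration and is not part of Spark or Maven.

```shell
# Hypothetical helper: succeed if a full Scala version (e.g. 2.11.12) belongs
# to the binary version encoded in an artifact suffix (e.g. spark-core_2.11).
scala_matches_artifact() {
  scala_version="$1"
  artifact_id="$2"
  binary="$(printf '%s' "$scala_version" | cut -d. -f1,2)"  # 2.11.12 -> 2.11
  suffix="${artifact_id##*_}"                                # spark-core_2.11 -> 2.11
  [ "$binary" = "$suffix" ]
}

# The combination used in the pom.xml of this post
scala_matches_artifact 2.11.12 spark-core_2.11 && echo "versions agree"
# The locally installed Scala 2.12.8 against the same artifact
scala_matches_artifact 2.12.8 spark-core_2.11 || echo "mismatch"
```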
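How to check the Scala version used by Spark
Go to the directory below and read the version from the scala-library jar file name:
/usr/local/opt/spark-2.4.3/jars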
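The lookup can also be scripted. This is a minimal sketch, assuming the jar follows the usual scala-library-<version>.jar naming; the function name spark_scala_version is hypothetical.

```shell
# Hypothetical helper: print the Scala version bundled with a Spark install,
# based on the scala-library-<version>.jar file name in its jars directory.
spark_scala_version() {
  spark_home="$1"
  for jar in "$spark_home"/jars/scala-library-*.jar; do
    # Skip the unexpanded pattern when no jar matches.
    [ -e "$jar" ] || continue
    name="${jar##*/}"              # strip the directory part
    name="${name#scala-library-}"  # strip the prefix
    printf '%s\n' "${name%.jar}"   # strip the .jar suffix
  done
}

# Usage with the install path from this post:
# spark_scala_version /usr/local/opt/spark-2.4.3
```

Running bin/spark-submit --version from the Spark home directory should report the same Scala version in its banner.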
3 WordCount test script
package com.ny.service

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // 1 Create the configuration
    val conf = new SparkConf().setAppName("wc")
    // 2 Create the SparkContext sc
    val sc = new SparkContext(conf)
    // 3 Processing logic
    // Read the input file
    val lines = sc.textFile(args(0))
    // Flatten each line into words
    val words = lines.flatMap(_.split(" "))
    // Map each word to a (word, 1) pair
    val k2v = words.map((_, 1))
    // Sum the counts for each word
    val results = k2v.reduceByKey(_ + _)
    // Save the results
    results.saveAsTextFile(args(1))
    // 4 Close the connection
    sc.stop()
  }
}
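4 Package
Move the packaged jar into the Spark home directory and rename it wc.jar. Because Spark runs in standalone mode, the Hadoop cluster was not started.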
nancylulululu:spark-2.4.3 nancy$ mv /Users/nancy/IdeaProjects/scala517/target/original-scala517-1.0.jar wc.jar
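Before submitting, it can help to confirm that the jar really contains the class that will be passed to --class. This is a rough sketch that reads a jar listing on stdin; the helper name listing_has_class is made up, and the usage line assumes the JDK's jar tool is on the PATH.

```shell
# Hypothetical helper: read a jar listing on stdin (one entry per line, as
# printed by `jar tf`) and succeed if it contains the .class entry for the
# given fully qualified class name.
listing_has_class() {
  entry="$(printf '%s' "$1" | tr '.' '/').class"  # com.a.B -> com/a/B.class
  grep -qxF "$entry"                              # exact, literal line match
}

# Usage, from the Spark home directory:
# jar tf wc.jar | listing_has_class com.ny.service.WordCount && echo "main class found"
```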
5 Run with spark-submit
bin/spark-submit \
  --class com.ny.service.WordCount \
  --master spark://localhost:7077 \
  ./wc.jar \
  file:///usr/local/opt/spark-2.4.3/test/1test \
  file:///usr/local/opt/spark-2.4.3/test/out
If the files are on Hadoop, replace the file:// paths with HDFS paths.
Check the result files:
nancylulululu:out nancy$ ls
_SUCCESS part-00000 part-00001
nancylulululu:out nancy$ cat part-00000
(scala,2)
(hive,1)
(mysql,1)
(hello,5)
(java,2)
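For a small input file, the Spark output can be cross-checked with standard Unix tools. This is a sketch under one assumption: words are separated by single spaces. Empty tokens are dropped here, whereas split(" ") in the Scala job would keep them. The function name local_word_count is hypothetical.

```shell
# Hypothetical helper: count space-separated words in a text file and print
# (word,count) pairs in the same shape as the Spark output, sorted by word.
local_word_count() {
  tr ' ' '\n' < "$1" |
    grep -v '^$' |   # drop empty tokens
    sort |
    uniq -c |        # prefix each distinct word with its count
    awk '{ printf "(%s,%d)\n", $2, $1 }'
}

# Usage with the input path from this post:
# local_word_count /usr/local/opt/spark-2.4.3/test/1test
```

Spark spreads its pairs over several part files in no particular order, so sorting the concatenated part files (cat part-* | sort) makes the two outputs easier to compare.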
Source: ITPUB blog, link: http://blog.itpub.net/69908925/viewspace-2644643/. If you repost this article, please credit the source; otherwise legal action will be pursued.