Setting Up a Standalone-Mode Spark Environment on Windows
Java
Install Java 8, set JAVA_HOME, and add %JAVA_HOME%\bin to the PATH environment variable.
E:\>java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
Scala
Download and extract Scala, set SCALA_HOME, and add %SCALA_HOME%\bin to PATH.
E:\>scala -version
Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL
Spark
Download and extract Spark, set SPARK_HOME, and add %SPARK_HOME%\bin to PATH. Now try running spark-shell from the console; it fails with the following error, complaining that winutils.exe cannot be located.
E:\>spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/06/05 21:34:43 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:379)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:394)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:387)
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:2327)
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:365)
    at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
    at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:991)
    at org.apache.spark.repl.Main$.createSparkSession(Main.scala:92)
    at $line3.$read$$iw$$iw.<init>(<console>:15)
    at $line3.$read$$iw.<init>(<console>:42)
    at $line3.$read.<init>(<console>:44)
    at $line3.$read$.<init>(<console>:48)
    at $line3.$read$.<clinit>(<console>)
    at $line3.$eval$.$print$lzycompute(<console>:7)
    at $line3.$eval$.$print(<console>:6)
    at $line3.$eval.$print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
    at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
    at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
    at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
    at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
    at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
    at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
    at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
    at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:105)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
    at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
    at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
    at org.apache.spark.repl.Main$.doMain(Main.scala:69)
    at org.apache.spark.repl.Main$.main(Main.scala:52)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
From the error message we can see that Spark needs some of Hadoop's libraries, which it locates via the HADOOP_HOME environment variable; since we never set it, the path in the message comes out as null\bin\winutils.exe. This does not mean we have to install Hadoop itself: we can simply download the required winutils.exe to any location on disk, for example C:\winutils\bin\winutils.exe, and set HADOOP_HOME=C:\winutils.
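One way to set the variable is from the console, as sketched below (this assumes winutils.exe was saved under C:\winutils\bin; setting it through the System Properties dialog works just as well). Note that setx only takes effect in console windows opened afterwards:

rem Persist HADOOP_HOME for the current user; new console windows will pick it up.
setx HADOOP_HOME C:\winutils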
Now run spark-shell again, and a new error appears:
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
    at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
    at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
    at org.apache.spark.repl.Main$.createSparkSession(Main.scala:96)
    ... 47 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
    ... 58 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
    at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
    at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
    at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
    at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
    at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
    at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
    at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
    ... 63 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
    ... 71 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
    at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66)
    ... 76 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
    ... 84 more
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
    ... 85 more
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql
              ^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
The error message says that the temporary directory /tmp/hive is not writable:
The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
So we need to fix the permissions on E:\tmp\hive (I ran the spark-shell command from drive E:; if you run it from another drive, substitute that drive letter in front of \tmp\hive). Run the following command:
E:\>C:\winutils\bin\winutils.exe chmod 777 E:\tmp\hive
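To confirm the change took effect, winutils also provides an ls subcommand that shows the permissions as Hadoop sees them (a quick check under the same path assumptions as above):

rem List the directory's POSIX-style permission bits; they should now read drwxrwxrwx.
C:\winutils\bin\winutils.exe ls E:\tmp\hive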
Run spark-shell again and Spark starts successfully. You can now open the Spark UI in a browser (by default at http://localhost:4040).
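As a final sanity check, it is worth running a small job at the scala> prompt; this also gives the Spark UI something to display. A minimal sketch (any small job will do):

val rdd = sc.parallelize(1 to 100)  // distribute the numbers 1..100 as an RDD
rdd.reduce(_ + _)                   // sum them; the REPL should print res0: Int = 5050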