To run the Spark service, Spark needs to be deployed on every node. You can finish the configuration on the master node first, then scp the Spark directory straight to the other nodes.
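A minimal sketch of that copy step, assuming Spark is unpacked under /opt/spark and the workers are named slave1 and slave2 (the path and hostnames here are illustrative):

# run on the master node once the configuration below is done
scp -r /opt/spark slave1:/opt/
scp -r /opt/spark slave2:/opt/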
Key configuration
Edit the conf/spark-env.sh file:
export JAVA_HOME=/usr/java/latest
export HADOOP_CONF_DIR=/opt/hadoop-2.4.1/etc/hadoop/
export SPARK_MASTER_IP=master
These are the required settings; for detailed descriptions of every option, see the official documentation.
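Note that a fresh Spark distribution ships only a template for this file, so you may need to create conf/spark-env.sh from it first:

cp conf/spark-env.sh.template conf/spark-env.sh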
Edit conf/slaves, the slave-node list: just add the hostname of each worker node, one per line.
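For example, with the two illustrative workers from above, conf/slaves would simply contain:

slave1
slave2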
Starting the cluster
sbin/start-all.sh
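This starts a Master on the local machine and a Worker on every host listed in conf/slaves (the script logs into each worker over SSH, so passwordless SSH from the master is assumed). The matching shutdown script is:

sbin/stop-all.sh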
jps
Check the local Java processes: the master node should show a Master process, and each worker node should show a Worker process.
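On the master, for instance, jps output should include a Master entry, something like (PIDs will differ):

3012 Master
3107 Jps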
Web UI address: http://master:8080
To test Spark, run bin/run-example SparkPi. If everything is working, you should see results like the following:
...
14/11/11 22:11:25 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 4052 ms on localhost (1/2)
14/11/11 22:11:25 INFO scheduler.DAGScheduler: Stage 0 (reduce at SparkPi.scala:35) finished in 4.130 s
14/11/11 22:11:25 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 69 ms on localhost (2/2)
14/11/11 22:11:25 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/11/11 22:11:25 INFO spark.SparkContext: Job finished: reduce at SparkPi.scala:35, took 4.613856515 s
Pi is roughly 3.1431
14/11/11 22:11:26 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
14/11/11 22:11:26 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
14/11/11 22:11:26 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
14/11/11 22:11:26 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
14/11/11 22:11:26 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
...
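Note that this run executed locally (the tasks above finished on localhost). To submit the example to the standalone cluster instead, you can set the MASTER environment variable, e.g. with the default standalone port:

MASTER=spark://master:7077 bin/run-example SparkPi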