Spark version compatibility table
Name | Version | Description
Spark | spark-2.3.0-bin-hadoop2.7 | Spark
mongo-java-driver-3.5.0.jar | 3.5 | MongoDB Java driver
mongo-spark-connector_2.11-2.3.1.jar | 2.3 | MongoDB Spark connector
Spark and MongoDB driver versions do not match, causing errors
Spark must use a mongo-spark-connector version that matches both the Spark and MongoDB versions; a minimal sketch follows.
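A minimal PySpark sketch of pinning a matching connector when the session is built; the MongoDB URI and the database/collection names are placeholders, and spark.jars.packages assumes the connector can be fetched from Maven (alternatively, pass the jars from the table above with --jars).

from pyspark.sql import SparkSession

# Sketch: pin a mongo-spark-connector build that matches Spark 2.3.x / Scala 2.11.
# The URI, database and collection below are placeholders, not values from this setup.
spark = (SparkSession.builder
         .appName("mongo-spark-version-check")
         .config("spark.jars.packages",
                 "org.mongodb.spark:mongo-spark-connector_2.11:2.3.1")
         .config("spark.mongodb.input.uri",
                 "mongodb://localhost:27017/testdb.testcoll")
         .getOrCreate())

# A read through the connector fails quickly if the connector and Spark versions disagree.
df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()
df.printSchema()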
The Python version on the Spark driver node differs from the version on the worker nodes
Exception: Python in worker has different version 2.7 than that in driver 3.5, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
Fix: set the PYSPARK_PYTHON=/paic/spark/home/csmsopr/anaconda3/bin/python environment variable (sketch below).
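A minimal sketch, assuming the anaconda3 path above exists on every worker node. PYSPARK_PYTHON must be set before the SparkContext is created; PYSPARK_DRIVER_PYTHON only takes effect when exported before the driver script itself starts, so exporting both in the shell, the Docker image, or spark-env.sh is the more robust option.

import os

# Sketch: make the workers use the same interpreter as the driver.
# The anaconda3 path is taken from the note above and must exist on all nodes.
os.environ["PYSPARK_PYTHON"] = "/paic/spark/home/csmsopr/anaconda3/bin/python"

from pyspark import SparkContext

sc = SparkContext(appName="python-version-check")
# Fails with the "different version" error above if driver and workers still differ.
print(sc.parallelize(range(10)).sum())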
Hadoop directory permission problem
Failure log:
2018-11-12 16:15:38 INFO SecurityManager:54 - Changing view acls to: csmsopr
2018-11-12 16:15:38 INFO SecurityManager:54 - Changing modify acls to: csmsopr
2018-11-12 16:15:38 INFO SecurityManager:54 - Changing view acls groups to:
2018-11-12 16:15:38 INFO SecurityManager:54 - Changing modify acls groups to:
2018-11-12 16:15:38 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(csmsopr); groups with view permissions: Set(); users with modify permissions: Set(csmsopr); groups with modify permissions: Set()
2018-11-12 16:15:38 INFO Client:54 - Submitting application application_1541659438825_0044 to ResourceManager
Traceback (most recent call last):
File "/lzp/submit_task.py", line 9, in <module>
sc = SparkContext()
File "/lzp/spark-2.3.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 118, in __init__
File "/lzp/spark-2.3.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 180, in _do_init
File "/lzp/spark-2.3.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 290, in _initialize_context
File "/lzp/spark-2.3.2-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1525, in __call__
File "/lzp/spark-2.3.2-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user/root/.sparkStaging/application_1541659438825_0024":csmsopr:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
Fix
http://www.huqiwen.com/2013/07/18/hdfs-permission-denied/
In the end, there are roughly three ways to fix this:
1. Add HADOOP_USER_NAME to the system environment variables or the Java JVM options; its exact value depends on your setup and should be the Linux user name under which jobs run on Hadoop. (Restart Eclipse after changing it, otherwise it may not take effect.) See the sketch after this list.
2. Change the current system account to hadoop.
3. Use the HDFS command line to change the permissions of the target directory: hadoop fs -chmod 777 /user, where /user is the path the file is uploaded to and may differ from case to case. For example, if the upload path is hdfs://namenode/user/xxx.doc this command is sufficient; if it is hdfs://namenode/java/xxx.doc, run hadoop fs -chmod 777 /java (the /java directory must be created in HDFS first) or hadoop fs -chmod 777 / to adjust permissions on the root directory.
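A minimal PySpark sketch of option 1, assuming the job is launched as a plain Python script so the gateway JVM inherits the variable; when launching through spark-submit or Docker, export HADOOP_USER_NAME in the shell or image instead (as in the sections below). The user name csmsopr is the directory owner from the log above.

import os

# Sketch of option 1: run HDFS operations as the user that owns the target
# directory, so .sparkStaging can be written. "csmsopr" comes from the log above.
os.environ["HADOOP_USER_NAME"] = "csmsopr"

from pyspark import SparkContext

# The variable must be in the environment before the SparkContext (and its JVM) starts.
sc = SparkContext(appName="hdfs-permission-check")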
Separating Hadoop configuration for test and production environments
The Hadoop configuration replaces the original one. In Docker, how should the Hadoop configuration distinguish test from production, and can it be driven by an environment variable?
Configure it with an environment variable:
Each environment points to a different configuration directory, e.g.
HADOOP_CONF_DIR=/app/hadoop_config/prd/
Solved by configuring it through an environment variable; see the sketch below.
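A small sketch of the idea; APP_ENV and the test directory name are assumptions (only /app/hadoop_config/prd/ appears in these notes), and HADOOP_CONF_DIR must be visible to the process that launches Spark, so in Docker it is usually set with ENV rather than inside the script.

import os

# Sketch: pick the Hadoop config directory from an environment flag so the same
# image works for both environments. APP_ENV and "tst" are hypothetical names.
env = os.environ.get("APP_ENV", "tst")
os.environ["HADOOP_CONF_DIR"] = "/app/hadoop_config/{}/".format(env)
print("Using Hadoop config from", os.environ["HADOOP_CONF_DIR"])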
Spark cluster: job submitted under a different account
The client account that submits the job differs from the cluster account; this is also solved with an environment variable.
There is no need to switch to the csmsopr account; setting ENV HADOOP_USER_NAME="prdopr" (e.g. in the Dockerfile) is enough.
Spark: insufficient disk space
https://www.cnblogs.com/itboys/p/6021838.html
2018-12-19 13:40:49,848 INFO 2018-12-19 13:40:49 WARN Client:87 - Failed to cleanup staging dir hdfs://governor/user/csmsopr/.sparkStaging/application_1545009795494_0018
2018-12-19 13:40:49,848 INFO org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot delete /user/csmsopr/.sparkStaging/application_1545009795494_0018. Name node is in safe mode.
2018-12-19 13:40:49,848 INFO Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
From the error above, the cause is that cluster resources were low, so HDFS's self-protection put the NameNode into safe mode. Running "hdfs dfsadmin -safemode leave" brought the cluster back to a usable state, but submitting to the cluster still produced the same error.
Further research pointed to insufficient disk space on the nodes, so df -hl was used to check disk usage across the cluster.
The output showed that disk usage was already at 100%.
du -sh /* was then used to find which large files were taking up the space.
After moving those large files elsewhere and resubmitting the job, the error was gone.
Spark No space left on device
Set the temporary data directory to another location; a PySpark sketch follows the link below.
Spark: java.io.IOException: No space left on device
SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark,/mnt2/spark -Dhadoop.tmp.dir=/mnt/ephemeral-hdfs"
export SPARK_JAVA_OPTS
Link:
https://stackoverflow.com/questions/30162845/spark-java-io-ioexception-no-space-left-on-device
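The SPARK_JAVA_OPTS approach above comes from an old Stack Overflow answer; in Spark 2.x the same idea can be expressed as spark.local.dir on the SparkConf. A sketch follows; the /mnt paths are the example paths from the answer, not paths from this environment, and on YARN the local directories come from the NodeManager configuration instead.

from pyspark import SparkConf, SparkContext

# Sketch: point Spark's shuffle/spill scratch space at volumes with free space
# instead of the default /tmp. The /mnt paths are the example paths from the answer.
conf = (SparkConf()
        .setAppName("local-dir-example")
        .set("spark.local.dir", "/mnt/spark,/mnt2/spark"))

sc = SparkContext(conf=conf)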