spark讀取hive異常,java.lang.NoClassDefFoundError: org/apache/tez/dag/api/SessionNotRunning

遠方的眺望發表於2020-11-03

環境:
HDP2.6.4
Spark2.2.0
Hive1.2.1

背景:
使用spark程式碼讀取hive表資料,寫入clickhouse表,相同的程式碼在其他HDP叢集正常使用,更換新環境後,報以下異常:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/tez/dag/api/SessionNotRunning
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:529)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:133)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:904)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.tez.dag.api.SessionNotRunning
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 13 more

這個異常十分不理解,因為程式碼沒變,純粹的環境問題。
最終確認異常點,是hive-site.xml檔案的問題。

解決方案:
將HDP中spark2/conf目錄下的hive-site.xml檔案替換掉專案conf目錄下的hive-site.xml檔案,保證spark2執行時讀取的hive-site.xml和環境中spark2的檔案保持一致,即可。
cp /etc/spark2/conf/hive-site.xml ${project}/conf/dev/
${project}指專案所在路徑。

cloudera解決方法

相關文章