Fixing "Input path does not exist" when Spark reads a Hive partitioned table

Posted by StanZhai on 2016-12-16

In what follows, assume the failing table is named test.

Symptom

Hive reads the table normally and reports no error, but reading it from Spark fails with:

org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://testcluster/user/hive/warehouse/....

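Run desc formatted test; in Hive to find the table's HDFS storage path. Then run hdfs dfs -ls <your table path>, and you will see that the path named in the error really does not exist.

That explains why the Spark read fails: the path it is looking for is in fact missing.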

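Root cause analysis

Run show partitions test in Hive to list all partitions of the test table.

Going through the partitions that had been added earlier, one of them turns out to point at an HDFS directory that no longer exists (it was deleted by hand, usually as part of cleaning up historical data). The partition itself, however, was never dropped with a statement such as alter table test drop partition (p='xxx'), because Hive can still read the table even when the partition is left in place.

Spark is less forgiving. When it loads a Hive partitioned table, it loads data for the partitions listed by show partitions, and it fails as soon as one of those directories is missing.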
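To make the mismatch concrete, here is a minimal sketch that compares the partition list with the directories that actually exist. It assumes you have already collected the output of show partitions test and of hdfs dfs -ls yourself, that every partition lives at the default location <table path>/<partition spec>, and that the directory paths are written in the same form as the table path. The function name find_stale_partitions, the sample partition values, and the table path are hypothetical, chosen only for illustration.

```python
def find_stale_partitions(partitions, existing_dirs, table_path):
    """Return the partition specs whose expected directory is missing.

    partitions    -- specs as printed by `show partitions`, e.g. "p=20161214"
    existing_dirs -- directory paths as listed by `hdfs dfs -ls <table path>`
    table_path    -- the table's storage path (assumed default partition layout)
    """
    table_path = table_path.rstrip("/")
    existing = {d.rstrip("/") for d in existing_dirs}
    return [p for p in partitions if f"{table_path}/{p}" not in existing]


# Hypothetical sample data: the directory for p=20161214 was deleted by hand.
partitions = ["p=20161214", "p=20161215", "p=20161216"]
existing_dirs = [
    "/user/hive/warehouse/test/p=20161215",
    "/user/hive/warehouse/test/p=20161216/",
]

stale = find_stale_partitions(partitions, existing_dirs, "/user/hive/warehouse/test")
print(stale)  # ['p=20161214']
```

Any partition returned here is still registered in the metastore but has no directory behind it, which is exactly the situation that trips up Spark. Partitions with a custom location are not covered by this sketch.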

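Solution

Whenever you delete a partition directory, also run alter table test drop partition (p='xxx') to drop the corresponding partition.

If the partition directory has already been deleted, you still need to run the command above.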
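When several directories have been removed, it can help to generate the statements rather than type them one by one. The following is a rough sketch under a few assumptions: the partition specs are in the key=value form printed by show partitions (multi-level specs separated by /), the values contain no quotes or other special characters, and quoting every value as a string is acceptable for your partition column types. The helper name drop_partition_statements and the sample values are hypothetical. The if exists clause is an addition to the original command so that a rerun does not fail on a partition that is already gone.

```python
def drop_partition_statements(table, partition_specs):
    """Build one `alter table ... drop partition` statement per partition spec."""
    statements = []
    for spec in partition_specs:
        pairs = []
        # A multi-level spec looks like "dt=2016-12-14/hour=01".
        for kv in spec.strip("/").split("/"):
            key, _, value = kv.partition("=")
            pairs.append(f"{key}='{value}'")
        statements.append(
            f"alter table {table} drop if exists partition ({', '.join(pairs)});"
        )
    return statements


# Hypothetical specs whose directories were already deleted.
stmts = drop_partition_statements("test", ["p=20161214", "dt=2016-12-14/hour=01"])
for s in stmts:
    print(s)
```

The sketch only prints the statements. Review them, then run them in Hive yourself.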
