Caused by: org.apache.parquet.io.ParquetDecodingException: Can't read value in column [result, label

Posted by 晨本布衣 on 2020-12-22

Writing the data to HDFS works without any problem, but reading it back fails with this error:

Caused by: org.apache.parquet.io.ParquetDecodingException: Can't read value in column [result, label_id] INT64 at value 2678 out of 2678, 2678 out of 2678 in currentPage. repetition level: 1, definition level: 1

and this one:

java.lang.IllegalArgumentException: Reading past RLE/BitPacking stream.

 

Solution: add one configuration setting when writing the data:

conf.set("spark.sql.parquet.writeLegacyFormat", "true");

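When this is set to true, Spark (whether driven from Java or another language) writes Parquet data using the same conventions as Hive, which is the legacy layout used by Spark 1.4 and earlier.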
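
As a rough sketch of where this setting goes in a complete job, the Java program below sets it on the session before the write runs. It assumes Spark 2.x or later (the SparkSession API) and that `conf` in the line above is the job's Spark configuration. The class name, the JSON input, and both paths are hypothetical and only there for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

// Hypothetical job: the class name, input format, and paths are illustrative only.
public class WriteLegacyParquet {
    public static void main(String[] args) {
        String inputPath = args[0];   // e.g. a JSON dataset on HDFS (assumption)
        String outputPath = args[1];  // target directory for the Parquet files

        SparkSession spark = SparkSession.builder()
                .appName("write-legacy-parquet")
                // The setting from the post; it must be in effect before the write runs.
                .config("spark.sql.parquet.writeLegacyFormat", "true")
                .getOrCreate();

        // Read the source data, then write it out as Parquet in the legacy layout.
        Dataset<Row> df = spark.read().json(inputPath);
        df.write().mode(SaveMode.Overwrite).parquet(outputPath);

        spark.stop();
    }
}
```

The setting only affects how files are written, so files produced before the change would most likely need to be rewritten with it enabled before they can be read without the error.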

 
