Spark exception: Trying to write more fields than contained in row
Converting JSON records to Rows and writing them out as Parquet:

    for type_name in types.value:
        print(type_name)
        type_data_set = lines.filter(lambda line: line['type'] == type_name)
        type_row = type_data_set.map(lambda line: Row(**line))
        schema_row = self.sqlContext.createDataFrame(type_row)
        schema_row.write.mode('overwrite').parquet(
            'hdfs://ip:port/parquet/%s/year=%s/month=%s/day=%s/hour=%s' %
            (type_name, self.year, self.month, self.day, self.hour))
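The root cause is that `Row(**line)` keeps only the keys present in each individual JSON record, while `createDataFrame` infers one schema for the whole set; rows whose arity differs from the inferred schema then break the Parquet writer. A minimal sketch in plain Python (no Spark needed), using the two sample "zi" records shown later in this post:

```python
# Two records of the same type carry different key sets, so Row(**line)
# would produce rows of different arity -- the "15 > 12" in the stack trace.
full = {"channel": 3, "containerId": "16", "sendUserId": "2611",
        "objectName": "RC:TxtMsg", "count": 49, "type": "zi", "uuid": "-1",
        "appId": "100000", "nodeId": "GRM_NODE_0", "userId": "2611",
        "time": 1465205114814, "ipAddress": "0", "sdkVersion": "2.6.2",
        "osName": "Android", "deviceId": "0"}
sparse = {"channel": 0, "count": 0, "type": "zi", "uuid": "",
          "appId": "100000", "nodeId": "MSG_NODE_2", "userId": "2626",
          "time": 1465206091272, "ipAddress": "0", "sdkVersion": "2.6.1",
          "osName": "0", "deviceId": "1"}
print(len(full), len(sparse))  # 15 12
```

Whichever record the schema happens to be inferred from, rows built from the other one will not match it.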
The exception:

    Caused by: java.lang.IndexOutOfBoundsException: Trying to write more fields than contained in row (15 > 12)
        at org.apache.spark.sql.execution.datasources.parquet.MutableRowWriteSupport.write(ParquetTableSupport.scala:261)
        at org.apache.spark.sql.execution.datasources.parquet.MutableRowWriteSupport.write(ParquetTableSupport.scala:257)
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:121)
        at org.apache.parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:123)
        at org.apache.parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:42)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.writeInternal(ParquetRelation.scala:99)
        at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:242)
        ... 8 more
Two records of the same "zi" type: one has 12 fields, the other 15.

    {"time":"2016-06-06 17:25:14","message":{"channel":3,"containerId":"16","sendUserId":"2611","objectName":"RC:TxtMsg","count":49,"type":"zi","uuid":"-1","appId":"100000","nodeId":"GRM_NODE_0","userId":"2611","time":1465205114814,"ipAddress":"0","sdkVersion":"2.6.2","osName":"Android","deviceId":"0"}}
    {"time":"2016-06-06 17:41:31","message":{"channel":0,"count":0,"type":"zi","uuid":"","appId":"100000","nodeId":"MSG_NODE_2","userId":"2626","time":1465206091272,"ipAddress":"0","sdkVersion":"2.6.1","osName":"0","deviceId":"1"}}
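One possible fix (a sketch, not the original author's code) is to pad every record of a given type to one fixed key set before building Rows, so every row has the same number of fields. `FIELD_SET` below is an assumed, hand-picked union of the keys seen in the two "zi" samples above; an alternative is to pass an explicit `StructType` schema to `createDataFrame` instead of relying on inference.

```python
# Assumed union of the keys observed for type "zi" (hypothetical helper,
# not from the original post).
FIELD_SET = ["channel", "containerId", "sendUserId", "objectName", "count",
             "type", "uuid", "appId", "nodeId", "userId", "time",
             "ipAddress", "sdkVersion", "osName", "deviceId"]

def normalize(line, fields=FIELD_SET):
    """Return a dict with exactly `fields` keys, filling missing ones with None."""
    return {f: line.get(f) for f in fields}

# In the Spark job this would replace Row(**line):
#   type_row = type_data_set.map(lambda line: Row(**normalize(line)))
sparse = {"channel": 0, "count": 0, "type": "zi", "uuid": "",
          "appId": "100000", "nodeId": "MSG_NODE_2", "userId": "2626",
          "time": 1465206091272, "ipAddress": "0", "sdkVersion": "2.6.1",
          "osName": "0", "deviceId": "1"}
print(len(normalize(sparse)))  # 15 -- now matches the widest record
```

With every record padded to the same key set, the inferred schema and the written rows always agree on field count, and the `IndexOutOfBoundsException` no longer occurs.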
From the "ITPUB Blog", link: http://blog.itpub.net/29754888/viewspace-2119617/. Please credit the source when reprinting; otherwise legal liability may be pursued.