Spark Errors (2): The Official Spark Streaming WordCount Example Runs Abnormally

Posted by 桃花惜春風 on 2018-09-13

The official Spark Streaming examples:
https://github.com/apache/spark/tree/master/examples

This post uses Kafka as the Spark input source.
Running the example produced the following log:

18/09/12 11:15:28 INFO JobScheduler: Added jobs for time 1536722117000 ms
18/09/12 11:15:28 INFO JobScheduler: Added jobs for time 1536722118000 ms
18/09/12 11:15:28 INFO JobScheduler: Added jobs for time 1536722119000 ms
18/09/12 11:15:28 INFO JobScheduler: Added jobs for time 1536722120000 ms
18/09/12 11:15:28 INFO JobScheduler: Added jobs for time 1536722121000 ms
18/09/12 11:15:28 INFO JobScheduler: Added jobs for time 1536722122000 ms
18/09/12 11:15:29 INFO JobScheduler: Added jobs for time 1536722123000 ms
18/09/12 11:15:29 INFO JobScheduler: Added jobs for time 1536722124000 ms
18/09/12 11:15:29 INFO JobScheduler: Added jobs for time 1536722125000 ms
18/09/12 11:15:29 INFO JobScheduler: Added jobs for time 1536722126000 ms
18/09/12 11:15:29 INFO JobScheduler: Added jobs for time 1536722127000 ms
18/09/12 11:15:29 INFO JobScheduler: Added jobs for time 1536722128000 ms
18/09/12 11:15:29 INFO JobScheduler: Added jobs for time 1536722129000 ms
18/09/12 11:15:30 INFO JobScheduler: Added jobs for time 1536722130000 ms
18/09/12 11:15:31 INFO JobScheduler: Added jobs for time 1536722131000 ms
18/09/12 11:15:32 INFO JobScheduler: Added jobs for time 1536722132000 ms
18/09/12 11:15:33 INFO JobScheduler: Added jobs for time 1536722133000 ms
18/09/12 11:15:34 INFO JobScheduler: Added jobs for time 1536722134000 ms
18/09/12 11:15:35 INFO JobScheduler: Added jobs for time 1536722135000 ms
18/09/12 11:15:36 INFO JobScheduler: Added jobs for time 1536722136000 ms
18/09/12 11:15:37 INFO JobScheduler: Added jobs for time 1536722137000 ms
18/09/12 11:15:38 INFO JobScheduler: Added jobs for time 1536722138000 ms
18/09/12 11:15:39 INFO JobScheduler: Added jobs for time 1536722139000 ms
18/09/12 11:15:40 INFO JobScheduler: Added jobs for time 1536722140000 ms
18/09/12 11:15:41 INFO JobScheduler: Added jobs for time 1536722141000 ms
18/09/12 11:15:42 INFO JobScheduler: Added jobs for time 1536722142000 ms
18/09/12 11:15:43 INFO JobScheduler: Added jobs for time 1536722143000 ms
18/09/12 11:15:44 INFO JobScheduler: Added jobs for time 1536722144000 ms
18/09/12 11:15:45 INFO JobScheduler: Added jobs for time 1536722145000 ms
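Every line in the log above is an `Added jobs` entry: batches keep being scheduled, and nothing else shows up. The following is a rough sketch, assuming the driver log is available as a list of text lines, of how that pattern could be flagged automatically. The function names are hypothetical, and a log containing only these entries is a hint to investigate, not proof of a problem.

```python
import re

# Matches the scheduler lines shown in the log above and captures the batch time.
ADDED_JOBS = re.compile(r"INFO JobScheduler: Added jobs for time (\d+) ms")


def summarize_scheduler_log(lines):
    """Return (added, other): counts of 'Added jobs' lines and of all other non-empty lines."""
    added = 0
    other = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if ADDED_JOBS.search(line):
            added += 1
        else:
            other += 1
    return added, other


def only_queuing(lines):
    """True if the log contains 'Added jobs' entries and nothing else."""
    added, other = summarize_scheduler_log(lines)
    return added > 0 and other == 0
```

Feeding it the lines above would report 29 `Added jobs` entries and no other lines, which matches the symptom described here.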

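This is clearly not normal output. After verifying that consumption on the Kafka side was working, I concluded that the problem was in Spark. In the end I found the relevant passage in the official documentation.
In short: each receiver occupies one thread (or core) for as long as the application runs. When running locally, do not set the master to local or local[1]; set it to local[n], where n is greater than the number of receivers.
When running on a cluster, the number of cores allocated to Spark Streaming must be greater than the number of receivers. Otherwise Spark can only receive data and cannot process it.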
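The local-mode rule can be written as a small check. This is only a sketch: `local_cores` and `can_process` are hypothetical helper names, not part of Spark, and the parsing covers only the `local` and `local[n]` forms mentioned here. Other master URLs, including `local[*]`, are rejected instead of being guessed at.

```python
import re


def local_cores(master):
    """Return the thread count implied by 'local' or 'local[n]', else None."""
    if master == "local":
        return 1
    match = re.fullmatch(r"local\[(\d+)\]", master)
    return int(match.group(1)) if match else None


def can_process(master, num_receivers):
    """True if at least one thread remains for processing after the receivers take theirs."""
    cores = local_cores(master)
    if cores is None:
        raise ValueError(f"cannot determine core count from master {master!r}")
    return cores > num_receivers


print(can_process("local", 1))     # False: the only thread goes to the receiver
print(can_process("local[1]", 1))  # False: same situation
print(can_process("local[2]", 1))  # True: one thread is left for processing
```

With a single receiver, the smallest local setting that passes this check is `local[2]`, set for example through `SparkConf.setMaster` or the `--master` option of `spark-submit`.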


——————————————————————————————————
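Author: 桃花惜春風
Please credit the source when reposting. Original article:
https://blog.csdn.net/xiaoyu_BD/article/details/82688001
If you found this post helpful, your support is my biggest motivation to keep writing. Thank you!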
