Spark Streaming: Production and Consumption Flow Walkthrough
An overview of the Spark Streaming flow
According to the original Spark Streaming design document (https://docs.google.com/document/d/1vTCB5qVfyxQPlHuv8rit9-zjdttlgaSrMgfCDQlCJIM/edit#), the initial flow was designed as follows:
1. The Receiver hands each block to the ReceivedBlockHandler;
2. The ReceivedBlockHandler stores the block in memory (no redundancy);
3. The Receiver reports the block to the driver;
4. The Receiver marks the block as received;
5. The driver builds HDFSBackedBlockRDDs from the block info;
6. Scheduling is driven by the block-location information in the BlockManagerMaster;
7. Checkpoint information is stored on HDFS.
The current stable release (2.1.0) adds WAL (write-ahead log) support in several places. The flow changes as follows (a minimal configuration sketch for enabling this path follows the list):

1. The Receiver hands each block to the ReceivedBlockHandler;
2. The ReceivedBlockHandler stores the block in memory (via the BlockManager) plus the WAL (no redundancy);
3. The Receiver sends the blockInfo to the driver through the trackerEndpoint;
4. The driver writes the blockInfo to its own WAL;
5. The Receiver marks the block as received;
6. The driver builds HDFSBackedBlockRDDs from the block info (this part has changed as well);
7. Scheduling is driven by the block-location information in the BlockManagerMaster;
8. Checkpoint information is stored on HDFS.
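This WAL path is opt-in. As a minimal sketch (the app name, port, and checkpoint path below are placeholders): setting spark.streaming.receiver.writeAheadLog.enable together with a checkpoint directory is what activates the behavior described above, since the receiver-side WAL files are written under the checkpoint directory.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WalEnabledApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("wal-demo")
      // Opt in to the receiver-side write-ahead log described above
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")
    val ssc = new StreamingContext(conf, Seconds(10))
    // The WAL lives under the checkpoint directory, so checkpointing must be on
    ssc.checkpoint("hdfs:///tmp/streaming-checkpoint") // placeholder path
    // With the WAL providing durability, the storage level can drop replication
    val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)
    lines.count().print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```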
Production phase
ReceiverSupervisorImpl
ReceiverSupervisorImpl stores the collected data through pushAndReportBlock:
```scala
/** Store block and report it to driver */
def pushAndReportBlock(
    receivedBlock: ReceivedBlock,
    metadataOption: Option[Any],
    blockIdOption: Option[StreamBlockId]
  ) {
  // Construct the blockId
  val blockId = blockIdOption.getOrElse(nextBlockId)
  val time = System.currentTimeMillis
  // Call receivedBlockHandler.storeBlock (shown below) to store the block
  // in both the BlockManager and the WAL
  val blockStoreResult = receivedBlockHandler.storeBlock(blockId, receivedBlock)
  logDebug(s"Pushed block $blockId in ${(System.currentTimeMillis - time)} ms")
  val numRecords = blockStoreResult.numRecords
  // Build the blockInfo from the block's metadata
  val blockInfo = ReceivedBlockInfo(streamId, numRecords, metadataOption, blockStoreResult)
  // Send the blockInfo to the trackerEndpoint on the driver side
  trackerEndpoint.askWithRetry[Boolean](AddBlock(blockInfo))
  logDebug(s"Reported block $blockId")
}
```
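For orientation, this method is the sink for a user-defined receiver's store(...) calls: the iterator and ByteBuffer variants are pushed as whole blocks, while single-record store() goes through the block generator first. A minimal, hypothetical receiver whose data ends up in pushAndReportBlock:

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical receiver: each store(Iterator(...)) below pushes one whole
// block, which the supervisor stores and reports via pushAndReportBlock.
class ConstantReceiver extends Receiver[String](StorageLevel.MEMORY_AND_DISK_SER) {
  override def onStart(): Unit = {
    new Thread("constant-receiver") {
      override def run(): Unit = {
        while (!isStopped()) {
          store(Iterator("a", "b", "c")) // one block per call
          Thread.sleep(1000)
        }
      }
    }.start()
  }
  override def onStop(): Unit = () // the polling thread exits via isStopped()
}
```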
receivedBlockHandler
The receivedBlockHandler implements the WAL functionality on the receiver side. Its main job is to store each received block in the BlockManager and on HDFS in parallel:
```scala
/**
 * This implementation stores the block into the block manager as well as a write ahead log.
 * It does this in parallel, using Scala Futures, and returns only after the block has
 * been stored in both places.
 */
// Using Scala Futures, the ReceivedBlock is stored in the BlockManager and on HDFS in parallel
def storeBlock(blockId: StreamBlockId, block: ReceivedBlock): ReceivedBlockStoreResult = {

  var numRecords = Option.empty[Long]
  // Serialize the block so that it can be inserted into both
  // Step 1: serialize the block
  val serializedBlock = block match {
    case ArrayBufferBlock(arrayBuffer) =>
      numRecords = Some(arrayBuffer.size.toLong)
      serializerManager.dataSerialize(blockId, arrayBuffer.iterator)
    case IteratorBlock(iterator) =>
      val countIterator = new CountingIterator(iterator)
      val serializedBlock = serializerManager.dataSerialize(blockId, countIterator)
      numRecords = countIterator.count
      serializedBlock
    case ByteBufferBlock(byteBuffer) =>
      new ChunkedByteBuffer(byteBuffer.duplicate())
    case _ =>
      throw new Exception(s"Could not push $blockId to block manager, unexpected block type")
  }

  // Store the block in block manager
  // Future that stores the block in the BlockManager
  val storeInBlockManagerFuture = Future {
    val putSucceeded = blockManager.putBytes(
      blockId, serializedBlock, effectiveStorageLevel, tellMaster = true)
    if (!putSucceeded) {
      throw new SparkException(
        s"Could not store $blockId to block manager with storage level $storageLevel")
    }
  }

  // Store the block in write ahead log
  // Future that stores the block in the WAL
  val storeInWriteAheadLogFuture = Future {
    // Once write() returns, the block is guaranteed to have been written to HDFS
    writeAheadLog.write(serializedBlock.toByteBuffer, clock.getTimeMillis())
  }

  // Combine the futures, wait for both to complete, and return the write ahead log record handle
  // Note: zip lets the two futures above run in parallel
  val combinedFuture = storeInBlockManagerFuture.zip(storeInWriteAheadLogFuture).map(_._2)
  val walRecordHandle = ThreadUtils.awaitResult(combinedFuture, blockStoreTimeout)
  WriteAheadLogBasedStoreResult(blockId, numRecords, walRecordHandle)
}
```
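The zip at the end is worth dwelling on: zipping the two futures produces a combined future that completes only when both writes have finished, and fails if either write fails, so a block can never be acknowledged with only one of its two copies in place. A self-contained sketch of the pattern with toy values (not Spark's types):

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ZipPatternDemo {
  implicit val ec: ExecutionContext = ExecutionContext.global

  def main(args: Array[String]): Unit = {
    // Two independent writes run in parallel...
    val storeInMemory = Future { /* blockManager.putBytes(...) */ "memory-ok" }
    val storeInWal    = Future { /* writeAheadLog.write(...)   */ "wal-handle" }
    // ...zip pairs their results; map(_._2) keeps only the WAL record handle.
    // If either future fails, the combined future fails as well.
    val combined = storeInMemory.zip(storeInWal).map(_._2)
    val walHandle = Await.result(combined, 30.seconds)
    println(s"both writes finished, WAL handle: $walHandle")
  }
}
```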
About the trackerEndpoint
The receiver and the driver communicate through the trackerEndpoint. The handling of the AddBlock message above is implemented in the ReceiverTracker class, as follows:
```scala
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  // Remote messages
  case RegisterReceiver(streamId, typ, host, executorId, receiverEndpoint) =>
    val successful =
      registerReceiver(streamId, typ, host, executorId, receiverEndpoint, context.senderAddress)
    context.reply(successful)
  case AddBlock(receivedBlockInfo) =>
    if (WriteAheadLogUtils.isBatchingEnabled(ssc.conf, isDriver = true)) {
      // Ultimately delegates to receivedBlockTracker.addBlock; see below
      walBatchingThreadPool.execute(new Runnable {
        override def run(): Unit = Utils.tryLogNonFatalError {
          if (active) {
            context.reply(addBlock(receivedBlockInfo))
          } else {
            throw new IllegalStateException("ReceiverTracker RpcEndpoint shut down.")
          }
        }
      })
    } else {
      context.reply(addBlock(receivedBlockInfo))
    }
  case DeregisterReceiver(streamId, message, error) =>
    deregisterReceiver(streamId, message, error)
    context.reply(true)
  // Local messages
  case AllReceiverIds =>
    context.reply(receiverTrackingInfos.filter(_._2.state != ReceiverState.INACTIVE).keys.toSeq)
  case GetAllReceiverInfo =>
    context.reply(receiverTrackingInfos.toMap)
  case StopAllReceivers =>
    assert(isTrackerStopping || isTrackerStopped)
    stopReceivers()
    context.reply(true)
}

/** Add new blocks for the given stream */
private def addBlock(receivedBlockInfo: ReceivedBlockInfo): Boolean = {
  receivedBlockTracker.addBlock(receivedBlockInfo)
}

/** Add received block. This event will get written to the write ahead log (if enabled). */
// Driver-side handling of the AddBlock event
def addBlock(receivedBlockInfo: ReceivedBlockInfo): Boolean = {
  try {
    // Persist the blockInfo; writeToLog checks whether the WAL is enabled.
    // Note: this blockInfo is not the same thing as the block on the receiver
    // side -- think of it as the block's metadata, while the block holds the data.
    val writeResult = writeToLog(BlockAdditionEvent(receivedBlockInfo))
    if (writeResult) {
      synchronized {
        // Also enqueue the blockInfo so the scheduler can allocate it to a batch
        getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo
      }
      logDebug(s"Stream ${receivedBlockInfo.streamId} received " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId}")
    } else {
      logDebug(s"Failed to acknowledge stream ${receivedBlockInfo.streamId} receiving " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId} in the Write Ahead Log.")
    }
    writeResult
  } catch {
    case NonFatal(e) =>
      logError(s"Error adding block $receivedBlockInfo", e)
      false
  }
}
```
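Note the batching branch: when driver-side WAL batching is enabled (controlled by spark.streaming.driver.writeAheadLog.allowBatching), the slow log write is handed to walBatchingThreadPool and the RPC reply is sent from the worker thread, so one slow HDFS write does not stall the endpoint. A toy model of that reply-from-worker shape (all names hypothetical; the real code additionally coalesces writes in BatchedWriteAheadLog):

```scala
import java.util.concurrent.Executors

object AsyncReplySketch {
  private val walPool = Executors.newFixedThreadPool(4)

  // The RPC thread returns immediately; the receiver is acked from a worker
  // thread only after its block info has been durably logged.
  def handleAddBlock(blockInfo: String, reply: Boolean => Unit): Unit = {
    walPool.execute(new Runnable {
      override def run(): Unit = reply(writeToLog(blockInfo))
    })
  }

  private def writeToLog(blockInfo: String): Boolean = {
    Thread.sleep(10) // stand-in for a (batched) HDFS write
    true
  }
}
```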
Consumption phase
The process above is the data-collection side, driven by the receiver; the blocks it produces are consumed by Spark's scheduled jobs. The consumption logic works as follows: the JobGenerator fires once per batchTime, generates the jobs for the corresponding DStream output, and then submits them for processing.
```scala
/** Processes all events */
// When the JobGenerator starts, it starts a recurring timer that, every
// configured batchDuration, posts a GenerateJobs event to trigger DStream job generation
private val timer = new RecurringTimer(clock, ssc.graph.batchDuration.milliseconds,
  longTime => eventLoop.post(GenerateJobs(new Time(longTime))), "JobGenerator")

private def processEvent(event: JobGeneratorEvent) {
  logDebug("Got event " + event)
  event match {
    // The event loop receives the GenerateJobs event
    case GenerateJobs(time) => generateJobs(time)
    case ClearMetadata(time) => clearMetadata(time)
    case DoCheckpoint(time, clearCheckpointDataLater) =>
      doCheckpoint(time, clearCheckpointDataLater)
    case ClearCheckpointData(time) => clearCheckpointData(time)
  }
}

// This can be regarded as the core of Spark Streaming's scheduling
/** Generate jobs and perform checkpointing for the given `time`. */
private def generateJobs(time: Time) {
  // Checkpoint all RDDs marked for checkpointing to ensure their lineages are
  // truncated periodically. Otherwise, we may run into stack overflows (SPARK-6847).
  ssc.sparkContext.setLocalProperty(RDD.CHECKPOINT_ALL_MARKED_ANCESTORS, "true")
  Try {
    // Step 1: allocate the received blocks above to the batch for this time; see below
    jobScheduler.receiverTracker.allocateBlocksToBatch(time) // allocate received blocks to batch
    // Step 2: once allocatedBlocks is ready, call generateJobs to produce
    // Spark's Job instances (parameterized by time)
    graph.generateJobs(time) // generate jobs using allocated block
  } match {
    case Success(jobs) =>
      // Step 3: fetch this time's metadata from the inputInfoTracker, then actually
      // submit the jobs and start computing. (One open question: why the jobs are
      // driven through an extra inputInfoTracker rather than directly from the
      // time -> allocatedBlocks mapping built above.)
      val streamIdToInputInfos = jobScheduler.inputInfoTracker.getInfo(time)
      jobScheduler.submitJobSet(JobSet(time, jobs, streamIdToInputInfos))
    case Failure(e) =>
      jobScheduler.reportError("Error generating jobs for time " + time, e)
      PythonDStream.stopStreamingContextIfPythonProcessIsDead(e)
  }
  // Step 4: checkpoint this time to HDFS
  eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = false))
}

/** Allocate all unallocated blocks to the given batch. */
// receiverTracker.allocateBlocksToBatch() delegates to the ReceivedBlockTracker class
def allocateBlocksToBatch(batchTime: Time): Unit = {
  if (receiverInputStreams.nonEmpty) {
    receivedBlockTracker.allocateBlocksToBatch(batchTime)
  }
}

/**
 * Allocate all unallocated blocks to the given batch.
 * This event will get written to the write ahead log (if enabled).
 */
def allocateBlocksToBatch(batchTime: Time): Unit = synchronized {
  if (lastAllocatedBatchTime == null || batchTime > lastAllocatedBatchTime) {
    // streamId is assigned when the Receiver starts; getReceivedBlockQueue().dequeueAll()
    // drains the collected blockInfos, yielding (streamId, blockInfos) pairs
    val streamIdToBlocks = streamIds.map { streamId =>
      (streamId, getReceivedBlockQueue(streamId).dequeueAll(x => true))
    }.toMap
    // Wrap them in an AllocatedBlocks object for easy handling
    val allocatedBlocks = AllocatedBlocks(streamIdToBlocks)
    // Before the actual job starts, write the allocation to the WAL
    if (writeToLog(BatchAllocationEvent(batchTime, allocatedBlocks))) {
      // If the write succeeded, record the time -> allocatedBlocks pair
      // for generateJobs() to consume
      timeToAllocatedBlocks.put(batchTime, allocatedBlocks)
      lastAllocatedBatchTime = batchTime
    } else {
      // If the WAL write failed, the batch will need to be retried
      logInfo(s"Possibly processed batch $batchTime needs to be processed again in WAL recovery")
    }
  } else {
    // This situation occurs when:
    // 1. WAL is ended with BatchAllocationEvent, but without BatchCleanupEvent,
    // possibly processed batch job or half-processed batch job need to be processed again,
    // so the batchTime will be equal to lastAllocatedBatchTime.
    // 2. Slow checkpointing makes recovered batch time older than WAL recovered
    // lastAllocatedBatchTime.
    // This situation will only occurs in recovery time.
    logInfo(s"Possibly processed batch $batchTime needs to be processed again in WAL recovery")
  }
}
```
How does GenerateJobs produce the RDDs?
By definition, Spark Streaming is well known to be a batch system that discretizes a stream into DStreams.
The process behind this is quite involved, though; for a detailed walkthrough see: 原始碼解析系列/1.2%20DStream%20生成%20RDD%20例項詳解.md
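Without retracing that whole analysis, the core contract is small: for every batch time, the scheduler asks each DStream for an RDD via compute(time), and generateJobs wraps the resulting output actions into Job objects. A toy input stream illustrating the contract (the class itself is hypothetical):

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{StreamingContext, Time}
import org.apache.spark.streaming.dstream.InputDStream

// Emits one small RDD per batch interval: compute(validTime) is exactly the
// hook the JobGenerator exercises for each GenerateJobs event.
class TinyInputDStream(streamingContext: StreamingContext)
  extends InputDStream[Int](streamingContext) {
  override def start(): Unit = ()
  override def stop(): Unit = ()
  override def compute(validTime: Time): Option[RDD[Int]] =
    Some(streamingContext.sparkContext.parallelize(1 to 3))
}
```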
Follow-up
Spark Streaming's fault-tolerance machinery is a bit convoluted: several mechanisms are all called "WAL", yet they mean somewhat different things. A follow-up article will walk through the WAL fault-tolerance mechanism; in the meantime, https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html explains it in good detail.
Author: 分裂四人組
Source: http://blog.itpub.net/4479/viewspace-2819379/