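Spark Source Code Analysis: The BlockManager Communication Mechanism
BlockManagerMasterEndpoint mainly sends messages to BlockManagerSlaveEndpoint. This article looks at which messages each endpoint receives and how it handles them.
I. BlockManagerMasterEndpoint
It maintains three important mappings:
A <BlockManagerId, BlockManagerInfo> mapping (blockManagerInfo)
An <ExecutorId, BlockManagerId> mapping (blockManagerIdByExecutor)
A <BlockId, Set<BlockManagerId>> mapping (blockLocations), which records every BlockManager that holds a given block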
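To make the three mappings concrete, here is a minimal sketch in plain Scala. The types BmId, BlkId and BmInfo are simplified stand-ins invented for illustration, not Spark's real BlockManagerId, BlockId and BlockManagerInfo classes; only the map names follow the source quoted below.

```scala
import scala.collection.mutable

object MasterMapsSketch {
  // Simplified, hypothetical stand-ins for Spark's real classes.
  final case class BmId(executorId: String, host: String, port: Int)
  final case class BlkId(name: String)
  final case class BmInfo(id: BmId, maxMem: Long)

  // <BlockManagerId, BlockManagerInfo>
  val blockManagerInfo = mutable.HashMap.empty[BmId, BmInfo]
  // <ExecutorId, BlockManagerId>
  val blockManagerIdByExecutor = mutable.HashMap.empty[String, BmId]
  // <BlockId, Set<BlockManagerId>>
  val blockLocations = mutable.HashMap.empty[BlkId, mutable.HashSet[BmId]]

  // Registers two block managers and records that both hold the same block.
  def demo(): Set[BmId] = {
    val bm1 = BmId("exec-1", "host-a", 7001)
    val bm2 = BmId("exec-2", "host-b", 7002)
    Seq(bm1, bm2).foreach { bm =>
      blockManagerInfo(bm) = BmInfo(bm, 512L * 1024 * 1024)
      blockManagerIdByExecutor(bm.executorId) = bm
    }
    val block = BlkId("rdd_0_0")
    blockLocations.getOrElseUpdate(block, mutable.HashSet.empty[BmId]) ++= Seq(bm1, bm2)
    blockLocations(block).toSet
  }
}
```

The third map is the one that answers "where does this block live?", which is why most handlers below read or update it.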
1.1 receiveAndReply: receiving messages
// Receive a message and reply with the result
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  // Register a BlockManager
  case RegisterBlockManager(blockManagerId, maxMemSize, slaveEndpoint) =>
    register(blockManagerId, maxMemSize, slaveEndpoint)
    context.reply(true)
  // Update block information
  case _updateBlockInfo @ UpdateBlockInfo(
      blockManagerId, blockId, storageLevel, deserializedSize, size, externalBlockStoreSize) =>
    context.reply(updateBlockInfo(
      blockManagerId, blockId, storageLevel, deserializedSize, size, externalBlockStoreSize))
    listenerBus.post(SparkListenerBlockUpdated(BlockUpdatedInfo(_updateBlockInfo)))
  // Get the list of all BlockManagerIds that hold the given blockId
  case GetLocations(blockId) =>
    context.reply(getLocations(blockId))
  // For a list of blockIds, return the BlockManagerIds that hold each of them
  case GetLocationsMultipleBlockIds(blockIds) =>
    context.reply(getLocationsMultipleBlockIds(blockIds))
  // Get the peers of the given blockManagerId: the executor BlockManagers,
  // excluding the driver and the given blockManagerId itself
  case GetPeers(blockManagerId) =>
    context.reply(getPeers(blockManagerId))
  // Get the RPC host and port of the given executorId
  case GetRpcHostPortForExecutor(executorId) =>
    context.reply(getRpcHostPortForExecutor(executorId))
  // Get the memory status
  case GetMemoryStatus =>
    context.reply(memoryStatus)
  // Get the storage status
  case GetStorageStatus =>
    context.reply(storageStatus)
  // Return the status of the block on every block manager
  case GetBlockStatus(blockId, askSlaves) =>
    context.reply(blockStatus(blockId, askSlaves))
  // Get the blockIds that match the filter
  case GetMatchingBlockIds(filter, askSlaves) =>
    context.reply(getMatchingBlockIds(filter, askSlaves))
  // Remove all blocks that belong to the given RDD
  case RemoveRdd(rddId) =>
    context.reply(removeRdd(rddId))
  // Remove all blocks that belong to the given shuffle
  case RemoveShuffle(shuffleId) =>
    context.reply(removeShuffle(shuffleId))
  // Remove the blocks that belong to the given broadcast
  case RemoveBroadcast(broadcastId, removeFromDriver) =>
    context.reply(removeBroadcast(broadcastId, removeFromDriver))
  // Remove the block from the worker (slave) nodes
  case RemoveBlock(blockId) =>
    removeBlockFromWorkers(blockId)
    context.reply(true)
  // Try to remove this executor from the BlockManagerMaster
  case RemoveExecutor(execId) =>
    removeExecutor(execId)
    context.reply(true)
  // Handle StopBlockManagerMaster: reply first, then stop this endpoint
  case StopBlockManagerMaster =>
    context.reply(true)
    stop()
  // Handle a BlockManager heartbeat
  case BlockManagerHeartbeat(blockManagerId) =>
    context.reply(heartbeatReceived(blockManagerId))
  // Check whether the BlockManager of the given executorId has any cached blocks
  case HasCachedBlocks(executorId) =>
    blockManagerIdByExecutor.get(executorId) match {
      case Some(bm) =>
        if (blockManagerInfo.contains(bm)) {
          val bmInfo = blockManagerInfo(bm)
          context.reply(bmInfo.cachedBlocks.nonEmpty)
        } else {
          context.reply(false)
        }
      case None => context.reply(false)
    }
}
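The handler above is an ordinary Scala PartialFunction[Any, Unit] that pattern-matches on message case classes. A minimal sketch of that dispatch style follows; the ReplyContext class and the string-based messages are hypothetical simplifications, not Spark's RpcCallContext or its real message types.

```scala
object DispatchSketch {
  sealed trait Msg
  final case class GetLocations(blockId: String) extends Msg
  final case class RemoveBlock(blockId: String) extends Msg
  case object GetMemoryStatus extends Msg

  // Hypothetical stand-in for RpcCallContext: it only records the last reply.
  final class ReplyContext {
    var replied: Option[Any] = None
    def reply(response: Any): Unit = {
      replied = Some(response)
    }
  }

  private val locations = Map("rdd_0_0" -> Seq("bm-1", "bm-2"))

  // Each case handles one message type and answers through the context.
  def receiveAndReply(context: ReplyContext): PartialFunction[Any, Unit] = {
    case GetLocations(blockId) =>
      context.reply(locations.getOrElse(blockId, Seq.empty))
    case RemoveBlock(_) =>
      context.reply(true)
    case GetMemoryStatus =>
      context.reply(Map.empty[String, (Long, Long)])
  }
}
```

Because the handler is partial, a message with no matching case is simply not defined for it, which the caller can check with isDefinedAt before applying it.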
1.2 removeRdd: removes all blocks that belong to the RDD
It first deletes the master-side metadata for the RDD, then sends RemoveRdd to the slave BlockManagers, which perform the actual deletion.
private def removeRdd(rddId: Int): Future[Seq[Int]] = {
  // Keep only the blockIds that are RDD blocks (asRDDId), then filter those whose rddId matches
  val blocks = blockLocations.keys.flatMap(_.asRDDId).filter(_.rddId == rddId)
  // For each block of this RDD, remove it from the BlockManagerInfo of every
  // BlockManager that holds it, then remove the block from blockLocations
  blocks.foreach { blockId =>
    val bms: mutable.HashSet[BlockManagerId] = blockLocations.get(blockId)
    bms.foreach(bm => blockManagerInfo.get(bm).foreach(_.removeBlock(blockId)))
    blockLocations.remove(blockId)
  }
  // Then send a RemoveRdd message to every slave through its BlockManagerSlaveEndpoint
  val removeMsg = RemoveRdd(rddId)
  Future.sequence(
    blockManagerInfo.values.map { bm =>
      bm.slaveEndpoint.ask[Int](removeMsg)
    }.toSeq
  )
}
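The last step of removeRdd fans one request out to every slave and combines the answers with Future.sequence. A small sketch of that pattern, assuming a hypothetical FakeSlave whose askRemoveRdd returns the number of blocks it removed (Spark's real call is slaveEndpoint.ask[Int]):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FanOutSketch {
  // Hypothetical slave: answers with how many blocks it removed for the RDD.
  final class FakeSlave(blocksPerRdd: Map[Int, Int]) {
    def askRemoveRdd(rddId: Int): Future[Int] =
      Future(blocksPerRdd.getOrElse(rddId, 0))
  }

  // One ask per slave; Future.sequence turns Seq[Future[Int]] into Future[Seq[Int]].
  def removeRdd(slaves: Seq[FakeSlave], rddId: Int): Future[Seq[Int]] =
    Future.sequence(slaves.map(_.askRemoveRdd(rddId)))

  def demo(): Seq[Int] = {
    val slaves = Seq(
      new FakeSlave(Map(1 -> 3)),
      new FakeSlave(Map(1 -> 2, 2 -> 5)),
      new FakeSlave(Map.empty))
    // Blocking here is only for the demo; the master replies with the Future itself.
    Await.result(removeRdd(slaves, 1), 5.seconds)
  }
}
```

The combined future completes only when every slave has answered, and it fails if any single ask fails.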
1.3 removeShuffle
This only sends a RemoveShuffle message to the slaves and lets them delete the shuffle-related blocks.
private def removeShuffle(shuffleId: Int): Future[Seq[Boolean]] = {
  // Only send RemoveShuffle to the slaves; each slave deletes its own shuffle blocks
  val removeMsg = RemoveShuffle(shuffleId)
  Future.sequence(
    blockManagerInfo.values.map { bm =>
      bm.slaveEndpoint.ask[Boolean](removeMsg)
    }.toSeq
  )
}
1.4 removeBlockManager: removes a BlockManager
private def removeBlockManager(blockManagerId: BlockManagerId) {
  // Look up the BlockManagerInfo for this blockManagerId
  val info = blockManagerInfo(blockManagerId)
  // Remove the executorId of this block manager from <ExecutorId, BlockManagerId>
  blockManagerIdByExecutor -= blockManagerId.executorId
  // Remove this BlockManager from <BlockManagerId, BlockManagerInfo>
  blockManagerInfo.remove(blockManagerId)
  // Iterate over all blocks held by this BlockManager
  val iterator = info.blocks.keySet.iterator
  while (iterator.hasNext) {
    // Get each blockId
    val blockId = iterator.next
    // From <BlockId, Set<BlockManagerId>>, get all BlockManagers that hold this block
    val locations = blockLocations.get(blockId)
    // Remove the blockManagerId being removed from that set
    locations -= blockManagerId
    // If the set is now empty, no BlockManager holds the block any more,
    // so remove the blockId from <BlockId, Set<BlockManagerId>> as well
    if (locations.size == 0) {
      blockLocations.remove(blockId)
    }
  }
  listenerBus.post(SparkListenerBlockManagerRemoved(System.currentTimeMillis(), blockManagerId))
  logInfo(s"Removing block manager $blockManagerId")
}
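The location cleanup in removeBlockManager can be tried out in isolation. The sketch below uses plain strings for block and manager ids (an assumption for brevity) and a Scala mutable.HashMap, so the lookup returns an Option rather than the possibly-null value the quoted code gets back.

```scala
import scala.collection.mutable

object RemoveManagerSketch {
  // blockLocations: block name -> names of the managers that hold it.
  def removeManager(
      blockLocations: mutable.HashMap[String, mutable.HashSet[String]],
      managerBlocks: Set[String],
      manager: String): Unit = {
    managerBlocks.foreach { blockId =>
      blockLocations.get(blockId).foreach { locations =>
        locations -= manager
        // Drop the entry once no manager holds the block any more.
        if (locations.isEmpty) blockLocations.remove(blockId)
      }
    }
  }

  def demo(): Map[String, Set[String]] = {
    val locs = mutable.HashMap(
      "rdd_0_0" -> mutable.HashSet("bm-1", "bm-2"),
      "rdd_0_1" -> mutable.HashSet("bm-1"))
    removeManager(locs, Set("rdd_0_0", "rdd_0_1"), "bm-1")
    locs.map { case (k, v) => k -> v.toSet }.toMap
  }
}
```

After the call, the block that only bm-1 held disappears from the map entirely, while the shared block keeps its remaining location.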
1.5 removeBlockFromWorkers: removes the block from the worker (slave) nodes
private def removeBlockFromWorkers(blockId: BlockId) {
  // Get the set of BlockManagerIds that hold this block
  val locations = blockLocations.get(blockId)
  if (locations != null) {
    // For each blockManagerId, look up its BlockManagerInfo;
    // if it is defined, send a RemoveBlock message to that slave
    locations.foreach { blockManagerId: BlockManagerId =>
      val blockManager = blockManagerInfo.get(blockManagerId)
      if (blockManager.isDefined) {
        blockManager.get.slaveEndpoint.ask[Boolean](RemoveBlock(blockId))
      }
    }
  }
}
1.6 blockStatus: returns the status of a block on every block manager
private def blockStatus(blockId: BlockId,
    askSlaves: Boolean): Map[BlockManagerId, Future[Option[BlockStatus]]] = {
  // Create the GetBlockStatus message
  val getBlockStatus = GetBlockStatus(blockId)
  // For every registered BlockManagerInfo: if askSlaves is set, send GetBlockStatus to its
  // BlockManagerSlaveEndpoint; otherwise wrap the master's own record in a Future.
  // The results are collected into a Map[BlockManagerId, Future[Option[BlockStatus]]]
  blockManagerInfo.values.map { info =>
    val blockStatusFuture =
      if (askSlaves) {
        info.slaveEndpoint.ask[Option[BlockStatus]](getBlockStatus)
      } else {
        Future { info.getStatus(blockId) }
      }
    (info.blockManagerId, blockStatusFuture)
  }.toMap
}
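The askSlaves flag chooses between the master's own record and a fresh answer from each slave. The sketch below illustrates the difference under the assumption that a slave may hold a block it never reported to the master; Info, Status and askSlave are hypothetical stand-ins, not Spark classes.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BlockStatusSketch {
  final case class Status(memSize: Long, diskSize: Long)

  // masterView: what the master has recorded; slaveView: what the slave really holds.
  final class Info(val id: String,
                   masterView: Map[String, Status],
                   slaveView: Map[String, Status]) {
    def getStatus(blockId: String): Option[Status] = masterView.get(blockId)
    def askSlave(blockId: String): Future[Option[Status]] = Future(slaveView.get(blockId))
  }

  def blockStatus(infos: Seq[Info], blockId: String,
                  askSlaves: Boolean): Map[String, Future[Option[Status]]] =
    infos.map { info =>
      val f = if (askSlaves) info.askSlave(blockId) else Future(info.getStatus(blockId))
      info.id -> f
    }.toMap

  // Resolves the futures so the two modes can be compared side by side.
  def demo(askSlaves: Boolean): Map[String, Option[Status]] = {
    val reported = Map("b" -> Status(10L, 0L))
    val infos = Seq(
      new Info("bm-1", reported, reported),
      new Info("bm-2", Map.empty, Map("b" -> Status(0L, 20L)))) // never reported
    blockStatus(infos, "b", askSlaves).map { case (id, f) => id -> Await.result(f, 5.seconds) }
  }
}
```

With askSlaves = false the unreported copy on bm-2 is invisible; with askSlaves = true it shows up, at the cost of one round trip per block manager.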
1.7 register: registers a BlockManager
private def register(id: BlockManagerId, maxMemSize: Long, slaveEndpoint: RpcEndpointRef) {
  val time = System.currentTimeMillis()
  // Only if this BlockManagerId has not been registered yet
  if (!blockManagerInfo.contains(id)) {
    // Look up the BlockManagerId currently recorded for this executor
    blockManagerIdByExecutor.get(id.executorId) match {
      // The executor already has a BlockManager registered, so that one is stale:
      // remove the executor before registering the new BlockManager
      case Some(oldId) =>
        logError("Got two different block manager registrations on same executor - "
          + s" will replace old one $oldId with new one $id")
        // Remove the executor and its BlockManager from the in-memory maps
        removeExecutor(id.executorId)
      case None =>
    }
    logInfo("Registering block manager %s with %s RAM, %s".format(
      id.hostPort, Utils.bytesToString(maxMemSize), id))
    // Add this BlockManagerId to the <ExecutorId, BlockManagerId> mapping
    blockManagerIdByExecutor(id.executorId) = id
    // Create a BlockManagerInfo and add it to <BlockManagerId, BlockManagerInfo>
    blockManagerInfo(id) = new BlockManagerInfo(
      id, System.currentTimeMillis(), maxMemSize, slaveEndpoint)
  }
  listenerBus.post(SparkListenerBlockManagerAdded(time, id, maxMemSize))
}
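The key behaviour of register is that a second registration from the same executor replaces the first. A minimal sketch of that rule, assuming a hypothetical Registry class that keeps only the two maps involved (the real removeExecutor also cleans up block locations, which is omitted here):

```scala
import scala.collection.mutable

object RegisterSketch {
  final case class BmId(executorId: String, port: Int)

  final class Registry {
    val byExecutor = mutable.HashMap.empty[String, BmId]
    val infos = mutable.HashMap.empty[BmId, Long] // BmId -> maxMemSize

    def register(id: BmId, maxMem: Long): Unit = {
      if (!infos.contains(id)) {
        // A different manager already registered for this executor is stale.
        byExecutor.get(id.executorId).foreach { oldId =>
          infos.remove(oldId)
          byExecutor.remove(id.executorId)
        }
        byExecutor(id.executorId) = id
        infos(id) = maxMem
      }
    }
  }
}
```

Registering BmId("exec-1", 7001) and then BmId("exec-1", 7002) leaves only the second entry in both maps, while registering the same id twice is a no-op.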
1.8 updateBlockInfo: updates block information
private def updateBlockInfo(
    blockManagerId: BlockManagerId,
    blockId: BlockId,
    storageLevel: StorageLevel,
    memSize: Long,
    diskSize: Long,
    externalBlockStoreSize: Long): Boolean = {
  // If this blockManagerId has not been registered, return early
  if (!blockManagerInfo.contains(blockManagerId)) {
    // The driver's BlockManager is deliberately not registered outside local mode,
    // so an update from it is not a failure
    if (blockManagerId.isDriver && !isLocal) {
      // We intentionally do not register the master (except in local mode),
      // so we should not indicate failure.
      return true
    } else {
      return false
    }
  }
  // A null blockId means there is no block to update: just refresh the last-seen time
  if (blockId == null) {
    blockManagerInfo(blockManagerId).updateLastSeenMs()
    return true
  }
  // Call BlockManagerInfo.updateBlockInfo to update the block
  blockManagerInfo(blockManagerId).updateBlockInfo(
    blockId, storageLevel, memSize, diskSize, externalBlockStoreSize)
  var locations: mutable.HashSet[BlockManagerId] = null
  // If blockLocations already contains the blockId, reuse its set of BlockManagers;
  // otherwise create an empty set and put it into blockLocations
  if (blockLocations.containsKey(blockId)) {
    locations = blockLocations.get(blockId)
  } else {
    locations = new mutable.HashSet[BlockManagerId]
    blockLocations.put(blockId, locations)
  }
  // If the storage level is valid, add this blockManagerId to the block's set of
  // BlockManagers; if it is not valid, remove it
  if (storageLevel.isValid) {
    locations.add(blockManagerId)
  } else {
    locations.remove(blockManagerId)
  }
  // If the set is now empty, no BlockManager holds the block, so remove the blockId from blockLocations
  if (locations.size == 0) {
    blockLocations.remove(blockId)
  }
  true
}
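The second half of updateBlockInfo is a small state machine on blockLocations: a valid storage level adds the reporting manager, an invalid one removes it, and an empty set removes the block entry. A sketch with string ids and a Boolean in place of StorageLevel.isValid (both simplifications for illustration):

```scala
import scala.collection.mutable

object UpdateLocationsSketch {
  def update(
      blockLocations: mutable.HashMap[String, mutable.HashSet[String]],
      blockId: String,
      manager: String,
      levelIsValid: Boolean): Unit = {
    // Reuse the existing location set or create an empty one.
    val locations = blockLocations.getOrElseUpdate(blockId, mutable.HashSet.empty[String])
    if (levelIsValid) locations += manager else locations -= manager
    // No holder left: forget the block entirely.
    if (locations.isEmpty) blockLocations.remove(blockId)
  }
}
```

An update with an invalid storage level is therefore how a slave tells the master that it has dropped a block.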
1.9 getPeers: returns the other executor BlockManagers, excluding the driver and the given blockManagerId
private def getPeers(blockManagerId: BlockManagerId): Seq[BlockManagerId] = {
  // Get the set of all registered BlockManagerIds
  val blockManagerIds = blockManagerInfo.keySet
  // Only if the given blockManagerId is registered
  if (blockManagerIds.contains(blockManagerId)) {
    // Keep the executor BlockManagers (drop the driver), then drop the given blockManagerId itself
    blockManagerIds.filterNot { _.isDriver }.filterNot { _ == blockManagerId }.toSeq
  } else {
    Seq.empty
  }
}
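The two filterNot calls in getPeers can be checked with a tiny example. In this sketch BmId is a hypothetical case class, and isDriver is assumed to mean "the executorId equals the string driver", which is a simplification of Spark's own check.

```scala
object PeersSketch {
  final case class BmId(executorId: String) {
    // Assumption for this sketch: the driver is identified by the id "driver".
    def isDriver: Boolean = executorId == "driver"
  }

  def getPeers(all: Set[BmId], self: BmId): Seq[BmId] =
    if (all.contains(self)) {
      all.filterNot(_.isDriver).filterNot(_ == self).toSeq
    } else {
      Seq.empty
    }
}
```

An unregistered caller gets an empty list, and a registered one never sees itself or the driver among its peers.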
II. BlockManagerSlaveEndpoint
It receives the commands sent by BlockManagerMasterEndpoint and executes them.
2.1 receiveAndReply: receiving messages
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  // RemoveBlock message sent by the master
  case RemoveBlock(blockId) =>
    doAsync[Boolean]("removing block " + blockId, context) {
      // Ask the BlockManager to remove the block
      blockManager.removeBlock(blockId)
      true
    }
  // RemoveRdd message sent by the master
  case RemoveRdd(rddId) =>
    doAsync[Int]("removing RDD " + rddId, context) {
      // Ask the BlockManager to remove the blocks of this RDD
      blockManager.removeRdd(rddId)
    }
  // RemoveShuffle message sent by the master
  case RemoveShuffle(shuffleId) =>
    doAsync[Boolean]("removing shuffle " + shuffleId, context) {
      // First unregister the shuffleId from the MapOutputTracker
      if (mapOutputTracker != null) {
        mapOutputTracker.unregisterShuffle(shuffleId)
      }
      // Then remove the shuffle's metadata
      SparkEnv.get.shuffleManager.unregisterShuffle(shuffleId)
    }
  // RemoveBroadcast message sent by the master
  case RemoveBroadcast(broadcastId, _) =>
    doAsync[Int]("removing broadcast " + broadcastId, context) {
      // Call BlockManager.removeBroadcast
      blockManager.removeBroadcast(broadcastId, tellMaster = true)
    }
  // GetBlockStatus message: call blockManager.getStatus
  case GetBlockStatus(blockId, _) =>
    context.reply(blockManager.getStatus(blockId))
  // GetMatchingBlockIds message: call blockManager.getMatchingBlockIds
  case GetMatchingBlockIds(filter, _) =>
    context.reply(blockManager.getMatchingBlockIds(filter))
}
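The removal handlers all go through doAsync, whose body is not shown in this article. A plausible shape, offered only as a sketch under that caveat, is "run the body in a Future and reply with its result or its failure". ReplyContext below is a hypothetical Promise-backed stand-in for RpcCallContext, and the sendFailure name is an assumption.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object DoAsyncSketch {
  // Hypothetical reply context backed by a Promise so a caller can await the answer.
  final class ReplyContext {
    private val promise = Promise[Any]()
    def reply(response: Any): Unit = { promise.trySuccess(response); () }
    def sendFailure(e: Throwable): Unit = { promise.tryFailure(e); () }
    def result: Future[Any] = promise.future
  }

  // actionMessage would normally be logged; it is unused in this sketch.
  def doAsync[T](actionMessage: String, context: ReplyContext)(body: => T): Unit = {
    Future(body).onComplete {
      case Success(response) => context.reply(response)
      case Failure(e) => context.sendFailure(e)
    }
  }

  def demo(): Any = {
    val ctx = new ReplyContext
    doAsync[Int]("removing RDD 1", ctx) { 3 }
    Await.result(ctx.result, 5.seconds)
  }
}
```

Running the work off the message-handling thread keeps a slow block removal from delaying other messages, while GetBlockStatus and GetMatchingBlockIds reply inline, as the quoted handler shows.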