Spark Source Code Analysis: BlockStore
BlockStore is the abstract class for storing blocks. Its subclasses include DiskStore, MemoryStore, and ExternalBlockStore.
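1 DiskStore (disk storage)
DiskStore stores blocks on disk. Several directories can be configured for holding block files; DiskBlockManager creates the corresponding folders from that configuration and places the block files in them.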
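To make the "several directories" idea concrete, here is a minimal sketch of how a block file name could be spread over the configured directories. It assumes a hash-based, two-level layout in the spirit of what DiskBlockManager does; `BlockDirSketch`, `nonNegativeHash`, `locate`, and `subDirsPerLocalDir` are illustrative names, not quotes from the Spark sources.

```scala
import java.io.File

object BlockDirSketch {
  // Map a hash code to a non-negative Int (math.abs(Int.MinValue) is still negative).
  def nonNegativeHash(name: String): Int = {
    val h = name.hashCode
    if (h == Int.MinValue) 0 else math.abs(h)
  }

  // Pick a (local dir, sub dir) pair for a block file name.
  // Assumption: the same name always maps to the same location.
  def locate(localDirs: Seq[File], subDirsPerLocalDir: Int, filename: String): File = {
    val hash = nonNegativeHash(filename)
    val dirId = hash % localDirs.length
    val subDirId = (hash / localDirs.length) % subDirsPerLocalDir
    val subDir = new File(localDirs(dirId), "%02x".format(subDirId))
    new File(subDir, filename)
  }
}
```

Because the location depends only on the file name, a later lookup for the same block needs no index: it recomputes the same path.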
1.1 Important methods
# getSize: get the size of the block file for the given blockId
def getSize(blockId: BlockId): Long = {
  diskManager.getFile(blockId.name).length
}
# put: write a block to disk
def put(blockId: BlockId)(writeFunc: FileOutputStream => Unit): Unit = {
  // Fail if a file for this blockId already exists
  if (contains(blockId)) {
    throw new IllegalStateException(s"Block $blockId is already present in the disk store")
  }
  logDebug(s"Attempting to put block $blockId")
  val startTime = System.currentTimeMillis
  // Get the file that backs this block
  val file = diskManager.getFile(blockId)
  // Open an output stream on that file
  val fileOutputStream = new FileOutputStream(file)
  var threwException: Boolean = true
  try {
    // Write the data to the file for this blockId
    writeFunc(fileOutputStream)
    threwException = false
  } finally {
    try {
      // Close the stream; an IOException from close is swallowed only if writeFunc already threw
      Closeables.close(fileOutputStream, threwException)
    } finally {
      // On failure, remove the partially written block file
      if (threwException) {
        remove(blockId)
      }
    }
  }
  val finishTime = System.currentTimeMillis
  logDebug("Block %s stored as %s file on disk in %d ms".format(
    file.getName,
    Utils.bytesToString(file.length()),
    finishTime - startTime))
}
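The nested try/finally in put is easy to misread, so here is a minimal, self-contained sketch of the same "write, and clean up the partial file on failure" pattern using only java.io. `SafeWriteSketch` and `writeOrCleanUp` are hypothetical names; deleting the file directly stands in for `remove(blockId)`.

```scala
import java.io.{File, FileOutputStream, IOException}

object SafeWriteSketch {
  // Run writeFunc against a stream on `file`.
  // If writeFunc throws, the stream is still closed and the partial file is deleted.
  def writeOrCleanUp(file: File)(writeFunc: FileOutputStream => Unit): Unit = {
    val out = new FileOutputStream(file)
    var threwException = true
    try {
      writeFunc(out)
      threwException = false
    } finally {
      try {
        out.close()
      } catch {
        // Swallow a close failure only when an earlier exception is already propagating,
        // so the original error is not masked.
        case _: IOException if threwException => ()
      } finally {
        if (threwException) file.delete()
      }
    }
  }
}
```

The flag starts as `true` and is cleared only after the write succeeds, so any exit path that skips the assignment is treated as a failure.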
# putBytes: write the given bytes to the block file
def putBytes(blockId: BlockId, bytes: ChunkedByteBuffer): Unit = {
  put(blockId) { fileOutputStream =>
    val channel = fileOutputStream.getChannel
    Utils.tryWithSafeFinally {
      // Write every chunk of the buffer to the file channel
      bytes.writeFully(channel)
    } {
      channel.close()
    }
  }
}
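2 MemoryStore (memory storage)
2.1 Core fields
BlockInfoManager blockInfoManager: tracks the metadata of individual blocks
SerializerManager serializerManager: the serializer manager
MemoryManager memoryManager: the memory manager
BlockEvictionHandler blockEvictionHandler: the handler used to evict blocks
LinkedHashMap[BlockId, MemoryEntry[_]] entries: the blocks currently held in memory
HashMap[Long, Long] onHeapUnrollMemoryMap: maps a taskAttemptId to the amount of on-heap memory that task is using to unroll blocks
HashMap[Long, Long] offHeapUnrollMemoryMap: maps a taskAttemptId to the amount of off-heap memory that task is using to unroll blocks
Long unrollMemoryThreshold: the initial amount of memory requested before unrolling a block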
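The choice of LinkedHashMap for entries matters for eviction order. As far as I recall, MemoryStore constructs it in access order, so iteration visits the least recently used block first; treat that as an assumption to verify against your Spark version. The sketch below only demonstrates the java.util.LinkedHashMap behaviour itself, with `LruOrderSketch` and `orderAfterAccess` as hypothetical names.

```scala
import java.util.LinkedHashMap
import scala.collection.mutable.ArrayBuffer

object LruOrderSketch {
  // Insert `keys`, read `touched` once, and return the resulting iteration order.
  def orderAfterAccess(keys: Seq[String], touched: String): Seq[String] = {
    // The third constructor argument `true` selects access order instead of insertion order.
    val entries = new LinkedHashMap[String, Long](32, 0.75f, true)
    keys.foreach(k => entries.put(k, 0L))
    entries.get(touched) // a read moves the entry to the end of the iteration order
    val out = ArrayBuffer.empty[String]
    val it = entries.keySet().iterator()
    while (it.hasNext) out += it.next()
    out.toSeq
  }
}
```

With keys a, b, c and a read of a, iteration yields b, c, a: the untouched entries come first, which is the order an LRU-style eviction loop wants.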
2.2 Important classes and methods
# maxMemory: the maximum storage memory (on-heap plus off-heap)
private def maxMemory: Long = {
  memoryManager.maxOnHeapStorageMemory + memoryManager.maxOffHeapStorageMemory
}
# memoryUsed: storage memory in use, including memory used by blocks that are still being unrolled
private def memoryUsed: Long = memoryManager.storageMemoryUsed
# blocksMemoryUsed: memory used by fully stored blocks, excluding blocks that are still being unrolled
private def blocksMemoryUsed: Long = memoryManager.synchronized {
  memoryUsed - currentUnrollMemory
}
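A small worked example of how these three quantities relate. `StorageAccounting` is a hypothetical case class that simply restates the arithmetic above with plain numbers; the `free` value mirrors the `maxMemory - blocksMemoryUsed` expression that appears in MemoryStore's log messages.

```scala
// Plain-number restatement of the accounting; all field names are illustrative.
final case class StorageAccounting(
    maxOnHeap: Long,
    maxOffHeap: Long,
    storageUsed: Long, // corresponds to memoryUsed (includes unroll memory)
    unrollUsed: Long) { // corresponds to currentUnrollMemory

  def maxMemory: Long = maxOnHeap + maxOffHeap
  def blocksMemoryUsed: Long = storageUsed - unrollUsed
  def free: Long = maxMemory - blocksMemoryUsed
}
```

For example, with 1000 bytes on-heap, 200 off-heap, 600 in use and 100 of that spent on unrolling, the stored blocks account for 500 bytes and the reported free value is 700.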
# putBytes: add serialized data to memory
If enough memory can be acquired, the ByteBuffer is created and put into the MemoryStore; otherwise the ByteBuffer is never created.
def putBytes[T: ClassTag](
    blockId: BlockId,
    size: Long,
    memoryMode: MemoryMode,
    _bytes: () => ChunkedByteBuffer): Boolean = {
  require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")
  // Request storage memory from the MemoryManager
  if (memoryManager.acquireStorageMemory(blockId, size, memoryMode)) {
    // Enough memory was acquired for this block, so it is now safe to materialize the bytes
    val bytes = _bytes()
    assert(bytes.size == size)
    // Create a SerializedMemoryEntry and put it into memory
    val entry = new SerializedMemoryEntry[T](bytes, memoryMode, implicitly[ClassTag[T]])
    entries.synchronized {
      entries.put(blockId, entry)
    }
    logInfo("Block %s stored as bytes in memory (estimated size %s, free %s)".format(
      blockId, Utils.bytesToString(size), Utils.bytesToString(maxMemory - blocksMemoryUsed)))
    true
  } else {
    false
  }
}
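The reason `_bytes` is a function rather than a value is that the buffer should only be allocated once memory has been granted. A minimal sketch of that deferral, with a plain integer budget standing in for the MemoryManager; `LazyPutSketch` and `putIfRoom` are hypothetical names.

```scala
import java.nio.ByteBuffer

object LazyPutSketch {
  // Invoke makeBytes only when `size` fits into `budget`.
  // Returns None without allocating anything when it does not fit.
  def putIfRoom(size: Int, budget: Int)(makeBytes: () => ByteBuffer): Option[ByteBuffer] = {
    if (size <= budget) Some(makeBytes()) else None
  }
}
```

A caller can pass `() => ByteBuffer.allocate(size)`; when the budget is too small, the allocation never happens.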
# putIteratorAsValues: try to put the given iterator into memory as a block of deserialized values
private[storage] def putIteratorAsValues[T](
    blockId: BlockId,
    values: Iterator[T],
    classTag: ClassTag[T]): Either[PartiallyUnrolledIterator[T], Long] = {
  require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")
  // Number of elements unrolled so far
  var elementsUnrolled = 0
  // Whether there is still enough memory to keep unrolling this block
  var keepUnrolling = true
  // Initial per-task memory to request for unrolling blocks (bytes).
  val initialMemoryThreshold = unrollMemoryThreshold
  // How often (in elements) to check whether more memory must be requested
  val memoryCheckPeriod = 16
  // Memory currently reserved by this task for this particular unroll operation
  var memoryThreshold = initialMemoryThreshold
  // Growth factor applied when requesting more memory
  val memoryGrowthFactor = 1.5
  // Unroll memory reserved for this particular block / putIterator() operation
  var unrollMemoryUsedByThisBlock = 0L
  // Append-only vector holding the unrolled values; it also estimates its own size
  var vector = new SizeTrackingVector[T]()(classTag)
  // Reserve memory for this task to unroll the block, since unrolling itself consumes memory
  keepUnrolling =
    reserveUnrollMemoryForThisTask(blockId, initialMemoryThreshold, MemoryMode.ON_HEAP)
  // Warn if the initial reservation failed
  if (!keepUnrolling) {
    logWarning(s"Failed to reserve initial memory threshold of " +
      s"${Utils.bytesToString(initialMemoryThreshold)} for computing block $blockId in memory.")
  } else {
    // Otherwise record the initial threshold as unroll memory reserved for this block
    unrollMemoryUsedByThisBlock += initialMemoryThreshold
  }
  // Unroll this block safely, checking periodically whether the threshold has been exceeded
  while (values.hasNext && keepUnrolling) {
    // Append the next value
    vector += values.next()
    // Only check memory every memoryCheckPeriod elements
    if (elementsUnrolled % memoryCheckPeriod == 0) {
      // Estimate the current size of the unrolled values
      val currentSize = vector.estimateSize()
      // If the estimate reaches the threshold, request more memory:
      // currentSize * memoryGrowthFactor - memoryThreshold
      if (currentSize >= memoryThreshold) {
        val amountToRequest = (currentSize * memoryGrowthFactor - memoryThreshold).toLong
        // Reserve additional unroll memory for this task
        keepUnrolling =
          reserveUnrollMemoryForThisTask(blockId, amountToRequest, MemoryMode.ON_HEAP)
        // If the reservation succeeded, record it
        if (keepUnrolling) {
          unrollMemoryUsedByThisBlock += amountToRequest
        }
        // The threshold is updated whether or not the reservation succeeded
        memoryThreshold += amountToRequest
      }
    }
    // Count the element that was just unrolled
    elementsUnrolled += 1
  }
  // The whole iterator was unrolled without running out of unroll memory
  if (keepUnrolling) {
    // Convert the vector into an array
    val arrayValues = vector.toArray
    vector = null
    // Create a DeserializedMemoryEntry for the values
    val entry =
      new DeserializedMemoryEntry[T](arrayValues, SizeEstimator.estimate(arrayValues), classTag)
    val size = entry.size
    def transferUnrollToStorage(amount: Long): Unit = {
      // Synchronize so that transfer is atomic
      memoryManager.synchronized {
        releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, amount)
        val success = memoryManager.acquireStorageMemory(blockId, amount, MemoryMode.ON_HEAP)
        assert(success, "transferring unroll memory to storage memory failed")
      }
    }
    // Acquire storage memory if necessary to store this block in memory.
    val enoughStorageMemory = {
      // The unroll memory reserved for this block is <= the block size
      if (unrollMemoryUsedByThisBlock <= size) {
        // Extra storage memory is needed for the difference
        val acquiredExtra =
          memoryManager.acquireStorageMemory(
            blockId, size - unrollMemoryUsedByThisBlock, MemoryMode.ON_HEAP)
        if (acquiredExtra) {
          transferUnrollToStorage(unrollMemoryUsedByThisBlock)
        }
        acquiredExtra
      } else { // unrollMemoryUsedByThisBlock > size
        // More was reserved than needed: release the excess, then transfer the rest
        val excessUnrollMemory = unrollMemoryUsedByThisBlock - size
        releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, excessUnrollMemory)
        transferUnrollToStorage(size)
        true
      }
    }
    // If there is enough storage memory, put the entry into memory
    if (enoughStorageMemory) {
      entries.synchronized {
        entries.put(blockId, entry)
      }
      logInfo("Block %s stored as values in memory (estimated size %s, free %s)".format(
        blockId, Utils.bytesToString(size), Utils.bytesToString(maxMemory - blocksMemoryUsed)))
      Right(size)
    } else {
      assert(currentUnrollMemoryForThisTask >= unrollMemoryUsedByThisBlock,
        "released too much unroll memory")
      Left(new PartiallyUnrolledIterator(
        this,
        MemoryMode.ON_HEAP,
        unrollMemoryUsedByThisBlock,
        unrolled = arrayValues.toIterator,
        rest = Iterator.empty))
    }
  } else {
    // Not enough unroll memory to unroll the whole block
    logUnrollFailureMessage(blockId, vector.estimateSize())
    Left(new PartiallyUnrolledIterator(
      this,
      MemoryMode.ON_HEAP,
      unrollMemoryUsedByThisBlock,
      unrolled = vector.iterator,
      rest = values))
  }
}
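The unroll loop is the heart of this method, so here is a stripped-down simulation of just the reservation logic. It is a sketch under stated assumptions: element sizes are given directly instead of being estimated by SizeTrackingVector, a single `pool` number stands in for the MemoryManager, and `UnrollSketch` and `unroll` are hypothetical names. It returns Right(reserved bytes) when everything was unrolled and Left(elements unrolled) when the pool ran out.

```scala
object UnrollSketch {
  def unroll(
      elementSizes: Iterator[Long],
      pool: Long, // hypothetical total unroll memory available
      initialThreshold: Long,
      checkPeriod: Int = 16,
      growthFactor: Double = 1.5): Either[Int, Long] = {
    var remaining = pool
    // Stand-in for reserveUnrollMemoryForThisTask: all-or-nothing reservation.
    def reserve(n: Long): Boolean =
      if (n <= remaining) { remaining -= n; true } else false

    var reserved = 0L
    var threshold = initialThreshold
    var currentSize = 0L
    var count = 0
    var keep = reserve(initialThreshold)
    if (keep) reserved += initialThreshold

    while (elementSizes.hasNext && keep) {
      currentSize += elementSizes.next()
      // Check only every checkPeriod elements, and only grow when the threshold is reached.
      if (count % checkPeriod == 0 && currentSize >= threshold) {
        val request = (currentSize * growthFactor - threshold).toLong
        keep = reserve(request)
        if (keep) reserved += request
        threshold += request
      }
      count += 1
    }
    if (keep) Right(reserved) else Left(count)
  }
}
```

Because checks happen only every 16 elements, the final reservation can end up below or above the real block size. That is why the real method still has to either acquire extra storage memory or release the excess once unrolling finishes.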
# remove: remove the data for the given blockId from memory
def remove(blockId: BlockId): Boolean = memoryManager.synchronized {
  // Remove the entry from the in-memory map
  val entry = entries.synchronized {
    entries.remove(blockId)
  }
  if (entry != null) {
    entry match {
      // Serialized entries own a buffer that must be disposed explicitly
      case SerializedMemoryEntry(buffer, _, _) => buffer.dispose()
      case _ =>
    }
    // Release the storage memory held by this entry
    memoryManager.releaseStorageMemory(entry.size, entry.memoryMode)
    logDebug(s"Block $blockId of size ${entry.size} dropped " +
      s"from memory (free ${maxMemory - blocksMemoryUsed})")
    true
  } else {
    false
  }
}
# evictBlocksToFreeSpace: try to evict blocks in order to free memory
private[spark] def evictBlocksToFreeSpace(
    blockId: Option[BlockId],
    space: Long,
    memoryMode: MemoryMode): Long = {
  assert(space > 0)
  memoryManager.synchronized {
    var freedMemory = 0L // memory that would be freed by the blocks selected so far
    val rddToAdd = blockId.flatMap(getRddId)
    // Blocks selected for eviction
    val selectedBlocks = new ArrayBuffer[BlockId]
    // A block is evictable if it uses the same memory mode and does not belong to
    // the RDD of the block being stored
    def blockIsEvictable(blockId: BlockId, entry: MemoryEntry[_]): Boolean = {
      entry.memoryMode == memoryMode && (rddToAdd.isEmpty || rddToAdd != getRddId(blockId))
    }
    entries.synchronized {
      // Iterate over the entries
      val iterator = entries.entrySet().iterator()
      // Keep selecting while the memory to be freed is still less than the requested space
      while (freedMemory < space && iterator.hasNext) {
        val pair = iterator.next()
        val blockId = pair.getKey
        val entry = pair.getValue
        if (blockIsEvictable(blockId, entry)) {
          // Select the block only if its write lock can be taken without blocking
          if (blockInfoManager.lockForWriting(blockId, blocking = false).isDefined) {
            selectedBlocks += blockId
            freedMemory += pair.getValue.size
          }
        }
      }
    }
    // Drop a single block from memory
    def dropBlock[T](blockId: BlockId, entry: MemoryEntry[T]): Unit = {
      val data = entry match {
        case DeserializedMemoryEntry(values, _, _) => Left(values)
        case SerializedMemoryEntry(buffer, _, _) => Right(buffer)
      }
      // Let the eviction handler drop the block from memory
      val newEffectiveStorageLevel =
        blockEvictionHandler.dropFromMemory(blockId, () => data)(entry.classTag)
      if (newEffectiveStorageLevel.isValid) {
        // The block is still present in at least one store, so release the lock
        // but don't delete the block info
        blockInfoManager.unlock(blockId)
      } else {
        // The block isn't present in any store, so delete the block info so that the
        // block can be stored again
        blockInfoManager.removeBlock(blockId)
      }
    }
    // Evict only if the selected blocks free at least the requested space
    if (freedMemory >= space) {
      logInfo(s"${selectedBlocks.size} blocks selected for dropping " +
        s"(${Utils.bytesToString(freedMemory)} bytes)")
      // Drop each selected block via dropBlock
      for (blockId <- selectedBlocks) {
        val entry = entries.synchronized { entries.get(blockId) }
        // This should never be null as only one task should be dropping
        // blocks and removing entries. However the check is still here for
        // future safety.
        if (entry != null) {
          dropBlock(blockId, entry)
        }
      }
      logInfo(s"After dropping ${selectedBlocks.size} blocks, " +
        s"free memory is ${Utils.bytesToString(maxMemory - blocksMemoryUsed)}")
      freedMemory
    } else {
      // Not enough can be freed: evict nothing and release the locks taken above
      blockId.foreach { id =>
        logInfo(s"Will not store $id")
      }
      selectedBlocks.foreach { id =>
        blockInfoManager.unlock(id)
      }
      0L
    }
  }
}
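The selection phase can be isolated into a few lines. The sketch below assumes blocks are visited in the map's iteration order and leaves out locking and memory modes entirely; `EvictionSketch`, `Entry`, and `select` are hypothetical names. It shows the two rules that matter: blocks of the RDD being stored are never chosen, and nothing is evicted unless the full amount can be freed.

```scala
import java.util.LinkedHashMap
import scala.collection.mutable.ArrayBuffer

object EvictionSketch {
  final case class Entry(rddId: Option[Int], size: Long)

  // Walk entries in iteration order and pick blocks until `space` bytes would be freed.
  // Returns Nil when the evictable blocks together cannot free enough.
  def select(
      entries: LinkedHashMap[String, Entry],
      rddToAdd: Option[Int],
      space: Long): List[String] = {
    val selected = ArrayBuffer.empty[String]
    var freed = 0L
    val it = entries.entrySet().iterator()
    while (freed < space && it.hasNext) {
      val pair = it.next()
      // Skip blocks that belong to the same RDD as the block being stored.
      val evictable = rddToAdd.isEmpty || rddToAdd != pair.getValue.rddId
      if (evictable) {
        selected += pair.getKey
        freed += pair.getValue.size
      }
    }
    if (freed >= space) selected.toList else Nil
  }
}
```

Skipping same-RDD blocks avoids a cycle in which storing one partition of an RDD keeps pushing out other partitions of that same RDD.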
# reserveUnrollMemoryForThisTask: reserve memory for the current task to unroll the given block, since unrolling a block also consumes memory
def reserveUnrollMemoryForThisTask(
    blockId: BlockId,
    memory: Long,
    memoryMode: MemoryMode): Boolean = {
  memoryManager.synchronized {
    val success = memoryManager.acquireUnrollMemory(blockId, memory, memoryMode)
    if (success) {
      // Record the reservation against the current task attempt
      val taskAttemptId = currentTaskAttemptId()
      val unrollMemoryMap = memoryMode match {
        case MemoryMode.ON_HEAP => onHeapUnrollMemoryMap
        case MemoryMode.OFF_HEAP => offHeapUnrollMemoryMap
      }
      unrollMemoryMap(taskAttemptId) = unrollMemoryMap.getOrElse(taskAttemptId, 0L) + memory
    }
    success
  }
}
# releaseUnrollMemoryForThisTask: release unroll memory held by the current task (all of it by default)
def releaseUnrollMemoryForThisTask(
    memoryMode: MemoryMode,
    memory: Long = Long.MaxValue): Unit = {
  val taskAttemptId = currentTaskAttemptId()
  memoryManager.synchronized {
    val unrollMemoryMap = memoryMode match {
      case MemoryMode.ON_HEAP => onHeapUnrollMemoryMap
      case MemoryMode.OFF_HEAP => offHeapUnrollMemoryMap
    }
    if (unrollMemoryMap.contains(taskAttemptId)) {
      // Never release more than this task actually holds
      val memoryToRelease = math.min(memory, unrollMemoryMap(taskAttemptId))
      if (memoryToRelease > 0) {
        unrollMemoryMap(taskAttemptId) -= memoryToRelease
        memoryManager.releaseUnrollMemory(memoryToRelease, memoryMode)
      }
      // Drop the map entry once the task holds no unroll memory
      if (unrollMemoryMap(taskAttemptId) == 0) {
        unrollMemoryMap.remove(taskAttemptId)
      }
    }
  }
}
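The reserve and release pair amounts to a per-task ledger on top of a shared pool. A minimal sketch of that bookkeeping, assuming a single pool number in place of the MemoryManager and a single memory mode; `UnrollLedger` and its methods are hypothetical names.

```scala
import scala.collection.mutable

class UnrollLedger(private var pool: Long) {
  // taskId -> bytes currently reserved by that task
  private val byTask = mutable.HashMap.empty[Long, Long]

  // All-or-nothing reservation against the shared pool.
  def reserve(taskId: Long, bytes: Long): Boolean = synchronized {
    if (bytes <= pool) {
      pool -= bytes
      byTask(taskId) = byTask.getOrElse(taskId, 0L) + bytes
      true
    } else {
      false
    }
  }

  // Release up to `bytes`; the default releases everything the task holds.
  def release(taskId: Long, bytes: Long = Long.MaxValue): Unit = synchronized {
    byTask.get(taskId).foreach { held =>
      val toRelease = math.min(bytes, held)
      pool += toRelease
      if (held == toRelease) {
        byTask.remove(taskId) // no memory left for this task, drop the entry
      } else {
        byTask(taskId) = held - toRelease
      }
    }
  }

  def heldBy(taskId: Long): Long = synchronized { byTask.getOrElse(taskId, 0L) }
  def available: Long = synchronized { pool }
}
```

Capping the release at what the task holds means a caller can pass the default `Long.MaxValue` to mean "everything" without the pool ever being over-credited.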