Spark Source Code Analysis: BlockStore

Posted by happy19870612 on 2017-11-11

BlockStore is the abstract class for storing blocks. Its subclasses include DiskStore, MemoryStore, and ExternalBlockStore.

 

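1 DiskStore (disk storage)

DiskStore stores data blocks on disk. Several directories can be configured for holding blocks; DiskBlockManager creates the corresponding directories from that configuration and places the block files in them.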
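As a rough illustration of how block files can be spread across several configured directories, the sketch below hashes a block file name to choose a top-level directory and a sub-directory. It follows the general idea behind DiskBlockManager but is not Spark's code: `BlockDirSketch`, `locate`, and the directory paths are names assumed for this example.

```scala
import java.io.File

object BlockDirSketch {
  // Map a string's hash code to a non-negative Int
  // (math.abs(Int.MinValue) is still negative, so handle it separately).
  def nonNegativeHash(s: String): Int = {
    val h = s.hashCode
    if (h == Int.MinValue) 0 else math.abs(h)
  }

  // Choose a (local dir, sub dir) location for a block file name.
  def locate(fileName: String, localDirs: Seq[String], subDirsPerLocalDir: Int): File = {
    require(localDirs.nonEmpty, "at least one local directory is required")
    val hash = nonNegativeHash(fileName)
    val dirId = hash % localDirs.length
    val subDirId = (hash / localDirs.length) % subDirsPerLocalDir
    // Sub-directories are named with a two-digit hex id, e.g. "0a".
    new File(new File(localDirs(dirId), "%02x".format(subDirId)), fileName)
  }
}
```

Because the location depends only on the file name, the same block always resolves to the same file, and different blocks are spread over the configured directories without any lookup table.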

 

1.1 Important methods

# getSize: returns the size of the block file for the given blockId

def getSize(blockId: BlockId): Long = {
  diskManager.getFile(blockId.name).length
}

 

# put: writes a block's data to disk

def put(blockId: BlockId)(writeFunc: FileOutputStream => Unit): Unit = {
  // Fail if a file for this blockId already exists
  if (contains(blockId)) {
    throw new IllegalStateException(s"Block $blockId is already present in the disk store")
  }
  logDebug(s"Attempting to put block $blockId")
  val startTime = System.currentTimeMillis
  // Look up the file that backs this block
  val file = diskManager.getFile(blockId)
  // Open an output stream on that file
  val fileOutputStream = new FileOutputStream(file)
  var threwException: Boolean = true
  try {
    // Write the data to the file identified by blockId
    writeFunc(fileOutputStream)
    threwException = false
  } finally {
    try {
      // Swallow an IOException from close() only if the write already failed
      Closeables.close(fileOutputStream, threwException)
    } finally {
      // Remove the partially written file if the write failed
      if (threwException) {
        remove(blockId)
      }
    }
  }
  val finishTime = System.currentTimeMillis
  logDebug("Block %s stored as %s file on disk in %d ms".format(
    file.getName,
    Utils.bytesToString(file.length()),
    finishTime - startTime))
}
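The pattern in put (write through a callback, always close the stream, delete the partial file on failure) can be sketched outside Spark as follows. `SafeWriteSketch` and `writeOrCleanUp` are hypothetical names, and plain `close()` stands in for Guava's `Closeables.close`.

```scala
import java.io.{File, FileOutputStream, IOException}

object SafeWriteSketch {
  // Runs writeFunc against a new file; if anything fails, the partial file is deleted.
  def writeOrCleanUp(file: File)(writeFunc: FileOutputStream => Unit): Unit = {
    if (file.exists()) {
      throw new IllegalStateException(s"File $file already exists")
    }
    val out = new FileOutputStream(file)
    var threwException = true
    try {
      writeFunc(out)
      threwException = false
    } finally {
      try {
        out.close()
      } catch {
        // Do not let a close() failure mask the original exception.
        case _: IOException if threwException => ()
      } finally {
        if (threwException) file.delete()
      }
    }
  }
}
```

Starting with `threwException = true` and clearing it only after the write succeeds means every exit path other than success is treated as a failure, without having to catch and rethrow.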

 

# putBytes: writes the given bytes to the block's file

def putBytes(blockId: BlockId, bytes: ChunkedByteBuffer): Unit = {
  put(blockId) { fileOutputStream =>
    val channel = fileOutputStream.getChannel
    Utils.tryWithSafeFinally {
      bytes.writeFully(channel)
    } {
      channel.close()
    }
  }
}
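What "write fully" means for a channel can be shown with a small stand-alone sketch. A channel may accept fewer bytes than it is offered, so each chunk is written in a loop until it is drained. `WriteFullySketch` and its methods are assumptions for this example; a `Seq[ByteBuffer]` stands in for Spark's ChunkedByteBuffer.

```scala
import java.io.{File, FileOutputStream}
import java.nio.ByteBuffer
import java.nio.channels.WritableByteChannel

object WriteFullySketch {
  // Write every chunk completely, looping because write() may be partial.
  def writeFully(chunks: Seq[ByteBuffer], channel: WritableByteChannel): Unit = {
    for (chunk <- chunks) {
      val view = chunk.duplicate() // leave the caller's position untouched
      while (view.hasRemaining) {
        channel.write(view)
      }
    }
  }

  // Open a channel on the file, write all chunks, and always close the channel.
  def writeToFile(file: File, chunks: Seq[ByteBuffer]): Unit = {
    val channel = new FileOutputStream(file).getChannel
    try writeFully(chunks, channel) finally channel.close()
  }
}
```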

 

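2 MemoryStore (memory storage)

2.1 Core attributes

BlockInfoManager blockInfoManager: tracks the metadata of individual blocks

SerializerManager serializerManager: the serializer manager

MemoryManager memoryManager: the memory manager

BlockEvictionHandler blockEvictionHandler: the handler used to evict blocks from memory

LinkedHashMap[BlockId, MemoryEntry[_]] entries: the block data held in memory

HashMap[Long, Long] onHeapUnrollMemoryMap: a map from taskAttemptId to the on-heap memory that task is using to unroll blocks

HashMap[Long, Long] offHeapUnrollMemoryMap: a map from taskAttemptId to the off-heap memory that task is using to unroll blocks

Long unrollMemoryThreshold: the initial amount of memory to request before unrolling a block

2.2 Important classes and methods

# maxMemory: the maximum storage memory, on-heap plus off-heap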

private def maxMemory: Long = {
  memoryManager.maxOnHeapStorageMemory + memoryManager.maxOffHeapStorageMemory
}

 

# memoryUsed: the storage memory in use, including memory used by blocks that are still being unrolled

private def memoryUsed: Long = memoryManager.storageMemoryUsed

 

# blocksMemoryUsed: the memory used by fully stored blocks, excluding blocks that are still being unrolled

private def blocksMemoryUsed: Long = memoryManager.synchronized {
  memoryUsed - currentUnrollMemory
}

 

# putBytes: adds serialized data to memory

If enough memory can be acquired, the ByteBuffer is created and placed in the MemoryStore; otherwise the ByteBuffer is never created.

def putBytes[T: ClassTag](blockId: BlockId,
    size: Long, memoryMode: MemoryMode,
    _bytes: () => ChunkedByteBuffer): Boolean = {
  require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")
  // Ask the MemoryManager for storage memory
  if (memoryManager.acquireStorageMemory(blockId, size, memoryMode)) {
    // Enough memory was acquired for this block
    val bytes = _bytes()
    assert(bytes.size == size)
    // Create a SerializedMemoryEntry and put it into memory
    val entry = new SerializedMemoryEntry[T](bytes, memoryMode, implicitly[ClassTag[T]])
    entries.synchronized {
      entries.put(blockId, entry)
    }
    logInfo("Block %s stored as bytes in memory (estimated size %s, free %s)".format(
      blockId, Utils.bytesToString(size), Utils.bytesToString(maxMemory - blocksMemoryUsed)))
    true
  } else {
    false
  }
}
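The reason `_bytes` is passed as a function rather than as a value is that the buffer should be materialized only after memory has been granted. A minimal sketch of that idea, using a toy store with a fixed budget (`LazyPutSketch` and `ToyStore` are hypothetical and do not model eviction):

```scala
import java.nio.ByteBuffer
import scala.collection.mutable

object LazyPutSketch {
  // Toy stand-in for a store backed by a fixed storage budget.
  final class ToyStore(capacity: Long) {
    private var used = 0L
    private val entries = mutable.LinkedHashMap.empty[String, ByteBuffer]

    def contains(id: String): Boolean = entries.contains(id)

    // The thunk is called (and the buffer created) only once memory is granted.
    def putBytes(id: String, size: Long, bytes: () => ByteBuffer): Boolean = {
      require(!contains(id), s"Block $id is already present")
      if (used + size <= capacity) {
        used += size
        val buf = bytes()
        assert(buf.remaining == size)
        entries.put(id, buf)
        true
      } else {
        false
      }
    }
  }
}
```

When the request is rejected, the caller pays nothing for allocating or copying the data, which matters when the block is large.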

 

# putIteratorAsValues: tries to store the values of an iterator in memory as a block

private[storage] def putIteratorAsValues[T](
    blockId: BlockId,
    values: Iterator[T],
    classTag: ClassTag[T]): Either[PartiallyUnrolledIterator[T], Long] = {

  require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")

  // Number of elements unrolled so far
  var elementsUnrolled = 0
  // Whether there is still enough memory to continue unrolling this block
  var keepUnrolling = true
  // Initial per-task memory to request for unrolling blocks (bytes).
  val initialMemoryThreshold = unrollMemoryThreshold
  // How often (in elements) to check whether more memory must be requested
  val memoryCheckPeriod = 16
  // Memory currently reserved by this task for this unrolling operation
  var memoryThreshold = initialMemoryThreshold
  // Growth factor applied to memory requests
  val memoryGrowthFactor = 1.5
  // Unroll memory used by this particular block / putIterator() operation
  var unrollMemoryUsedByThisBlock = 0L
  // Append-only vector that holds the unrolled elements and estimates their size
  var vector = new SizeTrackingVector[T]()(classTag)

  // Reserve unroll memory for this task first, because unrolling a block consumes memory too
  keepUnrolling =
    reserveUnrollMemoryForThisTask(blockId, initialMemoryThreshold, MemoryMode.ON_HEAP)
  // Warn if the initial reservation failed
  if (!keepUnrolling) {
    logWarning(s"Failed to reserve initial memory threshold of " +
      s"${Utils.bytesToString(initialMemoryThreshold)} for computing block $blockId in memory.")
  } else {
    // Otherwise count the initial threshold as unroll memory held for this block
    unrollMemoryUsedByThisBlock += initialMemoryThreshold
  }

  // Unroll the block safely, checking periodically whether the threshold has been exceeded
  while (values.hasNext && keepUnrolling) {
    // Append the next value
    vector += values.next()
    // Only check memory every memoryCheckPeriod elements
    if (elementsUnrolled % memoryCheckPeriod == 0) {
      // Estimate the current size of the unrolled data
      val currentSize = vector.estimateSize()
      // If the estimate has reached the threshold, request more memory:
      // currentSize * memoryGrowthFactor - memoryThreshold
      if (currentSize >= memoryThreshold) {
        val amountToRequest = (currentSize * memoryGrowthFactor - memoryThreshold).toLong
        // Reserve additional unroll memory for this task
        keepUnrolling =
          reserveUnrollMemoryForThisTask(blockId, amountToRequest, MemoryMode.ON_HEAP)
        // If the reservation succeeded
        if (keepUnrolling) {
          // Add it to the unroll memory held for this block
          unrollMemoryUsedByThisBlock += amountToRequest
        }
        // The threshold is raised whether or not the reservation succeeded
        memoryThreshold += amountToRequest
      }
    }
    // Count the element that was just unrolled
    elementsUnrolled += 1
  }
  // The whole iterator was unrolled without running out of unroll memory
  if (keepUnrolling) {
    // Convert the vector to an array
    val arrayValues = vector.toArray
    vector = null
    // Wrap the values in a DeserializedMemoryEntry
    val entry =
      new DeserializedMemoryEntry[T](arrayValues, SizeEstimator.estimate(arrayValues), classTag)
    val size = entry.size
    def transferUnrollToStorage(amount: Long): Unit = {
      // Synchronize so that transfer is atomic
      memoryManager.synchronized {
        releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, amount)
        val success = memoryManager.acquireStorageMemory(blockId, amount, MemoryMode.ON_HEAP)
        assert(success, "transferring unroll memory to storage memory failed")
      }
    }
    // Acquire storage memory if necessary to store this block in memory.
    val enoughStorageMemory = {
      // The unroll memory held for this block does not exceed the block's size
      if (unrollMemoryUsedByThisBlock <= size) {
        // Acquire the difference as extra storage memory
        val acquiredExtra =
          memoryManager.acquireStorageMemory(
            blockId, size - unrollMemoryUsedByThisBlock, MemoryMode.ON_HEAP)
        if (acquiredExtra) {
          transferUnrollToStorage(unrollMemoryUsedByThisBlock)
        }
        acquiredExtra
      } else { // unrollMemoryUsedByThisBlock > size
        val excessUnrollMemory = unrollMemoryUsedByThisBlock - size
        releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, excessUnrollMemory)
        transferUnrollToStorage(size)
        true
      }
    }
    // If there is enough storage memory, put the entry into memory
    if (enoughStorageMemory) {
      entries.synchronized {
        entries.put(blockId, entry)
      }
      logInfo("Block %s stored as values in memory (estimated size %s, free %s)".format(
        blockId, Utils.bytesToString(size), Utils.bytesToString(maxMemory - blocksMemoryUsed)))
      Right(size)
    } else {
      assert(currentUnrollMemoryForThisTask >= unrollMemoryUsedByThisBlock,
        "released too much unroll memory")
      Left(new PartiallyUnrolledIterator(
        this,
        MemoryMode.ON_HEAP,
        unrollMemoryUsedByThisBlock,
        unrolled = arrayValues.toIterator,
        rest = Iterator.empty))
    }
  } else {
    // There was not enough unroll memory to unroll the whole block
    logUnrollFailureMessage(blockId, vector.estimateSize())
    Left(new PartiallyUnrolledIterator(
      this,
      MemoryMode.ON_HEAP,
      unrollMemoryUsedByThisBlock,
      unrolled = vector.iterator,
      rest = values))
  }
}
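The growth policy in the unrolling loop is easier to follow with the storage details stripped away. The sketch below keeps only the arithmetic: check every 16 elements, and when the current size has reached the threshold, request `currentSize * 1.5 - threshold` more. `UnrollSketch`, `Pool`, and the representation of elements as byte sizes are assumptions made for the example; real size estimation via SizeTrackingVector is replaced by exact sums.

```scala
object UnrollSketch {
  // Toy pool standing in for the unroll-memory side of a memory manager.
  final class Pool(var free: Long) {
    def reserve(n: Long): Boolean = if (n <= free) { free -= n; true } else false
  }

  // `sizes` yields each element's size in bytes. Returns Right(total bytes reserved)
  // if everything was unrolled, or Left(elements unrolled) if memory ran out.
  def unroll(sizes: Iterator[Long], pool: Pool, initialThreshold: Long): Either[Int, Long] = {
    val checkPeriod = 16
    val growthFactor = 1.5
    var threshold = initialThreshold
    var reserved = 0L
    var currentSize = 0L
    var unrolled = 0
    var keepUnrolling = pool.reserve(initialThreshold)
    if (keepUnrolling) reserved += initialThreshold

    while (sizes.hasNext && keepUnrolling) {
      currentSize += sizes.next()
      // Only check memory every checkPeriod elements.
      if (unrolled % checkPeriod == 0 && currentSize >= threshold) {
        val request = (currentSize * growthFactor - threshold).toLong
        keepUnrolling = pool.reserve(request)
        if (keepUnrolling) reserved += request
        threshold += request
      }
      unrolled += 1
    }
    if (keepUnrolling) Right(reserved) else Left(unrolled)
  }
}
```

For example, with 32 elements of 100 bytes and an initial threshold of 1000, the first check that trips happens on the 17th element (current size 1700), which requests 1700 * 1.5 - 1000 = 1550 more bytes and raises the threshold to 2550. Requesting ahead by the growth factor keeps the number of reservation calls low.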

 

# remove: removes the data for a given blockId from memory

def remove(blockId: BlockId): Boolean = memoryManager.synchronized {
  // Remove the entry from the in-memory map
  val entry = entries.synchronized {
    entries.remove(blockId)
  }
  if (entry != null) {
    entry match {
      case SerializedMemoryEntry(buffer, _, _) => buffer.dispose()
      case _ =>
    }
    // Release the storage memory held by the entry
    memoryManager.releaseStorageMemory(entry.size, entry.memoryMode)
    logDebug(s"Block $blockId of size ${entry.size} dropped " +
      s"from memory (free ${maxMemory - blocksMemoryUsed})")
    true
  } else {
    false
  }
}

 

# evictBlocksToFreeSpace: tries to evict blocks in order to free memory

private[spark] def evictBlocksToFreeSpace(
    blockId: Option[BlockId], space: Long,
    memoryMode: MemoryMode): Long = {
  assert(space > 0)

  memoryManager.synchronized {
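    var freedMemory = 0L // memory that would be freed by the blocks selected so far
    val rddToAdd = blockId.flatMap(getRddId)
    // Blocks selected for eviction
    val selectedBlocks = new ArrayBuffer[BlockId]
    // A block is evictable if it uses the requested memory mode and does not belong
    // to the same RDD as the block being added
    def blockIsEvictable(blockId: BlockId, entry: MemoryEntry[_]): Boolean = {
      entry.memoryMode == memoryMode && (rddToAdd.isEmpty || rddToAdd != getRddId(blockId))
    }

    entries.synchronized {
      // Iterate over the entries
      val iterator = entries.entrySet().iterator()
      // Keep selecting while the freed memory is still less than the requested space
      while (freedMemory < space && iterator.hasNext) {
        val pair = iterator.next()
        val blockId = pair.getKey
        val entry = pair.getValue
        if (blockIsEvictable(blockId, entry)) {
          // Select the block only if its write lock can be taken without blocking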
          if (blockInfoManager.lockForWriting(blockId, blocking = false).isDefined) {
            selectedBlocks += blockId
            freedMemory += pair.getValue.size
          }
        }
      }
    }
    // Drop a single block from memory
    def dropBlock[T](blockId: BlockId, entry: MemoryEntry[T]): Unit = {
      val data = entry match {
        case DeserializedMemoryEntry(values, _, _) => Left(values)
        case SerializedMemoryEntry(buffer, _, _) => Right(buffer)
      }
      // Let the eviction handler drop the block from memory
      val newEffectiveStorageLevel =
        blockEvictionHandler.dropFromMemory(blockId, () => data)(entry.classTag)
      if (newEffectiveStorageLevel.isValid) {
        // The block is still present in at least one store, so release the lock
        // but don't delete the block info
        blockInfoManager.unlock(blockId)
      } else {
        // The block isn't present in any store, so delete the block info so that the
        // block can be stored again
        blockInfoManager.removeBlock(blockId)
      }
    }
    // Evict only if the selected blocks free at least the requested space
    if (freedMemory >= space) {
      logInfo(s"${selectedBlocks.size} blocks selected for dropping " +
        s"(${Utils.bytesToString(freedMemory)} bytes)")
      // Drop each selected block from memory via dropBlock
      for (blockId <- selectedBlocks) {
        val entry = entries.synchronized { entries.get(blockId) }
        // This should never be null as only one task should be dropping
        // blocks and removing entries. However the check is still here for
        // future safety.
        if (entry != null) {
          dropBlock(blockId, entry)
        }
      }
      logInfo(s"After dropping ${selectedBlocks.size} blocks, " +
        s"free memory is ${Utils.bytesToString(maxMemory - blocksMemoryUsed)}")
      freedMemory
    } else {
      blockId.foreach { id =>
        logInfo(s"Will not store $id")
      }
      selectedBlocks.foreach { id =>
        blockInfoManager.unlock(id)
      }
      0L
    }
  }
}
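The selection phase of eviction is all or nothing: blocks are picked in map order until the requested space is covered, and if the target cannot be met nothing is evicted. A minimal sketch of just that phase follows; `EvictionSketch` and `selectVictims` are hypothetical names, block ids are plain strings, and locking and the actual drop are left out.

```scala
import scala.collection.mutable

object EvictionSketch {
  // Walks blocks in map order and selects victims until `space` bytes are covered.
  // Returns the selected ids, or Nil if the target cannot be met.
  def selectVictims(
      entries: mutable.LinkedHashMap[String, Long], // block id -> size in bytes
      space: Long,
      evictable: String => Boolean): List[String] = {
    require(space > 0)
    var freed = 0L
    val selected = mutable.ListBuffer.empty[String]
    val it = entries.iterator
    while (freed < space && it.hasNext) {
      val (id, size) = it.next()
      if (evictable(id)) {
        selected += id
        freed += size
      }
    }
    if (freed >= space) selected.toList else Nil
  }
}
```

Excluding blocks of the same RDD as the incoming block, as the `evictable` predicate can do here, avoids evicting one partition of an RDD only to make room for another partition of that same RDD.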

 

# reserveUnrollMemoryForThisTask: reserves memory for the current task to unroll the given block, because unrolling a block consumes memory too

def reserveUnrollMemoryForThisTask(
    blockId: BlockId,
    memory: Long,
    memoryMode: MemoryMode): Boolean = {
  memoryManager.synchronized {
    val success = memoryManager.acquireUnrollMemory(blockId, memory, memoryMode)
    if (success) {
      val taskAttemptId = currentTaskAttemptId()
      val unrollMemoryMap = memoryMode match {
        case MemoryMode.ON_HEAP => onHeapUnrollMemoryMap
        case MemoryMode.OFF_HEAP => offHeapUnrollMemoryMap
      }
      unrollMemoryMap(taskAttemptId) = unrollMemoryMap.getOrElse(taskAttemptId, 0L) + memory
    }
    success
  }
}

 

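# releaseUnrollMemoryForThisTask: releases unroll memory held by the current task (all of it when no amount is given)

def releaseUnrollMemoryForThisTask(memoryMode: MemoryMode, memory: Long = Long.MaxValue): Unit = {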
  val taskAttemptId = currentTaskAttemptId()
  memoryManager.synchronized {
    val unrollMemoryMap = memoryMode match {
      case MemoryMode.ON_HEAP => onHeapUnrollMemoryMap
      case MemoryMode.OFF_HEAP => offHeapUnrollMemoryMap
    }
    if (unrollMemoryMap.contains(taskAttemptId)) {
      val memoryToRelease = math.min(memory, unrollMemoryMap(taskAttemptId))
      if (memoryToRelease > 0) {
        unrollMemoryMap(taskAttemptId) -= memoryToRelease
        memoryManager.releaseUnrollMemory(memoryToRelease, memoryMode)
      }
      if (unrollMemoryMap(taskAttemptId) == 0) {
        unrollMemoryMap.remove(taskAttemptId)
      }
    }
  }
}
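The per-task bookkeeping done by the reserve and release methods can be sketched with a single map and a toy pool. `UnrollBookkeepingSketch` and `Tracker` are hypothetical, task ids are passed in explicitly instead of being read from the task context, and only one memory mode is modelled.

```scala
import scala.collection.mutable

object UnrollBookkeepingSketch {
  final class Tracker(private var poolFree: Long) {
    // taskId -> unroll memory currently held by that task
    private val byTask = mutable.HashMap.empty[Long, Long]

    def reserve(taskId: Long, memory: Long): Boolean = synchronized {
      val ok = memory <= poolFree
      if (ok) {
        poolFree -= memory
        byTask(taskId) = byTask.getOrElse(taskId, 0L) + memory
      }
      ok
    }

    // Releases at most what the task holds; the default releases everything.
    def release(taskId: Long, memory: Long = Long.MaxValue): Unit = synchronized {
      byTask.get(taskId).foreach { held =>
        val toRelease = math.min(memory, held)
        poolFree += toRelease
        // Drop the map entry once the task holds nothing.
        if (held == toRelease) byTask.remove(taskId) else byTask(taskId) = held - toRelease
      }
    }

    def heldBy(taskId: Long): Long = synchronized(byTask.getOrElse(taskId, 0L))
    def free: Long = synchronized(poolFree)
  }
}
```

Capping the release at the amount the task actually holds is what makes the `Long.MaxValue` default safe as a "release everything" request.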

 

 

 

 
