Spark Source Code Analysis: MemoryManager
MemoryManager arbitrates how memory is shared between storage and execution. Its responsibilities are to:
# track how much storage memory and execution memory is in use
# acquire storage, execution, and unroll memory
# release storage and execution memory
execution memory: memory used by computation in shuffles, joins, sorts, and aggregations
storage memory: memory used to cache data, e.g. via persist or cache
unroll memory: memory consumed while unrolling (materializing) a block before it is stored; the unrolling process itself costs memory, much as opening a file does
MemoryManager decides, based on the spark.memory.useLegacyMode configuration, whether to use the legacy strategy, StaticMemoryManager. By default legacy mode is off and UnifiedMemoryManager is used instead.
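The choice happens when SparkEnv is created. A condensed sketch of that selection logic, based on the Spark 2.x source (an illustration, not the verbatim code):
// Sketch: how SparkEnv chooses the MemoryManager (Spark 2.x era)
val useLegacyMemoryManager = conf.getBoolean("spark.memory.useLegacyMode", false)
val memoryManager: MemoryManager =
  if (useLegacyMemoryManager) {
    new StaticMemoryManager(conf, numUsableCores) // legacy, fixed boundaries
  } else {
    UnifiedMemoryManager(conf, numUsableCores) // default since Spark 1.6
  }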
1 MemoryManager
1.1 Core Attributes
Int numCores: the number of cores
Long onHeapStorageMemory: size of on-heap storage memory
Long onHeapExecutionMemory: size of on-heap execution memory
StorageMemoryPool onHeapStorageMemoryPool: the on-heap storage memory pool
StorageMemoryPool offHeapStorageMemoryPool: the off-heap storage memory pool
ExecutionMemoryPool onHeapExecutionMemoryPool: the on-heap execution memory pool
ExecutionMemoryPool offHeapExecutionMemoryPool: the off-heap execution memory pool
Long maxOffHeapMemory: maximum off-heap memory, configured via spark.memory.offHeap.size; it only takes effect when spark.memory.offHeap.enabled is set
Long maxOnHeapStorageMemory: maximum on-heap storage memory
Long maxOffHeapStorageMemory: maximum off-heap storage memory
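The constructor seeds these pools. A condensed sketch, assuming the Spark 2.x layout, of how the pools receive their initial sizes (note the off-heap split also uses spark.memory.storageFraction):
// Sketch: initial pool sizing in MemoryManager's constructor (Spark 2.x)
onHeapStorageMemoryPool.incrementPoolSize(onHeapStorageMemory)
onHeapExecutionMemoryPool.incrementPoolSize(onHeapExecutionMemory)
// Off-heap memory is split between storage and execution by
// spark.memory.storageFraction (default 0.5)
protected[this] val offHeapStorageMemory =
  (maxOffHeapMemory * conf.getDouble("spark.memory.storageFraction", 0.5)).toLong
offHeapExecutionMemoryPool.incrementPoolSize(maxOffHeapMemory - offHeapStorageMemory)
offHeapStorageMemoryPool.incrementPoolSize(offHeapStorageMemory)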
1.2 Key Methods
# releaseExecutionMemory: release numBytes of execution memory
def releaseExecutionMemory(
    numBytes: Long,
    taskAttemptId: Long,
    memoryMode: MemoryMode): Unit = synchronized {
  memoryMode match {
    case MemoryMode.ON_HEAP =>
      onHeapExecutionMemoryPool.releaseMemory(numBytes, taskAttemptId)
    case MemoryMode.OFF_HEAP =>
      offHeapExecutionMemoryPool.releaseMemory(numBytes, taskAttemptId)
  }
}
# releaseAllExecutionMemoryForTask: release all execution memory held by the given task
private[memory] def releaseAllExecutionMemoryForTask(taskAttemptId: Long): Long = synchronized {
  onHeapExecutionMemoryPool.releaseAllMemoryForTask(taskAttemptId) +
    offHeapExecutionMemoryPool.releaseAllMemoryForTask(taskAttemptId)
}
# releaseStorageMemory: release numBytes of storage memory
def releaseStorageMemory(numBytes: Long, memoryMode: MemoryMode): Unit = synchronized {
  memoryMode match {
    case MemoryMode.ON_HEAP => onHeapStorageMemoryPool.releaseMemory(numBytes)
    case MemoryMode.OFF_HEAP => offHeapStorageMemoryPool.releaseMemory(numBytes)
  }
}
# releaseAllStorageMemory: release all storage memory
final def releaseAllStorageMemory(): Unit = synchronized {
  onHeapStorageMemoryPool.releaseAllMemory()
  offHeapStorageMemoryPool.releaseAllMemory()
}
2 StaticMemoryManager
Under StaticMemoryManager, executor memory has hard boundaries and consists of three regions: execution, storage, and system. Once statically partitioned, the regions never change size.
# execution: sized by spark.shuffle.memoryFraction, default 0.2.
It buffers data so that shuffle, join, sort, and aggregation operations do not write straight to disk, cutting down on disk reads and writes.
# storage: sized by spark.storage.memoryFraction, default 0.6.
It holds data stored by explicit user calls such as persist, cache, and broadcast.
# system: the space the program itself needs to run, holding Spark's internal metadata and user data structures, and providing headroom against OOM from unusually large records.
This static partitioning can waste resources at times: if a job hardly uses cache or persist, the storage region simply sits idle.
Since most attributes are inherited from the parent class MemoryManager, they are not repeated here.
# maxUnrollMemory
The maximum memory available for unrolling blocks, by default 20% of the maximum storage memory:
private val maxUnrollMemory: Long = {
  (maxOnHeapStorageMemory * conf.getDouble("spark.storage.unrollFraction", 0.2)).toLong
}
# acquireStorageMemory: acquire storage memory. Note that StaticMemoryManager does not support off-heap storage memory.
override def acquireStorageMemory(
    blockId: BlockId,
    numBytes: Long,
    memoryMode: MemoryMode): Boolean = synchronized {
  require(memoryMode != MemoryMode.OFF_HEAP,
    "StaticMemoryManager does not support off-heap storage memory")
  // A request larger than the maximum storage memory can never succeed
  if (numBytes > maxOnHeapStorageMemory) {
    // Fail fast if the block simply won't fit
    logInfo(s"Will not store $blockId as the required space ($numBytes bytes) exceeds our " +
      s"memory limit ($maxOnHeapStorageMemory bytes)")
    false
  } else {
    // Delegate to the StorageMemoryPool to allocate numBytes bytes
    onHeapStorageMemoryPool.acquireMemory(blockId, numBytes)
  }
}
# acquireUnrollMemory: acquire memory for unrolling a block
override def acquireUnrollMemory(
    blockId: BlockId,
    numBytes: Long,
    memoryMode: MemoryMode): Boolean = synchronized {
  require(memoryMode != MemoryMode.OFF_HEAP,
    "StaticMemoryManager does not support off-heap unroll memory")
  // Memory currently used for unrolling blocks in the storage region
  val currentUnrollMemory = onHeapStorageMemoryPool.memoryStore.currentUnrollMemory
  // Free memory currently available in the storage region
  val freeMemory = onHeapStorageMemoryPool.memoryFree
  // How much memory unrolling may still claim beyond what is already free;
  // if this is <= 0, no cached blocks need to be evicted
  val maxNumBytesToFree = math.max(0, maxUnrollMemory - currentUnrollMemory - freeMemory)
  // Evict at most that much, and only as much as the request actually lacks
  val numBytesToFree = math.max(0, math.min(maxNumBytesToFree, numBytes - freeMemory))
  onHeapStorageMemoryPool.acquireMemory(blockId, numBytes, numBytesToFree)
}
# acquireExecutionMemory: acquire execution memory
override def acquireExecutionMemory(
    numBytes: Long,
    taskAttemptId: Long,
    memoryMode: MemoryMode): Long = synchronized {
  memoryMode match {
    case MemoryMode.ON_HEAP => onHeapExecutionMemoryPool.acquireMemory(numBytes, taskAttemptId)
    case MemoryMode.OFF_HEAP => offHeapExecutionMemoryPool.acquireMemory(numBytes, taskAttemptId)
  }
}
# getMaxStorageMemory: return the effective maximum storage memory
private def getMaxStorageMemory(conf: SparkConf): Long = {
  // Maximum system memory
  val systemMaxMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
  // Fraction of memory granted to storage
  val memoryFraction = conf.getDouble("spark.storage.memoryFraction", 0.6)
  // Safety fraction applied on top
  val safetyFraction = conf.getDouble("spark.storage.safetyFraction", 0.9)
  (systemMaxMemory * memoryFraction * safetyFraction).toLong
}
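For instance, with a 4 GB executor heap and the defaults above, storage gets roughly:
// Illustration with made-up sizes:
val systemMaxMemory = 4L * 1024 * 1024 * 1024 // 4 GB heap
val maxStorage = (systemMaxMemory * 0.6 * 0.9).toLong // ≈ 2.16 GB for storage
// maxUnrollMemory would then be 20% of that, ≈ 442 MB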
# getMaxExecutionMemory: return the maximum execution memory
private def getMaxExecutionMemory(conf: SparkConf): Long = {
  // Maximum system memory
  val systemMaxMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
  // If system memory is below the minimum, throw
  if (systemMaxMemory < MIN_MEMORY_BYTES) {
    throw new IllegalArgumentException(s"System memory $systemMaxMemory must " +
      s"be at least $MIN_MEMORY_BYTES. Please increase heap size using the --driver-memory " +
      s"option or spark.driver.memory in Spark configuration.")
  }
  // If executor memory is explicitly configured
  if (conf.contains("spark.executor.memory")) {
    // Read the configured executor memory
    val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
    // If executor memory is below the minimum, throw
    if (executorMemory < MIN_MEMORY_BYTES) {
      throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
        s"$MIN_MEMORY_BYTES. Please increase executor memory using the " +
        s"--executor-memory option or spark.executor.memory in Spark configuration.")
    }
  }
  // Fraction of memory granted to shuffle
  val memoryFraction = conf.getDouble("spark.shuffle.memoryFraction", 0.2)
  // Shuffle safety fraction
  val safetyFraction = conf.getDouble("spark.shuffle.safetyFraction", 0.8)
  (systemMaxMemory * memoryFraction * safetyFraction).toLong
}
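With the same 4 GB heap, execution receives a much smaller slice under the defaults:
// Illustration with made-up sizes:
val systemMaxMemory = 4L * 1024 * 1024 * 1024 // 4 GB heap
val maxExecution = (systemMaxMemory * 0.2 * 0.8).toLong // ≈ 0.64 GB for execution
// The remainder (roughly 1.2 GB here) is left as the system region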
3 UnifiedMemoryManager
Because StaticMemoryManager can waste resources, UnifiedMemoryManager was introduced. It softens the boundary between the execution and storage regions, letting each borrow memory from the other.
Their combined budget is set by spark.memory.fraction (default 0.6): usable memory multiplied by that fraction. Within that shared region, the execution/storage split is set by spark.memory.storageFraction.
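As a back-of-the-envelope illustration of the default layout (assuming spark.memory.storageFraction at its 0.5 default, as in Spark 2.x):
// Illustration: default unified layout for a 4 GB heap
val systemMemory = 4L * 1024 * 1024 * 1024 // 4 GB
val reservedMemory = 300L * 1024 * 1024 // reserved for the system
val usableMemory = systemMemory - reservedMemory // ~3.7 GB
val maxMemory = (usableMemory * 0.6).toLong // spark.memory.fraction
val storageRegionSize = (maxMemory * 0.5).toLong // spark.memory.storageFraction
// execution region = maxMemory - storageRegionSize; either side may borrow from the other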
# maxOnHeapStorageMemory
Maximum on-heap storage memory = total heap budget - execution memory currently in use
override def maxOnHeapStorageMemory: Long = synchronized {
  maxHeapMemory - onHeapExecutionMemoryPool.memoryUsed
}
# maxOffHeapStorageMemory
Maximum off-heap storage memory = maximum off-heap memory - off-heap execution memory currently in use
override def maxOffHeapStorageMemory: Long = synchronized {
  maxOffHeapMemory - offHeapExecutionMemoryPool.memoryUsed
}
# acquireExecutionMemory: acquire execution memory
override private[memory] def acquireExecutionMemory(
    numBytes: Long,
    taskAttemptId: Long,
    memoryMode: MemoryMode): Long = synchronized {
  assertInvariants()
  assert(numBytes >= 0)
  // Pick the pools, storage region size, and max memory for the given memory mode
  val (executionPool, storagePool, storageRegionSize, maxMemory) = memoryMode match {
    case MemoryMode.ON_HEAP => (
      onHeapExecutionMemoryPool,
      onHeapStorageMemoryPool,
      onHeapStorageRegionSize,
      maxHeapMemory)
    case MemoryMode.OFF_HEAP => (
      offHeapExecutionMemoryPool,
      offHeapStorageMemoryPool,
      offHeapStorageMemory,
      maxOffHeapMemory)
  }
  // Grow the execution pool by evicting cached blocks, which shrinks the storage pool.
  // When acquiring memory for a task, execution may retry several times;
  // each attempt can evict some storage.
  def maybeGrowExecutionPool(extraMemoryNeeded: Long): Unit = {
    // Only act if extra memory is actually needed
    if (extraMemoryNeeded > 0) {
      // Memory execution can reclaim from storage: the larger of storage's free
      // memory and the amount the storage pool has grown beyond its region
      // (i.e., memory storage previously borrowed from execution)
      val memoryReclaimableFromStorage = math.max(
        storagePool.memoryFree, // storage's free memory
        storagePool.poolSize - storageRegionSize) // memory borrowed beyond the storage region
      // If anything can be reclaimed
      if (memoryReclaimableFromStorage > 0) {
        // Shrink the storage pool, evicting blocks if needed, and grow execution by that amount
        val spaceToReclaim = storagePool.freeSpaceToShrinkPool(
          math.min(extraMemoryNeeded, memoryReclaimableFromStorage))
        storagePool.decrementPoolSize(spaceToReclaim)
        executionPool.incrementPoolSize(spaceToReclaim)
      }
    }
  }
  // The execution pool can grow to maxMemory minus whatever storage is actually
  // using, capped at the storage region size
  def computeMaxExecutionPoolSize(): Long = {
    maxMemory - math.min(storagePool.memoryUsed, storageRegionSize)
  }
  executionPool.acquireMemory(
    numBytes, taskAttemptId, maybeGrowExecutionPool, computeMaxExecutionPoolSize)
}
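A short scenario makes maybeGrowExecutionPool concrete (all numbers hypothetical):
// Suppose on-heap: storageRegionSize = 500 bytes. Storage earlier borrowed
// from execution, so its pool grew to 600 bytes, of which 100 are free;
// execution now needs 150 extra bytes.
val memoryFree = 100L
val poolSize = 600L
val storageRegionSize = 500L
val extraMemoryNeeded = 150L
val memoryReclaimableFromStorage =
  math.max(memoryFree, poolSize - storageRegionSize) // max(100, 100) = 100
val spaceToReclaim = math.min(extraMemoryNeeded, memoryReclaimableFromStorage) // 100
// Storage shrinks back to 500 bytes and execution grows by 100; the missing
// 50 bytes may still arrive later as other tasks release memory.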
# acquireStorageMemory: acquire storage memory
override def acquireStorageMemory(
    blockId: BlockId,
    numBytes: Long,
    memoryMode: MemoryMode): Boolean = synchronized {
  assertInvariants()
  assert(numBytes >= 0)
  // Pick the pools and max memory for the given memory mode
  val (executionPool, storagePool, maxMemory) = memoryMode match {
    case MemoryMode.ON_HEAP => (
      onHeapExecutionMemoryPool,
      onHeapStorageMemoryPool,
      maxOnHeapStorageMemory)
    case MemoryMode.OFF_HEAP => (
      offHeapExecutionMemoryPool,
      offHeapStorageMemoryPool,
      maxOffHeapMemory)
  }
  // If the request exceeds the maximum storage memory, fail immediately
  if (numBytes > maxMemory) {
    logInfo(s"Will not store $blockId as the required space ($numBytes bytes) exceeds our " +
      s"memory limit ($maxMemory bytes)")
    return false
  }
  // If the request exceeds storage's free memory, borrow from execution:
  // shrink the execution pool and grow the storage pool by the borrowed amount
  if (numBytes > storagePool.memoryFree) {
    val memoryBorrowedFromExecution = Math.min(executionPool.memoryFree, numBytes)
    executionPool.decrementPoolSize(memoryBorrowedFromExecution)
    storagePool.incrementPoolSize(memoryBorrowedFromExecution)
  }
  storagePool.acquireMemory(blockId, numBytes)
}
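The borrowing direction here is the mirror image. A hypothetical walk-through:
// Suppose a block needs 100 bytes, storage has 20 bytes free,
// and execution has 300 bytes free.
val numBytes = 100L
val storageFree = 20L
val executionFree = 300L
// Storage borrows min(300, 100) = 100 bytes, i.e. up to the full request,
// not just the 80-byte deficit:
val memoryBorrowedFromExecution = Math.min(executionFree, numBytes) // 100
// The execution pool shrinks by 100 and the storage pool grows by 100;
// if that were still not enough, acquireMemory could evict existing blocks.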
# getMaxMemory: return the total memory shared by execution and storage
private def getMaxMemory(conf: SparkConf): Long = {
  // System memory
  val systemMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
  // Reserved memory
  val reservedMemory = conf.getLong("spark.testing.reservedMemory",
    if (conf.contains("spark.testing")) 0 else RESERVED_SYSTEM_MEMORY_BYTES)
  // Minimum required system memory
  val minSystemMemory = (reservedMemory * 1.5).ceil.toLong
  // If system memory is below the minimum, throw
  if (systemMemory < minSystemMemory) {
    throw new IllegalArgumentException(s"System memory $systemMemory must " +
      s"be at least $minSystemMemory. Please increase heap size using the --driver-memory " +
      s"option or spark.driver.memory in Spark configuration.")
  }
  // If executor memory is explicitly configured
  if (conf.contains("spark.executor.memory")) {
    // Read the configured executor memory
    val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
    // If executor memory is below the minimum, throw
    if (executorMemory < minSystemMemory) {
      throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
        s"$minSystemMemory. Please increase executor memory using the " +
        s"--executor-memory option or spark.executor.memory in Spark configuration.")
    }
  }
  // Usable memory = system memory - reserved memory
  val usableMemory = systemMemory - reservedMemory
  // Fraction of usable memory shared by execution and storage, default 60%
  val memoryFraction = conf.getDouble("spark.memory.fraction", 0.6)
  // Return usable memory * fraction
  (usableMemory * memoryFraction).toLong
}
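Plugging in numbers for a 4 GB executor heap (illustrative):
// Illustration for a 4 GB heap:
val systemMemory = 4L * 1024 * 1024 * 1024 // 4096 MB
val reservedMemory = 300L * 1024 * 1024 // RESERVED_SYSTEM_MEMORY_BYTES = 300 MB
val minSystemMemory = (reservedMemory * 1.5).ceil.toLong // 450 MB floor
val usableMemory = systemMemory - reservedMemory // 3796 MB
val maxMemory = (usableMemory * 0.6).toLong // ≈ 2278 MB shared by execution and storage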