Python's Memory Allocation Strategy
arena
An arena is a collection of pools.
arena size
A pool is 4KB by default.
An arena is 256KB by default, so it can hold 256/4 = 64 pools.
The definition in obmalloc.c:
```c
#define ARENA_SIZE              (256 << 10)     /* 256KB */
```
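As a quick check of the arithmetic above, here is a minimal sketch. ARENA_SIZE is the definition quoted from obmalloc.c; POOL_SIZE is written out as a literal 4KB to match the text, and pools_per_arena is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE (256 << 10)  /* 256KB, as quoted from obmalloc.c */
#define POOL_SIZE  (4 << 10)    /* 4KB; written as a literal here (assumption) */

/* Hypothetical helper: how many whole pools fit in one arena. */
static size_t pools_per_arena(void)
{
    /* An arena must be an exact multiple of the pool size. */
    assert(ARENA_SIZE % POOL_SIZE == 0);
    return (size_t)ARENA_SIZE / POOL_SIZE;
}
```

With the default sizes this gives the 64 pools per arena mentioned above.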
arena structure
A complete arena = an arena_object + a set of pools
```c
typedef uchar block;

/* Record keeping for arenas. */
struct arena_object {
    /* The address of the arena, as returned by malloc.  Note that 0
     * will never be returned by a successful malloc, and is used
     * here to mark an arena_object that doesn't correspond to an
     * allocated arena.
     */
    uptr address;

    /* Pool-aligned pointer to the next pool to be carved off. */
    block* pool_address;

    /* The number of available pools in the arena:  free pools + never-
     * allocated pools.
     */
    uint nfreepools;

    /* The total number of pools in the arena, whether or not available. */
    uint ntotalpools;

    /* Singly-linked list of available pools. */
    // singly-linked list: the set of available pools
    struct pool_header* freepools;

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    // links between arenas
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};
```
What arena_object does
1. Links to other arenas to form a doubly-linked list
2. Tracks the available pools in the arena as a singly-linked list
3. Holds other bookkeeping information
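pool_header vs. arena_object
A pool_header and the blocks it manages occupy one contiguous piece of memory => when a pool_header is allocated, the memory for its blocks is allocated along with it.
An arena_object and the memory it manages are separate => when an arena_object is allocated, the memory for its pools has not been allocated yet; the two are associated at some later point.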
The two states of an arena
An arena is in one of two states: unused (not yet associated with pool memory) or usable (associated). As the comment in arena_object notes, an associated arena whose pools are all in use sits on neither list.
Two global lists keep track of them:
```c
/* The head of the singly-linked, NULL-terminated list of available
 * arena_objects.
 */
// singly-linked list
static struct arena_object* unused_arena_objects = NULL;

/* The head of the doubly-linked, NULL-terminated at each end, list of
 * arena_objects associated with arenas that have pools available.
 */
// doubly-linked list
static struct arena_object* usable_arenas = NULL;
```
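To make the "unused" state concrete, here is a minimal sketch of the singly-linked unused list. The struct is trimmed to the fields needed, uintptr_t stands in for uptr, and arena_is_unused, push_unused and pop_unused are hypothetical helper names; obmalloc.c performs these steps inline rather than through such functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down arena_object: only the fields this sketch needs. */
struct arena_object {
    uintptr_t address;               /* 0 means "not associated with an arena" */
    struct arena_object *nextarena;
    struct arena_object *prevarena;  /* unused while on the unused list */
};

static struct arena_object *unused_arena_objects = NULL;

/* Hypothetical helper: an arena_object is "unused" while address == 0. */
static int arena_is_unused(const struct arena_object *ao)
{
    return ao->address == 0;
}

/* Hypothetical helper: push an unassociated arena_object onto the list head. */
static void push_unused(struct arena_object *ao)
{
    assert(arena_is_unused(ao));
    ao->nextarena = unused_arena_objects;
    unused_arena_objects = ao;
}

/* Hypothetical helper: pop the list head, or return NULL if the list is empty. */
static struct arena_object *pop_unused(void)
{
    struct arena_object *ao = unused_arena_objects;
    if (ao != NULL)
        unused_arena_objects = ao->nextarena;
    return ao;
}
```

Because only the head is ever touched, a singly-linked list is enough here; the doubly-linked usable_arenas list needs prevarena because arenas are removed from its middle.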
arena initialization
First, the definitions of the parameters involved in initialization.
From obmalloc.c:
```c
/* Array of objects used to track chunks of memory (arenas). */
// array of arena_objects
static struct arena_object* arenas = NULL;

/* Number of slots currently allocated in the `arenas` vector. */
// number of arena_objects currently held in arenas; 0 at startup
static uint maxarenas = 0;

/* How many arena_objects do we initially allocate?
 * 16 = can allocate 16 arenas = 16 * ARENA_SIZE = 4MB before growing the
 * `arenas` vector.
 */
// number of arena_objects requested on first initialization
#define INITIAL_ARENA_OBJECTS 16

/* Number of arenas allocated that haven't been free()'d. */
static size_t narenas_currently_allocated = 0;

/* The head of the singly-linked, NULL-terminated list of available
 * arena_objects.
 */
// singly-linked list of arenas in the unused state
static struct arena_object* unused_arena_objects = NULL;

/* The head of the doubly-linked, NULL-terminated at each end, list of
 * arena_objects associated with arenas that have pools available.
 */
// doubly-linked list of arenas in the usable state
static struct arena_object* usable_arenas = NULL;
```
Next, the arena initialization code in obmalloc.c:
```c
/* Allocate a new arena.  If we run out of memory, return NULL.  Else
 * allocate a new arena, and return the address of an arena_object
 * describing the new arena.  It's expected that the caller will set
 * `usable_arenas` to the return value.
 */
static struct arena_object*
new_arena(void)
{
    struct arena_object* arenaobj;
    uint excess;        /* number of bytes above pool alignment */
    void *address;
    int err;

    // check whether the list of unused arena_objects needs to grow
    if (unused_arena_objects == NULL) {
        uint i;
        uint numarenas;
        size_t nbytes;

        /* Double the number of arena objects on each allocation.
         * Note that it's possible for `numarenas` to overflow.
         */
        // how many to request: 16 the first time, doubling after that
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;
        if (numarenas <= maxarenas)
            return NULL;                /* overflow */

        ....

        nbytes = numarenas * sizeof(*arenas);
        // allocate the memory
        arenaobj = (struct arena_object *)realloc(arenas, nbytes);
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;

        /* We might need to fix pointers that were copied.  However,
         * new_arena only gets called when all the pages in the
         * previous arenas are full.  Thus, there are *no* pointers
         * into the old array. Thus, we don't have to worry about
         * invalid pointers.  Just to be sure, some asserts:
         */
        assert(usable_arenas == NULL);
        assert(unused_arena_objects == NULL);

        // initialization
        /* Put the new arenas on the unused_arena_objects list. */
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;              /* mark as unassociated */
            // every new entry gets address 0, marking the arena as unused
            arenas[i].nextarena = i < numarenas - 1 ?
                                   &arenas[i+1] : NULL;
        }

        // put them on the unused_arena_objects list:
        // unused_arena_objects points at the start of the newly allocated space
        /* Update globals. */
        unused_arena_objects = &arenas[maxarenas];
        // update the count
        maxarenas = numarenas;
    }

    /* Take the next available arena object off the head of the list. */
    assert(unused_arena_objects != NULL);
    // take one unused arena_object off unused_arena_objects
    arenaobj = unused_arena_objects;
    unused_arena_objects = arenaobj->nextarena;   // update the list

    // start setting up this arena_object
    assert(arenaobj->address == 0);

    // allocate 256KB and store its address in the arena's address field;
    // this memory is now usable
#ifdef ARENAS_USE_MMAP
    address = mmap(NULL, ARENA_SIZE, PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    err = (address == MAP_FAILED);
#else
    address = malloc(ARENA_SIZE);
    err = (address == 0);
#endif
    if (err) {
        /* The allocation failed: return NULL after putting the
         * arenaobj back.
         */
        arenaobj->nextarena = unused_arena_objects;
        unused_arena_objects = arenaobj;
        return NULL;
    }
    arenaobj->address = (uptr)address;

    ++narenas_currently_allocated;

    // set up the pool bookkeeping
    arenaobj->freepools = NULL;   // NULL for now; only used once a pool has been freed
    /* pool_address <- first pool-aligned address in the arena
       nfreepools <- number of whole pools that fit after alignment */
    arenaobj->pool_address = (block*)arenaobj->address;
    arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;
    assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);

    // align the first pool to a system page boundary:
    // part of the 256KB is given up so that pool_address, the start of the
    // usable memory, lines up with a system page
    excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
    if (excess != 0) {
        --arenaobj->nfreepools;
        arenaobj->pool_address += POOL_SIZE - excess;
    }
    arenaobj->ntotalpools = arenaobj->nfreepools;

    return arenaobj;
}
```
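The alignment step at the end of new_arena is easy to misread, so here is a minimal sketch that isolates it. It assumes POOL_SIZE is 4KB and POOL_SIZE_MASK is POOL_SIZE - 1; struct arena_layout and layout_arena are hypothetical names, and uintptr_t stands in for uptr.

```c
#include <assert.h>
#include <stdint.h>

#define ARENA_SIZE     (256 << 10)
#define POOL_SIZE      (4 << 10)          /* assumed 4KB */
#define POOL_SIZE_MASK (POOL_SIZE - 1)    /* assumed definition */

/* Hypothetical result type for this sketch. */
struct arena_layout {
    uintptr_t pool_address;     /* first pool-aligned address in the arena */
    unsigned int nfreepools;    /* whole pools that fit after alignment */
};

/* Hypothetical helper mirroring the tail of new_arena(). */
static struct arena_layout layout_arena(uintptr_t address)
{
    struct arena_layout out;
    /* Number of bytes the arena start lies past a pool boundary. */
    unsigned int excess = (unsigned int)(address & POOL_SIZE_MASK);

    assert(address != 0);
    out.pool_address = address;
    out.nfreepools = ARENA_SIZE / POOL_SIZE;
    if (excess != 0) {
        /* Skip ahead to the next boundary; one pool's worth of space is lost. */
        --out.nfreepools;
        out.pool_address += POOL_SIZE - excess;
    }
    return out;
}
```

An arena that happens to start on a pool boundary keeps all 64 pools; any other start address yields 63, with the leftover bytes at both ends going unused.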
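Figure: initializing the arenas array. After initialization, every arena_object sits on the unused_arena_objects singly-linked list.
Figure: taking one arena from arenas and initializing it.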
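No arena available?
In that case the following branch runs:
```c
// the condition holds
if (unused_arena_objects == NULL) {
    ....
    // how many to request: 16 the first time, doubling after that
    numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;
```
Suppose the first allocation created 16 arena_objects. Once they run out, the second pass computes numarenas = 32.
In other words, the array doubles in size.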
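The growth rule, including the overflow check from new_arena, can be sketched on its own. next_numarenas is a hypothetical helper name, and returning 0 to signal overflow is a convention of this sketch only (new_arena returns NULL at that point).

```c
#include <assert.h>

#define INITIAL_ARENA_OBJECTS 16

/* Hypothetical helper: the next capacity of the arenas array,
 * or 0 if doubling would overflow an unsigned int. */
static unsigned int next_numarenas(unsigned int maxarenas)
{
    unsigned int numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;

    /* Unsigned arithmetic wraps, so a result that did not grow means overflow. */
    if (numarenas <= maxarenas)
        return 0;
    assert(numarenas > maxarenas);
    return numarenas;
}
```

Starting from 0 this yields 16, 32, 64, and so on, which matches the example above.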
arena allocation
Once a brand-new arena has been created, a pool is taken from it:
```c
void *
PyObject_Malloc(size_t nbytes)
{
    // initially there is no usable arena
    if (usable_arenas == NULL) {
        // create one and make it the head of the doubly-linked list
        usable_arenas = new_arena();
        if (usable_arenas == NULL) {
            UNLOCK();
            goto redirect;
        }
        usable_arenas->nextarena =
            usable_arenas->prevarena = NULL;
    }

    .......

    // take a pool from the arena
    pool = (poolp)usable_arenas->pool_address;
    assert((block*)pool <= (block*)usable_arenas->address +
                           ARENA_SIZE - POOL_SIZE);
    pool->arenaindex = usable_arenas - arenas;
    assert(&arenas[pool->arenaindex] == usable_arenas);
    pool->szidx = DUMMY_SIZE_IDX;

    // advance pool_address to the next pool
    usable_arenas->pool_address += POOL_SIZE;
    // one fewer available pool
    --usable_arenas->nfreepools;
}
```
Figure: taking a pool from a brand-new arena.
How is a pool allocated from an arena that is not new?
```c
pool = usable_arenas->freepools;
if (pool != NULL) {
```
What exactly is this arena->freepools?
It is filled in when an entire pool inside the arena is freed:
```c
void
PyObject_Free(void *p)
{
    struct arena_object* ao;
    uint nf;  /* ao->nfreepools */

    /* Link the pool to freepools.  This is a singly-linked
     * list, and pool->prevpool isn't used there.
     */
    ao = &arenas[pool->arenaindex];
    pool->nextpool = ao->freepools;
    ao->freepools = pool;
    nf = ++ao->nfreepools;
```
In other words, when a pool is freed as a whole, it is pushed onto arena->freepools and becomes the new head of that singly-linked list. Later, when a pool is requested from an arena that is not brand new, arena->freepools is tried first; only if it is empty is a pool carved out of the arena's untouched memory.
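The "freepools first, then carve" order can be sketched as a pair of helpers. The structs are trimmed to the fields involved, POOL_SIZE is assumed to be 4KB, and take_pool and release_pool are hypothetical names; in obmalloc.c this logic lives inline in PyObject_Malloc and PyObject_Free.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE (4 << 10)   /* assumed 4KB */

typedef unsigned char block;

/* Trimmed-down pool_header: only the link used by the freepools list. */
struct pool_header {
    struct pool_header *nextpool;
};

/* Trimmed-down arena_object: only the pool bookkeeping. */
struct arena_object {
    block *pool_address;             /* next never-used pool to carve off */
    unsigned int nfreepools;         /* freed pools + never-allocated pools */
    struct pool_header *freepools;   /* pools that were used and then fully freed */
};

/* Hypothetical helper: hand out one pool. The caller guarantees nfreepools > 0. */
static struct pool_header *take_pool(struct arena_object *ao)
{
    struct pool_header *pool = ao->freepools;

    assert(ao->nfreepools > 0);
    if (pool != NULL) {
        /* Reuse a previously freed pool: unlink the list head. */
        ao->freepools = pool->nextpool;
    } else {
        /* Nothing to reuse: carve a fresh pool out of untouched arena memory. */
        pool = (struct pool_header *)ao->pool_address;
        ao->pool_address += POOL_SIZE;
    }
    --ao->nfreepools;
    return pool;
}

/* Hypothetical helper: return a completely empty pool to its arena. */
static void release_pool(struct arena_object *ao, struct pool_header *pool)
{
    pool->nextpool = ao->freepools;
    ao->freepools = pool;
    ++ao->nfreepools;
}
```

Reusing freed pools before touching fresh memory keeps pool_address from advancing unnecessarily, so the untouched tail of the arena stays untouched for as long as possible.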
Figure
What happens when an arena is full
Naturally, pools are then taken from the next arena.
```c
void *
PyObject_Malloc(size_t nbytes)
{
    // the last pool has just been handed out
    // nfreepools = 0
    if (usable_arenas->nfreepools == 0) {
        assert(usable_arenas->nextarena == NULL ||
               usable_arenas->nextarena->prevarena ==
               usable_arenas);
        /* Unlink the arena:  it is completely allocated. */

        // move to the next node
        usable_arenas = usable_arenas->nextarena;
        // if there is a next one
        if (usable_arenas != NULL) {
            usable_arenas->prevarena = NULL;   // fix up the next node's prevarena
            assert(usable_arenas->address != 0);
        }
        // no next node: usable_arenas is now NULL, so the next allocation
        // takes an arena_object from the arenas array
    }
}
```
Note the logic here: every time a pool is handed out, the code checks whether it was the arena's last one. If so, usable_arenas moves to the next usable node. If there is none, the next allocation will take an arena_object from the arenas array.
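arena recycling
The smallest unit of allocation and recycling is the block. Freeing a block may cause its pool to be freed, and freeing a pool triggers the arena recycling logic.
Four cases
1. Every pool in the arena is empty: the arena's memory is freed and returned to the operating system.
2. Every pool in the arena was previously in use and one has just become empty: the arena is added to usable_arenas, at the head of the list.
3. The arena now has n empty pools: starting from usable_arenas, find the position where it belongs and insert it there. (usable_arenas is a sorted list ordered by number of empty pools. The more empty pools an arena has, the less likely it is to be used, and the better its chance of eventually being freed as a whole.)
4. Otherwise, nothing is done to the arena.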
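The four cases can be read as a decision function over the arena's counters. This is a minimal sketch under assumptions: the order of the checks is one plausible reading of the cases above, the struct is trimmed to three fields, and enum arena_action and classify_arena are hypothetical names that do not appear in obmalloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed-down arena_object: only what the decision needs. */
struct arena_object {
    unsigned int nfreepools;          /* already incremented for the freed pool */
    unsigned int ntotalpools;
    struct arena_object *nextarena;   /* next arena in usable_arenas, or NULL */
};

/* Hypothetical names for the four cases. */
enum arena_action {
    ARENA_RELEASE,        /* case 1: all pools empty, give the memory back */
    ARENA_LINK_AT_HEAD,   /* case 2: was full, now has exactly one free pool */
    ARENA_REPOSITION,     /* case 3: move it to keep usable_arenas sorted */
    ARENA_LEAVE_ALONE     /* case 4: list order is still correct */
};

/* Hypothetical helper: decide what to do after a pool went back to `ao`. */
static enum arena_action classify_arena(const struct arena_object *ao)
{
    unsigned int nf = ao->nfreepools;

    assert(nf >= 1 && nf <= ao->ntotalpools);
    if (nf == ao->ntotalpools)
        return ARENA_RELEASE;
    if (nf == 1)
        return ARENA_LINK_AT_HEAD;
    /* The list is in increasing nfreepools order, so the arena may stay
     * where it is as long as its successor has at least as many free pools. */
    if (ao->nextarena == NULL || nf <= ao->nextarena->nfreepools)
        return ARENA_LEAVE_ALONE;
    return ARENA_REPOSITION;
}
```

Only nextarena has to be inspected because freeing a pool can only increase nfreepools, so an arena can only drift toward the tail of the sorted list.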
See the code of PyObject_Free for the details.
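Memory allocation steps
At this point we know how blocks relate to pools (including how a pool manages its blocks) and how pools relate to arenas (how a usable pool is pulled out of an arena).
So when analyzing how PyObject_Malloc(size_t nbytes) allocates memory, we can set all of that management code aside
and focus on one question: how does it find a usable block of nbytes?
For all the code involved, locating the block takes only a few lines. The rest handles the fallbacks: no pool, so look for an arena; no arena, so allocate one; no arena_object left, so grow arenas; finally obtain a pool, initialize it, and return its first block.
If a suitable pool already exists, it is used directly:
```
pool = usedpools[size + size];
if the pool is usable:
    pool is not full: take a block and return it
    pool is full: take a block from the next pool and return it
else:
    get an arena, initialize a pool from it, take the first block, return it
```
Judging from this logic, memory allocation mainly operates on pools. The arena is not the basic unit of operation; it only manages pools.
Conclusion: allocation and deallocation are all carried out on pools.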
What is usedpools? It is a cache of usable pools, covered below.
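Memory pool
Size of the arena memory pool
This is up to the user: Python provides a compile-time symbol that decides whether the size is capped.
obmalloc.c
```c
#ifdef WITH_MEMORY_LIMITS
#ifndef SMALL_MEMORY_LIMIT
#define SMALL_MEMORY_LIMIT      (64 * 1024 * 1024)      /* 64 MB -- more? */
#endif
#endif

#ifdef WITH_MEMORY_LIMITS
#define MAX_ARENAS              (SMALL_MEMORY_LIMIT / ARENA_SIZE)
#endif
```
In practice, Python does not deal with arenas or an individual arena directly. When Python requests memory, the basic unit of operation is the pool, not the arena.
Question: every block in a pool has the same size, but within an arena each pool may serve a different block size. So how are these pools tracked, and how is the pool holding blocks of the required size found? => usedpools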
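The three states of a pool in the memory pool
1. used: at least one block in the pool is in use and at least one is free. Pools in this state are tracked by the usedpools array that Python maintains internally.
2. full: every block in the pool is in use. Such a pool still belongs to its arena but is not on the arena's freepools list. Full pools stand alone and are not linked into any list.
3. empty: no block in the pool is in use. Pools in this state are chained through the nextpool field of pool_header into a list whose head is freepools in arena_object.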
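The three states depend only on how many of a pool's blocks are allocated. Here is a minimal sketch of that classification; enum pool_state and classify_pool are hypothetical names, and passing the capacity as a parameter is a simplification made for this sketch (obmalloc.c does not store it in that form).

```c
#include <assert.h>

/* Hypothetical names for the three states described above. */
enum pool_state { POOL_EMPTY, POOL_USED, POOL_FULL };

/* Hypothetical helper: classify a pool from the number of allocated
 * blocks and the number of blocks the pool can hold. */
static enum pool_state classify_pool(unsigned int allocated, unsigned int capacity)
{
    assert(capacity > 0 && allocated <= capacity);
    if (allocated == 0)
        return POOL_EMPTY;    /* linked from the arena's freepools */
    if (allocated == capacity)
        return POOL_FULL;     /* not linked into any list */
    return POOL_USED;         /* linked from usedpools */
}
```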
usedpools
The usedpools array tracks every pool in the used state. When memory is requested, usedpools is consulted to find a usable pool (one in the used state), and a block is allocated from it.
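Finding the right slot in usedpools starts by turning the request size into a size class index. The sketch below assumes 8-byte alignment (ALIGNMENT_SHIFT of 3) and a 512-byte small-request threshold; size_class_index, index_to_size and usedpools_slot are hypothetical helper names, with index_to_size playing the role of the INDEX2SIZE macro.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT_SHIFT         3     /* assumed: 8-byte alignment */
#define SMALL_REQUEST_THRESHOLD 512

/* Hypothetical helper: size class index for a small request (1..512 bytes). */
static unsigned int size_class_index(size_t nbytes)
{
    assert(nbytes >= 1 && nbytes <= SMALL_REQUEST_THRESHOLD);
    return (unsigned int)(nbytes - 1) >> ALIGNMENT_SHIFT;
}

/* Hypothetical helper: block size served by a size class. */
static unsigned int index_to_size(unsigned int index)
{
    return (index + 1) << ALIGNMENT_SHIFT;
}

/* Hypothetical helper: each class owns two adjacent usedpools entries,
 * so the slot is twice the class index (the `size + size` in the source). */
static unsigned int usedpools_slot(size_t nbytes)
{
    unsigned int size = size_class_index(nbytes);
    return size + size;
}
```

For example, a 28-byte request falls into class 3, is served from 32-byte blocks, and is looked up at usedpools[6].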
Structure:
```c
#define SMALL_REQUEST_THRESHOLD 512
// 512/8 = 64
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)

#define PTA(x)  ((poolp )((uchar *)&(usedpools[2*(x)]) - 2*sizeof(block *)))
#define PT(x)   PTA(x), PTA(x)

// 2 * ((64 + 7) / 8) * 8 = 128, so the array has 128 entries
static poolp usedpools[2 * ((NB_SMALL_SIZE_CLASSES + 7) / 8) * 8] = {
    PT(0), PT(1), PT(2), PT(3), PT(4), PT(5), PT(6), PT(7)
#if NB_SMALL_SIZE_CLASSES > 8
    , PT(8), PT(9), PT(10), PT(11), PT(12), PT(13), PT(14), PT(15)
#if NB_SMALL_SIZE_CLASSES > 16
    , PT(16), PT(17), PT(18), PT(19), PT(20), PT(21), PT(22), PT(23)
#if NB_SMALL_SIZE_CLASSES > 24
    , PT(24), PT(25), PT(26), PT(27), PT(28), PT(29), PT(30), PT(31)
#if NB_SMALL_SIZE_CLASSES > 32
    , PT(32), PT(33), PT(34), PT(35), PT(36), PT(37), PT(38), PT(39)
#if NB_SMALL_SIZE_CLASSES > 40
    , PT(40), PT(41), PT(42), PT(43), PT(44), PT(45), PT(46), PT(47)
#if NB_SMALL_SIZE_CLASSES > 48
    , PT(48), PT(49), PT(50), PT(51), PT(52), PT(53), PT(54), PT(55)
#if NB_SMALL_SIZE_CLASSES > 56
    , PT(56), PT(57), PT(58), PT(59), PT(60), PT(61), PT(62), PT(63)
#if NB_SMALL_SIZE_CLASSES > 64
#error "NB_SMALL_SIZE_CLASSES should be less than 64"
#endif /* NB_SMALL_SIZE_CLASSES > 64 */
#endif /* NB_SMALL_SIZE_CLASSES > 56 */
#endif /* NB_SMALL_SIZE_CLASSES > 48 */
#endif /* NB_SMALL_SIZE_CLASSES > 40 */
#endif /* NB_SMALL_SIZE_CLASSES > 32 */
#endif /* NB_SMALL_SIZE_CLASSES > 24 */
#endif /* NB_SMALL_SIZE_CLASSES > 16 */
#endif /* NB_SMALL_SIZE_CLASSES >  8 */
};

// that is, the resulting usedpools array is
static poolp usedpools[128] = {
    PTA(0), PTA(0), PTA(1), PTA(1), PTA(2), PTA(2), PTA(3), PTA(3),
    ....
    PTA(63), PTA(63)
};
```
Expanding it (obmalloc.c):
```c
typedef uchar block;

/* Pool for small blocks. */
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */
    block *freeblock;                   /* pool's free list head         */
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */
    uint szidx;                         /* block size class index        */
    uint nextoffset;                    /* bytes to virgin block         */
    uint maxnextoffset;                 /* largest valid nextoffset      */
};

typedef struct pool_header *poolp;

usedpools[0] = PTA(0) = ((poolp )((uchar *)&(usedpools[0]) - 2*sizeof(block *)))
```
Following the trick in this step takes real effort >_<
PTA(x) points 2*sizeof(block *) bytes before usedpools[2*x], so the nextpool and prevpool fields of the pool_header it pretends to be land exactly on usedpools[2*x] and usedpools[2*x+1].
A figure shows it best.
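Since the trick is easy to lose track of, here is a minimal sketch of it on a shortened array. It is not the obmalloc.c initializer: NCLASSES is a hypothetical 4 instead of 64, the array is filled at run time by a hypothetical init_usedpools, and class_is_empty and link_pool are hypothetical helpers. Like the original, it relies on pointer arithmetic that steps outside the array, which works on common compilers but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char block;
typedef unsigned int uint;

struct pool_header {
    union { block *_padding; uint count; } ref;
    block *freeblock;
    struct pool_header *nextpool;
    struct pool_header *prevpool;
    uint arenaindex;
    uint szidx;
    uint nextoffset;
    uint maxnextoffset;
};
typedef struct pool_header *poolp;

#define NCLASSES 4   /* hypothetical; the text above uses 64 size classes */

static poolp usedpools[2 * NCLASSES];

/* Fake header whose nextpool/prevpool fields overlay usedpools[2x], usedpools[2x+1]. */
#define PTA(x) ((poolp)((unsigned char *)&usedpools[2 * (x)] - 2 * sizeof(block *)))

/* Hypothetical run-time replacement for the static PT(0), PT(1), ... initializer. */
static void init_usedpools(void)
{
    int i;
    /* The trick only works if nextpool sits two pointers into the header. */
    assert(offsetof(struct pool_header, nextpool) == 2 * sizeof(block *));
    for (i = 0; i < NCLASSES; ++i) {
        usedpools[2 * i] = PTA(i);
        usedpools[2 * i + 1] = PTA(i);
    }
}

/* An empty class is a fake header whose nextpool points back at itself. */
static int class_is_empty(uint size)
{
    poolp pool = usedpools[size + size];
    return pool == pool->nextpool;
}

/* Same four assignments as the init_pool label: link `pool` into class `size`. */
static void link_pool(poolp pool, uint size)
{
    poolp next = usedpools[size + size];
    pool->nextpool = next;
    pool->prevpool = next;
    next->nextpool = pool;   /* actually writes usedpools[2*size]     */
    next->prevpool = pool;   /* actually writes usedpools[2*size + 1] */
}
```

The payoff is that every size class gets a circular doubly-linked list with a sentinel header at the cost of two pointers per class, and "is the list empty?" becomes the single comparison pool != pool->nextpool.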
Bookkeeping when a new pool is created
On the init path (the init_pool label below), the pool just obtained from the arena is added to the matching doubly-linked list in usedpools, then initialized, and a block is returned.
```c
init_pool:
    /* Frontlink to used pools. */
    // 1. get the head of the usedpools list
    next = usedpools[size + size]; /* == prev */

    // 2. add the new pool to the doubly-linked list
    pool->nextpool = next;
    pool->prevpool = next;
    next->nextpool = pool;
    next->prevpool = pool;
    pool->ref.count = 1;

    // 3. the rest is pool- and block-level work
    if (pool->szidx == size) {
        /* Luckily, this pool last contained blocks
         * of the same size class, so its header
         * and free list are already initialized.
         */
        bp = pool->freeblock;
        pool->freeblock = *(block **)bp;
        UNLOCK();
        return (void *)bp;
    }
    /*
     * Initialize the pool header, set up the free list to
     * contain just the second block, and return the first
     * block.
     */
    pool->szidx = size;
    size = INDEX2SIZE(size);
    bp = (block *)pool + POOL_OVERHEAD;
    pool->nextoffset = POOL_OVERHEAD + (size << 1);
    pool->maxnextoffset = POOL_SIZE - size;
    pool->freeblock = bp + size;
    *(block **)(pool->freeblock) = NULL;
    UNLOCK();
    return (void *)bp;   // here
}
```
Getting a block from an existing pool
To use an existing pool, take the head of the doubly-linked list from usedpools and check whether the list is empty. If it is not, a usable pool exists and a block is taken straight from it.
```c
if ((nbytes - 1) < SMALL_REQUEST_THRESHOLD) {
    LOCK();
    size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
    pool = usedpools[size + size];
    // note the test: pool != pool->nextpool means the list is not empty
    if (pool != pool->nextpool) {
        /*
         * There is a used pool for this size class.
         * Pick up the head block of its free list.
         */
        ++pool->ref.count;
        bp = pool->freeblock;
        assert(bp != NULL);
        if ((pool->freeblock = *(block **)bp) != NULL) {
            UNLOCK();
            return (void *)bp;
        }
        /*
         * Reached the end of the free list, try to extend it.
         */
        if (pool->nextoffset <= pool->maxnextoffset) {
            /* There is room for another block. */
            pool->freeblock = (block*)pool +
                              pool->nextoffset;
            pool->nextoffset += INDEX2SIZE(size);
            *(block **)(pool->freeblock) = NULL;
            UNLOCK();
            return (void *)bp;
        }
        /* Pool is full, unlink from used pools. */
        next = pool->nextpool;
        pool = pool->prevpool;
        next->prevpool = pool;
        pool->nextpool = next;
        UNLOCK();
        return (void *)bp;   // here
    }
```
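Overall structure
That's it for now. This is roughly the basic structure and mechanism of Python's memory pool. Notice how many arrays and linked lists are involved, all organized into various pools to serve allocation and recycling.
The remaining memory-related topic is garbage collection, which will be covered later.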
wklken
2015-08-29