iOS Exploration: cache_t Analysis

Posted by 我是好寶寶 on 2020-01-24

Original URL: https://juejin.im/post/5e26afb9f265da3e2b2d7b26

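Welcome to the iOS Exploration series (best read in order)

iOS Exploration: The alloc Flow

iOS Exploration: Memory Alignment & malloc Source Code

iOS Exploration: isa Initialization & Pointer Analysis

iOS Exploration: Class Structure Analysis

iOS Exploration: cache_t Analysis

iOS Exploration: The Nature of Methods and the Method Lookup Flow

iOS Exploration: Dynamic Method Resolution and Message Forwarding

iOS Exploration: A Brief Taste of the dyld Loading Flow

iOS Exploration: The Class Loading Process

iOS Exploration: How Categories and Class Extensions Are Loaded

iOS Exploration: isa Interview Questions

iOS Exploration: runtime Interview Questions

iOS Exploration: KVC Principles and a Custom Implementation

iOS Exploration: KVO Principles and a Custom Implementation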

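Preface

The previous article covered the structure of a class in full, but one member, cache_t cache, was left without a detailed look. This article analyzes cache_t at the source level.

I. A first look at cache_t

1.cache_t structure

Below is the underlying structure of a class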

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    
    class_rw_t *data() { 
        return bits.data();
    }
    ...
}

The structure of cache_t is as follows

struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;
    ...
};

As noted in the previous article, the structure shows that cache_t consists of two uint32_t members, _mask and _occupied, plus a pointer to a bucket_t structure

struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    MethodCacheIMP _imp;
    cache_key_t _key;
#else
    cache_key_t _key;
    MethodCacheIMP _imp;
#endif

public:
    inline cache_key_t key() const { return _key; }
    inline IMP imp() const { return (IMP)_imp; }
    inline void setKey(cache_key_t newKey) { _key = newKey; }
    inline void setImp(IMP newImp) { _imp = newImp; }

    void set(cache_key_t newKey, IMP newImp);
};

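The members and methods of bucket_t suggest that it is tied to imp. In fact, a bucket_t is a bucket that holds a method implementation (imp) together with its key

Taken literally, _buckets, _mask, and _occupied in cache_t mean buckets, mask, and occupied, but it is not yet clear whether their roles match their names, so let's start by printing some information in LLDB

2.LLDB debugging

Prepare the following code in the objc source project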

#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface FXPerson : NSObject
- (void)doFirst;
- (void)doSecond;
- (void)doThird;
@end

@implementation FXPerson
- (void)doFirst {}
- (void)doSecond {}
- (void)doThird {}
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        FXPerson *p = [[FXPerson alloc] init];
        Class cls = object_getClass(p);
        
        [p doFirst];
        [p doSecond];
        [p doThird];
    }
    return 0;
}

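_buckets is a bucket that holds method implementations (imp), so set a breakpoint at the method calls. (As covered in the previous article, the isa pointer in a class takes 8 bytes and the superclass pointer takes 8 bytes, so adding 16 bytes to the class's base address gives the address of cache_t.)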
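The "base address + 16 bytes" arithmetic can be checked against a stand-in layout. The sketch below is a minimal mock, assuming a 64-bit target; MockClass, MockCache, cacheOffset and cacheAddress are hypothetical names that only mirror the field order shown earlier, not the real runtime types.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins that mirror only the field order of objc_class.
struct MockCache {
    void    *buckets;   // plays the role of bucket_t *_buckets
    uint32_t mask;      // plays the role of mask_t _mask
    uint32_t occupied;  // plays the role of mask_t _occupied
};

struct MockClass {
    void     *isa;         // inherited from objc_object in the real layout
    void     *superclass;
    MockCache cache;
    uintptr_t bits;        // plays the role of class_data_bits_t
};

// Byte offset of the cache member from the start of the class structure.
constexpr std::size_t cacheOffset() { return offsetof(MockClass, cache); }

// Given the address of a class, compute where its cache starts.
inline uintptr_t cacheAddress(uintptr_t classAddress) {
    return classAddress + cacheOffset();
}
```

On a 64-bit build cacheOffset() comes out as two pointer widths, i.e. 16 bytes, which is the same offset applied to the class address in LLDB.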

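At this point _mask is 3 and _occupied is 1. Next, print _buckets

Printing several elements of $3 turns up only one cached method, -[NSObject init], which suggests a hypothesis

Move the breakpoint to the [p doSecond]; line (the author re-ran the project here)

Move the breakpoint to the [p doThird]; line, which gives the following data: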

Breakpoint	_occupied	Methods in _buckets
[p doFirst]	1	-[NSObject init]
[p doSecond]	2	-[NSObject init], -[FXPerson doFirst]
[p doThird]	3	-[NSObject init], -[FXPerson doFirst], -[FXPerson doSecond]

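From this data, _buckets is the bucket that holds method implementations, and _occupied is the number of method implementations currently in it

Wait: some readers will surely ask why the alloc call on FXPerson was not cached. As explained in the previous article, alloc is a class method, so it is stored in the FXPerson metaclass

Just when everything seemed to be going smoothly, something unexpected happened when the breakpoint moved to the next line

Both _mask and _occupied changed in an unexpected way. What did the runtime do underneath? And why was bucket[0] completely empty when it was printed earlier?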

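II. Digging into cache_t

0.Finding an entry point

We know the value of _mask increased, so we look up the mask_t mask() method in cache_t, only to find that it simply returns _mask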

mask_t cache_t::mask() 
{
    return _mask; 
}

Searching further for calls to mask(), the capacity method operates on mask, but its purpose is not obvious yet

mask_t cache_t::capacity() 
{
    return mask() ? mask()+1 : 0; 
}

Searching further for capacity(), the expand method contains a meaningful call to capacity

void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}

expand looks like a growth method. Tracing upward leads to cache_fill_nolock

static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    // Never cache before +initialize is done
    if (!cls->isInitialized()) return;

    // Make sure the entry wasn't added to the cache by some other thread 
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return;

    cache_t *cache = getCache(cls);
    cache_key_t key = getKey(sel);

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();
    }

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the 
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
}

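Setting a breakpoint and checking the call stack confirms that this is the right direction

1.cache_fill_nolock

cache_fill_nolock is fairly involved, so we will go through it step by step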

①if (!cls->isInitialized()) return;

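Checks whether the class has finished initializing (the source comment reads "Never cache before +initialize is done"); if not, return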

②if (cache_getImp(cls, sel)) return;

Looks up sel in the cache of cls; if an imp is found, return, otherwise continue to the next step

③cache_t *cache = getCache(cls);

Calls getCache to obtain the cache of cls

④cache_key_t key = getKey(sel);

Calls getKey to obtain the cache key, which is simply the SEL cast to cache_key_t

⑤mask_t newOccupied = cache->occupied() + 1;

Adds 1 to the number of entries the cache already holds, giving the occupancy after this insertion, newOccupied

⑥mask_t capacity = cache->capacity();

Reads the current capacity of the cache, capacity

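⑦Check the cache occupancy

if (cache->isConstantEmptyCache()) {
    // Cache is read-only. Replace it.
    cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
}
else if (newOccupied <= capacity / 4 * 3) {
    // Cache is less than 3/4 full. Use it as-is.
}
else {
    // Cache is too full. Expand it.
    cache->expand();
}

If the cache is empty (the read-only constant empty cache), allocate new memory and replace the previous cache
If the new occupancy is <= three quarters of the capacity, the cache is used as-is
Otherwise (the cache is not empty and the new occupancy exceeds three quarters of the capacity), the cache must be expanded

⑧bucket_t *bucket = cache->find(key, receiver);

Uses key to find the matching bucket_t in the cache

⑨if (bucket->key() == 0) cache->incrementOccupied();

If the key of the bucket found in ⑧ is 0, the slot was unused, so _occupied++

⑩bucket->set(key, imp);

Stores the key and imp as a pair in the bucket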

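Summary:

cache_fill_nolock first obtains the class's cache. If the cache is empty, a new one is created to replace it. If the target occupancy (newOccupied, the occupancy after this insertion) exceeds three quarters of the capacity, the cache is expanded first and the entry is then stored in the bucket for its key; otherwise the entry is stored in the bucket for its key directly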
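The flow summarized above can be traced with a small model that tracks only the counters. This is a sketch under stated assumptions rather than the runtime's code: SimCache, constantEmpty and fillNew are hypothetical names, no buckets are stored, and every fill is assumed to be for a selector that is not cached yet.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the counters only; no buckets are stored.
struct SimCache {
    uint32_t mask = 0;          // capacity - 1, or 0 for the empty cache
    uint32_t occupied = 0;      // number of cached entries
    bool constantEmpty = true;  // stands in for isConstantEmptyCache()

    uint32_t capacity() const { return mask ? mask + 1 : 0; }

    // Mirrors reallocate(): a new table is installed and the old
    // entries are not carried over, so occupied restarts at zero.
    void reallocate(uint32_t newCapacity) {
        assert(newCapacity > 0);
        mask = newCapacity - 1;
        occupied = 0;
        constantEmpty = false;
    }

    // Mirrors the branch structure of cache_fill_nolock for a
    // selector that is not in the cache yet.
    void fillNew() {
        const uint32_t initCacheSize = 4;  // INIT_CACHE_SIZE
        uint32_t newOccupied = occupied + 1;
        uint32_t cap = capacity();
        if (constantEmpty) {
            reallocate(cap ? cap : initCacheSize);
        } else if (newOccupied <= cap / 4 * 3) {
            // Less than 3/4 full: use the table as-is.
        } else {
            reallocate(cap ? cap * 2 : initCacheSize);  // expand()
        }
        ++occupied;  // find() handed back an empty bucket
    }
};
```

Calling fillNew() four times on a fresh SimCache leaves mask at 3 with occupied going 1, 2, 3, and then jumps to mask 7 with occupied 1 on the fourth call. That is consistent with the sudden change seen after stepping past [p doThird] in the LLDB session.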

With the main flow of cache_fill_nolock covered, we now look at some of the methods it relies on

2.cache_t::reallocate

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();

    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}

bucket_t *allocateBuckets(mask_t newCapacity)
{
    // Allocate one extra bucket to mark the end of the list.
    // This can't overflow mask_t because newCapacity is a power of 2.
    // fixme instead put the end mark inline when +1 is malloc-inefficient
    bucket_t *newBuckets = (bucket_t *)
        calloc(cache_t::bytesForCapacity(newCapacity), 1);

    bucket_t *end = cache_t::endMarker(newBuckets, newCapacity);

#if __arm__
    // End marker's key is 1 and imp points BEFORE the first bucket.
    // This saves an instruction in objc_msgSend.
    end->setKey((cache_key_t)(uintptr_t)1);
    end->setImp((IMP)(newBuckets - 1));
#else
    // End marker's key is 1 and imp points to the first bucket.
    end->setKey((cache_key_t)(uintptr_t)1);
    end->setImp((IMP)newBuckets);
#endif
    
    if (PrintCaches) recordNewCache(newCapacity);

    return newBuckets;
}

void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
    // objc_msgSend uses mask and buckets with no locks.
    // It is safe for objc_msgSend to see new buckets but old mask.
    // (It will get a cache miss but not overrun the buckets' bounds).
    // It is unsafe for objc_msgSend to see old buckets and new mask.
    // Therefore we write new buckets, wait a lot, then write new mask.
    // objc_msgSend reads mask first, then buckets.

    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();

    _buckets = newBuckets;
    
    // ensure other threads see new buckets before new mask
    mega_barrier();
    
    _mask = newMask;
    _occupied = 0;
}

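First determine whether the old buckets can be freed (the negation of whether the cache is empty) and save the result in freeOld
oldBuckets takes the current buckets
allocateBuckets is called with the new capacity to initialize the bucket_t array, which is saved in newBuckets
setBucketsAndMask stores the newly created buckets, sets mask = newCapacity - 1, and resets occupied to zero (nothing is cached in the new buckets yet)
If the old cache was not empty (and therefore needs freeing), the original buckets and capacity are released

Why does reallocate discard the old contents through cache_collect_free rather than re-reading and re-writing them with a memory copy? First, re-reading and re-writing is not safe; second, wiping is faster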

3.cache_t::expand

void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}

enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2)
};

mask_t cache_t::capacity() 
{
    return mask() ? mask()+1 : 0; 
}

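oldCapacity is mask + 1 (or 0 when mask is 0)
If oldCapacity is non-zero, newCapacity is twice oldCapacity; otherwise it is INIT_CACHE_SIZE
INIT_CACHE_SIZE here is 1 << 2, i.e. binary 100 => decimal 4
reallocate then creates the new cache and replaces the old one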
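The capacity computation in expand, including the overflow guard, can be isolated into a small function. The sketch below assumes a hypothetical helper nextCapacity with a template parameter MaskT standing in for mask_t; using uint16_t as a narrow mask type is an assumption made here only to make the guard observable.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the capacity computation in expand().
// MaskT plays the role of mask_t.
template <typename MaskT>
uint32_t nextCapacity(uint32_t oldCapacity) {
    const uint32_t initCacheSize = 1u << 2;  // INIT_CACHE_SIZE
    // A non-zero capacity is always a power of two in this scheme.
    assert(oldCapacity == 0 || (oldCapacity & (oldCapacity - 1)) == 0);
    uint32_t newCapacity = oldCapacity ? oldCapacity * 2 : initCacheSize;
    // If the doubled value does not survive a round trip through MaskT,
    // the mask type is too narrow, so the capacity stays where it was.
    if (static_cast<uint32_t>(static_cast<MaskT>(newCapacity)) != newCapacity) {
        newCapacity = oldCapacity;
    }
    return newCapacity;
}
```

With a 32-bit mask type the sequence starts 0, 4, 8, 16. With the assumed 16-bit mask type, nextCapacity<uint16_t>(32768) stays at 32768, because 65536 no longer fits, which is the "can't grow further" branch in the source.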

4.cache_t::find

cache_t::find locates the bucket that corresponds to a key

bucket_t * cache_t::find(cache_key_t k, id receiver)
{
    assert(k != 0);

    bucket_t *b = buckets();
    mask_t m = mask();
    mask_t begin = cache_hash(k, m);
    mask_t i = begin;
    do {
        if (b[i].key() == 0  ||  b[i].key() == k) {
            return &b[i];
        }
    } while ((i = cache_next(i, m)) != begin);

    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)k, cls);
}

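buckets() returns all the buckets under the current cache_t
mask() returns the current cache_t's mask, i.e. the cache capacity minus one
cache_hash computes the starting index as key & mask
begin is assigned to i, which is the index that moves during the scan
The do-while loop walks the bucket_t array. If key = 0 at index i, no method has been cached in that slot yet, so that bucket_t is returned (this ends the lookup and gives the caller a free slot); if the bucket_t's key = k, the lookup succeeded and that bucket_t is returned
cache_next updates the index so that every element of the hash table is visited in turn, wrapping around like a ring: in this version of the source it returns i-1 on arm64 (wrapping to mask at 0), and (i+1) & mask on x86_64, i386 and arm
If nothing is found after a full loop, the cache is broken and bad_cache is called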
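The probing steps listed above can be sketched on a plain array. This is an illustrative model, not the runtime's implementation: SimBucket, simHash, simNext and simFind are hypothetical names, keys are plain integers instead of selectors, the decrementing step is used, and a full table returns nullptr where the real code reports a bad cache.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical simplified bucket: key 0 means "empty".
struct SimBucket {
    uintptr_t key = 0;
    uintptr_t imp = 0;
};

// Start index: the low bits of the key, as in cache_hash.
inline uint32_t simHash(uintptr_t key, uint32_t mask) {
    return static_cast<uint32_t>(key & mask);
}

// Step to the previous slot, wrapping from 0 back to mask.
inline uint32_t simNext(uint32_t i, uint32_t mask) {
    return i ? i - 1 : mask;
}

// Returns the bucket holding `key`, or the first empty bucket on its
// probe path; nullptr if the table is full and the key is absent.
inline SimBucket *simFind(std::vector<SimBucket> &buckets, uintptr_t key) {
    assert(key != 0);
    assert(!buckets.empty() && (buckets.size() & (buckets.size() - 1)) == 0);
    const uint32_t mask = static_cast<uint32_t>(buckets.size() - 1);
    const uint32_t begin = simHash(key, mask);
    uint32_t i = begin;
    do {
        if (buckets[i].key == 0 || buckets[i].key == key) {
            return &buckets[i];
        }
    } while ((i = simNext(i, mask)) != begin);
    return nullptr;  // the real code reports a corrupt cache here
}
```

In a 4-slot table, keys 0x5, 0x9 and 0xD all hash to index 1. Inserting them in that order places them at indices 1, 0 and 3, since each collision steps one slot back and wraps from 0 to 3.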

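5.LRU algorithm

LRU stands for Least Recently Used. The core idea of the policy is to evict whatever has gone unused the longest. The method cache follows a loosely similar idea, though not a strict LRU: as reallocate shows, old entries are not carried over, so an expansion discards everything cached before it

Before expansion, an instance method takes whichever free slot its hash leads it to
After expansion, the previous buckets are cleared and the newly called instance method takes a slot in the fresh table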

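III. Questions about cache_t

1.What mask does

mask is a member of cache_t; its value is the cache capacity minus one
For the buckets, mask is mainly used by the hash function during cache lookup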

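2.How capacity changes

capacity mainly changes during cache->expand(): once an insertion would push the cache past three quarters full, the cache is expanded to twice its previous size, which helps keep hash collisions down

3.Why expand at 3/4

Hash tables have a notion called the load factor, which expresses how much free space is left. The larger the load factor, the fewer free slots remain, the more collisions occur, and the worse the hash table performs

With a load factor of 3/4, space utilization is fairly high while a good number of hash collisions are still avoided, which makes it a reasonable trade-off between space and lookup time

For details, see "Why is HashMap's default load factor 0.75?"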
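The effect of the load factor can be explored with a rough experiment. The sketch below is not taken from the runtime: averageProbes is a hypothetical helper that fills an open-addressing table with pseudo-random keys from a fixed-seed generator and reports how many slots each insertion had to inspect on average. The key distribution is an assumption, so the numbers are only indicative.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Inserts `count` pseudo-random keys into an open-addressing table of
// `capacity` slots (a power of two) and returns the average number of
// slots inspected per insertion.
inline double averageProbes(uint32_t capacity, uint32_t count) {
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    assert(count < capacity);  // keep at least one slot free
    std::vector<bool> used(capacity, false);
    const uint32_t mask = capacity - 1;
    uint64_t state = 0x9E3779B97F4A7C15ull;  // arbitrary fixed seed
    uint64_t totalProbes = 0;
    for (uint32_t n = 0; n < count; ++n) {
        // Simple 64-bit linear congruential step; high bits form the key.
        state = state * 6364136223846793005ull + 1442695040888963407ull;
        uint64_t key = state >> 33;
        uint32_t i = static_cast<uint32_t>(key & mask);
        uint32_t probes = 1;
        while (used[i]) {        // linear probing on collision
            i = (i + 1) & mask;
            ++probes;
        }
        used[i] = true;
        totalProbes += probes;
    }
    return count ? static_cast<double>(totalProbes) / count : 0.0;
}
```

Comparing averageProbes(1024, 512), averageProbes(1024, 768) and averageProbes(1024, 1000) gives a feel for the trade-off: one would expect the figure to grow as the table fills up, while lower fill levels leave more of the allocated slots unused.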

4.Is the method cache ordered

static inline mask_t cache_hash(cache_key_t key, mask_t mask) 
{
    return (mask_t)(key & mask);
}

The method cache is unordered, because the index is computed by a hash function: it depends on the values of key and mask
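A few made-up keys show why call order and slot order need not agree. In the sketch below, slotFor repeats the key & mask computation on plain integers and maskMatchesModulo checks the power-of-two identity behind it; both names and all key values are hypothetical, since real keys are selector addresses.

```cpp
#include <cassert>
#include <cstdint>

// Same computation as cache_hash, on plain integers.
inline uint32_t slotFor(uintptr_t key, uint32_t mask) {
    return static_cast<uint32_t>(key & mask);
}

// For a power-of-two capacity, key & (capacity - 1) equals key % capacity.
inline bool maskMatchesModulo(uintptr_t key, uint32_t capacity) {
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    return slotFor(key, capacity - 1) == key % capacity;
}
```

With mask 3, keys 0x1E, 0x15 and 0x1C used in that order land in slots 2, 1 and 0, so the table order does not follow call order. The same key 0x1C moves to slot 4 once the mask becomes 7, which also shows that entries could not simply be copied index-for-index into a larger table.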

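5.How bucket relates to mask, capacity, sel, and imp

A class cls has a cache_t member; the buckets in cache_t hold multiple bucket entries, each storing a method implementation imp and a key of type cache_key_t obtained by casting the method's selector sel
For the buckets, mask is mainly used by the hash function during cache lookup
capacity gives the number of buckets in cache_t

The main purpose of the cache is to let the runtime carry out message sending faster through a series of strategies

Afterword

Although there is not much material on cache_t, it is fairly convoluted, and reading the source several times helps it sink in. The next article covers objc_msgSend, which runs before cache_fill_nolock, so it should help to some extent in understanding cache_t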
