iOS Out-Of-Memory: How It Works and a Survey of Solutions

Posted by Joy_xx on 2018-12-30
What is OOM?

OOM stands for Out-Of-Memory. It is an "atypical" crash caused by iOS's Jetsam mechanism. Unlike ordinary crashes, an OOM event cannot be caught by crash-monitoring schemes such as signal handlers.

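Why does OOM happen?

My current guess is that two situations lead to an OOM:

  1. Overall system memory usage is high, and the system kills lower-priority apps according to their priority.
  2. The app currently in use reaches its "high water mark", that is, the system's memory limit for a single app, and the system kills it.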

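Verification approach 1:

Two XNU sources,
opensource.apple.com/source/xnu/…
opensource.apple.com/source/xnu/…
provide a set of functions and macros. With root privileges we can use them to read the OOM memory threshold of every app in its current state, and even modify a process's memory threshold by PID, which raises its OOM threshold.

The most useful pieces for us are: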

    // Holds a process's pid, priority, state, memory limit, and related info
    typedef struct memorystatus_priority_entry {
        pid_t pid;
        int32_t priority;
        uint64_t user_data;
        int32_t limit;
        uint32_t state;
    } memorystatus_priority_entry_t;

    // These macros can be used to query memory limits and related info, and also to modify them
    /* Commands */
    #define MEMORYSTATUS_CMD_GET_PRIORITY_LIST 1
    #define MEMORYSTATUS_CMD_SET_PRIORITY_PROPERTIES 2
    #define MEMORYSTATUS_CMD_GET_JETSAM_SNAPSHOT 3
    #define MEMORYSTATUS_CMD_GET_PRESSURE_STATUS 4
    #define MEMORYSTATUS_CMD_SET_JETSAM_HIGH_WATER_MARK 5 /* Set active memory limit = inactive memory limit, both non-fatal */
    #define MEMORYSTATUS_CMD_SET_JETSAM_TASK_LIMIT 6 /* Set active memory limit = inactive memory limit, both fatal */
    #define MEMORYSTATUS_CMD_SET_MEMLIMIT_PROPERTIES 7 /* Set memory limits plus attributes independently */
    #define MEMORYSTATUS_CMD_GET_MEMLIMIT_PROPERTIES 8 /* Get memory limits plus attributes */
    #define MEMORYSTATUS_CMD_PRIVILEGED_LISTENER_ENABLE 9 /* Set the task's status as a privileged listener w.r.t memory notifications */
    #define MEMORYSTATUS_CMD_PRIVILEGED_LISTENER_DISABLE 10 /* Reset the task's status as a privileged listener w.r.t memory notifications */

    /* Commands that act on a group of processes */
    #define MEMORYSTATUS_CMD_GRP_SET_PROPERTIES 100

We can build a program like the following:

    #include <stdlib.h>
    #include <string.h>
    #include <stdio.h>
    #include "kern_memorystatus.h"

    #define NUM_ENTRIES 1024

    /* Convert kMemorystatus* state flags to a textual representation. */
    char *state_to_text(int State)
    {
        static char returned[80];
        sprintf(returned, "0x%02x ", State);
        if (State & kMemorystatusSuspended) strcat(returned, "Suspended,");
        if (State & kMemorystatusFrozen) strcat(returned, "Frozen,");
        if (State & kMemorystatusWasThawed) strcat(returned, "WasThawed,");
        if (State & kMemorystatusTracked) strcat(returned, "Tracked,");
        if (State & kMemorystatusSupportsIdleExit) strcat(returned, "IdleExit,");
        if (State & kMemorystatusDirty) strcat(returned, "Dirty,");
        /* Drop the trailing comma, if there is one. */
        if (returned[strlen(returned) - 1] == ',') returned[strlen(returned) - 1] = '\0';
        return (returned);
    }

    int main(int argc, char **argv)
    {
        struct memorystatus_priority_entry memstatus[NUM_ENTRIES];
        size_t count = sizeof(struct memorystatus_priority_entry) * NUM_ENTRIES;

        /* call memorystatus_control */
        int rc = memorystatus_control(MEMORYSTATUS_CMD_GET_PRIORITY_LIST, /* 1 - only supported command on OS X */
                                      0,          /* pid */
                                      0,          /* flags */
                                      memstatus,  /* buffer */
                                      count);     /* buffersize */
        if (rc < 0) {
            perror("memorystatus_control");
            exit(rc);
        }

        /* The loop treats rc as the number of bytes returned and consumes one entry per iteration. */
        int entry = 0;
        for (; rc > 0; rc -= sizeof(struct memorystatus_priority_entry)) {
            printf("PID: %5d\tPriority:%2d\tUser Data: %llx\tLimit:%2d\tState:%s\n",
                   memstatus[entry].pid,
                   memstatus[entry].priority,
                   memstatus[entry].user_data,
                   memstatus[entry].limit,
                   state_to_text(memstatus[entry].state));
            entry++;
        }
        return 0;
    }

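Then, using the Command-line Tool template provided by MonkeyDev, install the program on a jailbroken device (the test environment at the time was an iPhone 5s running iOS 9.1), connect to the device over SSH, and run the program from the terminal. This produces a dump like the following: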

    PID:  9967 Priority: 3 User Data: 0 Limit: 6 State:0x38 Tracked,IdleExit,Dirty
    PID: 11151 Priority: 3 User Data: 0 Limit: 6 State:0x38 Tracked,IdleExit,Dirty
    PID: 11154 Priority: 3 User Data: 0 Limit:10 State:0x38 Tracked,IdleExit,Dirty
    PID: 11165 Priority: 3 User Data: 0 Limit: 6 State:0x38 Tracked,IdleExit,Dirty
    PID: 11499 Priority: 3 User Data: 0 Limit:18 State:0x28 Tracked,Dirty
    PID: 10039 Priority: 4 User Data: 2100 Limit:108 State:0x00
    PID:  9981 Priority: 7 User Data: 0 Limit:10 State:0x08 Tracked
    PID:  9977 Priority: 7 User Data: 0 Limit:20 State:0x08 Tracked
    PID:  9979 Priority: 7 User Data: 0 Limit:25 State:0x38 Tracked,IdleExit,Dirty
    PID: 10021 Priority: 7 User Data: 0 Limit: 6 State:0x08 Tracked
    PID: 11575 Priority:10 User Data: 10100 Limit:650 State:0x00
    PID:   103 Priority:11 User Data: 0 Limit:96 State:0x08 Tracked
    PID: 11442 Priority:11 User Data: 0 Limit:38 State:0x08 Tracked
    PID:    67 Priority:12 User Data: 0 Limit:24 State:0x28 Tracked,Dirty
    PID:    31 Priority:14 User Data: 0 Limit:650 State:0x08 Tracked
    PID:    45 Priority:14 User Data: 0 Limit: 9 State:0x08 Tracked

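In the output above, the process with Priority:10 is the app I was testing, 好好學習. The app was in the foreground and active at that moment, which is why its priority is 10, and its OOM memory threshold comes out as 650.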
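Picking the foreground app's limit out of such a dump can be sketched in a few lines of C. This is only an illustration: `priority_entry_t` is a local mirror of the struct shown earlier so the snippet compiles without the XNU header, `foreground_limit_mb` and `FOREGROUND_PRIORITY` are hypothetical names, the value 10 is taken from the observation in this dump, and the limit is presumably expressed in MB.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Local mirror of memorystatus_priority_entry, so no XNU header is needed. */
typedef struct {
    pid_t    pid;
    int32_t  priority;
    uint64_t user_data;
    int32_t  limit;      /* memory limit, presumably in MB */
    uint32_t state;
} priority_entry_t;

/* Assumption: priority 10 marks the foreground app, as seen in the dump. */
#define FOREGROUND_PRIORITY 10

/* Return the limit of the first entry at the foreground priority,
 * or -1 if no entry has that priority. */
int32_t foreground_limit_mb(const priority_entry_t *entries, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].priority == FOREGROUND_PRIORITY) {
            return entries[i].limit;
        }
    }
    return -1;
}
```

Fed with the entries returned by memorystatus_control, this would return 650 for the dump shown here.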

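Verification approach 2:

When our app is killed by jetsam, the system leaves a log on the phone. Under Settings > Privacy > Analytics you can find logs whose names start with JetsamEvent. These logs contain some memory information about the app. Taking my 6s as an example, the value of pageSize * rpages is the threshold, and the log also states the reason as "reason" : "per-process-limit". (Not every JetsamEvent log yields an accurate threshold; some of them are off.)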

    "pageSize" : 16384
    {
      "uuid" : "b8d6682c-5903-3007-b9c2-561d1e6ca9d5",
      "states" : [ "frontmost", "resume" ],
      "killDelta" : 18859,
      "genCount" : 0,
      "age" : 1775369503,
      "purgeable" : 0,
      "fds" : 50,
      "coalition" : 691,
      "rpages" : 89600,
      "reason" : "per-process-limit",
      "pid" : 960,
      "cpuTime" : 1.6920809999999999,
      "name" : "MemoryLimitTest",
      "lifetimeMax" : 34182
    }
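The pageSize * rpages arithmetic can be written out as a small C helper. The function names are hypothetical and only illustrate the calculation; the inputs are the two fields from the log entry.

```c
#include <stdint.h>

/* Resident bytes at kill time: page size multiplied by resident page count. */
uint64_t jetsam_resident_bytes(uint64_t page_size, uint64_t rpages)
{
    return page_size * rpages;
}

/* The same value expressed in whole MiB. */
uint64_t jetsam_resident_mib(uint64_t page_size, uint64_t rpages)
{
    return jetsam_resident_bytes(page_size, rpages) / (1024u * 1024u);
}
```

For this log, 16384 * 89600 = 1,468,006,400 bytes, which is 1400 MiB.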

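Verification approach 3:

You can also find the OOM memory threshold through extensive testing. Stack Overflow already has a list of OOM thresholds for a number of common devices. The values in that list deviate from the real thresholds, and I suspect two reasons. First, the moment at which the tool samples memory cannot coincide exactly with the moment of the OOM; it can only get as close as possible. Second, the way it measures memory differs from the way the jetsam mechanism in XNU measures it. The correct way to measure memory is described below.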

Results of testing with the utility Split wrote (link is in his answer):

device: (crash amount/total amount/percentage of total)
iPad1: 127MB/256MB/49%
iPad2: 275MB/512MB/53%
iPad3: 645MB/1024MB/62%
iPad4: 585MB/1024MB/57% (iOS 8.1)
iPad Mini 1st Generation: 297MB/512MB/58%
iPad Mini retina: 696MB/1024MB/68% (iOS 7.1)
iPad Air: 697MB/1024MB/68%
iPad Air 2: 1383MB/2048MB/68% (iOS 10.2.1)
iPad Pro 9.7": 1395MB/1971MB/71% (iOS 10.0.2 (14A456))
iPad Pro 10.5”: 3057/4000/76% (iOS 11 beta4)
iPad Pro 12.9” (2015): 3058/3999/76% (iOS 11.2.1)
iPad Pro 12.9” (2017): 3057/3974/77% (iOS 11 beta4)
iPod touch 4th gen: 130MB/256MB/51% (iOS 6.1.1)
iPod touch 5th gen: 286MB/512MB/56% (iOS 7.0)
iPhone4: 325MB/512MB/63%
iPhone4s: 286MB/512MB/56%
iPhone5: 645MB/1024MB/62%
iPhone5s: 646MB/1024MB/63%
iPhone6: 645MB/1024MB/62% (iOS 8.x)
iPhone6+: 645MB/1024MB/62% (iOS 8.x)
iPhone6s: 1396MB/2048MB/68% (iOS 9.2)
iPhone6s+: 1392MB/2048MB/68% (iOS 10.2.1)
iPhoneSE: 1395MB/2048MB/69% (iOS 9.3)
iPhone7: 1395/2048MB/68% (iOS 10.2)
iPhone7+: 2040MB/3072MB/66% (iOS 10.2.1)
iPhone X: 1392/2785/50% (iOS 11.2.1)

https://stackoverflow.com/questions/5887248/ios-app-maximum-memory-budget/15200855#15200855

How to measure an app's memory usage correctly

A common way to read an app's memory usage is resident_size, as in the following code:

    #import <mach/mach.h>

    - (int64_t)memoryUsage {
        int64_t memoryUsageInByte = 0;
        struct task_basic_info taskBasicInfo;
        // task_info expects the count in natural_t units rather than bytes
        mach_msg_type_number_t size = TASK_BASIC_INFO_COUNT;
        kern_return_t kernelReturn = task_info(mach_task_self(), TASK_BASIC_INFO, (task_info_t) &taskBasicInfo, &size);
        if (kernelReturn == KERN_SUCCESS) {
            memoryUsageInByte = (int64_t) taskBasicInfo.resident_size;
            NSLog(@"Memory in use (in bytes): %lld", memoryUsageInByte);
        } else {
            NSLog(@"Error with task_info(): %s", mach_error_string(kernelReturn));
        }
        return memoryUsageInByte;
    }

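The correct way, however, is to use phys_footprint, because that is the metric Apple itself uses, and the numbers are only meaningful when they are measured the same way Apple measures them. You can check the source to verify this: opensource.apple.com/source/xnu/…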

    #import <mach/mach.h>

    - (int64_t)memoryUsage {
        int64_t memoryUsageInByte = 0;
        task_vm_info_data_t vmInfo;
        mach_msg_type_number_t count = TASK_VM_INFO_COUNT;
        kern_return_t kernelReturn = task_info(mach_task_self(), TASK_VM_INFO, (task_info_t) &vmInfo, &count);
        if (kernelReturn == KERN_SUCCESS) {
            memoryUsageInByte = (int64_t) vmInfo.phys_footprint;
            NSLog(@"Memory in use (in bytes): %lld", memoryUsageInByte);
        } else {
            NSLog(@"Error with task_info(): %s", mach_error_string(kernelReturn));
        }
        return memoryUsageInByte;
    }

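Approaches to locating OOMs

Approach 1:

The earliest OOM-related approach I came across was described in a Facebook blog post, code.facebook.com/posts/11469…, which uses a process of elimination to estimate the OOM rate. The numbers this produces will inevitably deviate somewhat from reality: for example, ApplicationState can be inaccurate, and watchdog kills end up being counted as OOMs.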
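The elimination idea can be sketched in C as follows. This is a rough illustration rather than Facebook's implementation: the `previous_session_t` fields, the enum, and `classify_previous_exit` are all hypothetical names, and the set of "explained" causes is an assumption about what an app might record between launches.

```c
#include <stdbool.h>

/* What the app knows, at launch, about how the previous session ended.
 * All fields are hypothetical and would have to be persisted by the app. */
typedef struct {
    bool app_upgraded;          /* app version changed since the last run */
    bool os_upgraded;           /* OS version changed since the last run */
    bool crash_reported;        /* a crash reporter captured a crash */
    bool exited_intentionally;  /* the app itself called exit() or abort() */
    bool user_terminated;       /* the user force-quit the app */
    bool was_in_background;     /* last recorded application state */
} previous_session_t;

typedef enum {
    LAUNCH_EXPLAINED,       /* some known cause accounts for the exit */
    LAUNCH_FOREGROUND_OOM,  /* nothing explains it; app was in foreground */
    LAUNCH_BACKGROUND_OOM   /* nothing explains it; app was in background */
} launch_kind_t;

/* Rule out every known cause; whatever remains is counted as an OOM. */
launch_kind_t classify_previous_exit(const previous_session_t *s)
{
    if (s->app_upgraded || s->os_upgraded || s->crash_reported ||
        s->exited_intentionally || s->user_terminated) {
        return LAUNCH_EXPLAINED;
    }
    return s->was_in_background ? LAUNCH_BACKGROUND_OOM : LAUNCH_FOREGROUND_OOM;
}
```

Because the final branch catches everything that was not ruled out, any unrecorded cause (a watchdog kill, a stale `was_in_background` flag) is misclassified as an OOM, which is exactly the source of error noted above.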

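Approach 2:

Tencent has also recently open-sourced its own OOM locating solution, the OOMDetector component: github.com/Tencent/OOM…. It hooks the malloc_logger function pointer in libmalloc and records call stacks, which helps developers locate large allocations. It does have a drawback: dumping stacks frequently hurts app performance, so it can only be enabled for a small gray-release subset of users to collect statistics and diagnose problems.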
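The general technique, a function-pointer hook that is called on every allocation, can be illustrated with a portable C sketch. This does not use libmalloc: `alloc_logger_t`, `g_alloc_logger`, `logged_malloc`, and the 1 MiB threshold are hypothetical stand-ins, and the real malloc_logger hook has a different signature.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hook type; NOT the signature of libmalloc's malloc_logger. */
typedef void (*alloc_logger_t)(size_t size, void *ptr);

/* Installed hook; NULL means logging is disabled. */
alloc_logger_t g_alloc_logger = NULL;

/* Allocation wrapper that reports each successful allocation to the hook. */
void *logged_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL && g_alloc_logger != NULL) {
        g_alloc_logger(size, p);
    }
    return p;
}

/* Assumed cut-off for what counts as a "large" allocation. */
#define LARGE_ALLOC_BYTES (1024u * 1024u)

size_t g_large_count = 0;  /* number of large allocations seen */
size_t g_large_bytes = 0;  /* total bytes of large allocations seen */

/* Example hook: tally only large allocations. A real tool would capture
 * a backtrace here, which is where the performance cost comes from. */
void large_alloc_logger(size_t size, void *ptr)
{
    (void)ptr;
    if (size >= LARGE_ALLOC_BYTES) {
        g_large_count++;
        g_large_bytes += size;
    }
}
```

Setting `g_alloc_logger = large_alloc_logger` turns the tally on. Since the hook runs on every allocation, the work done inside it directly determines the overhead.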

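Approach 3:

Based on the findings above, an app's high water mark, that is, its OOM memory threshold, can be obtained offline. That leads to approach 3:

  • Monitor memory growth, and when usage gets close to the high water mark, dump memory information: object names, object counts, and the memory used by each kind of object. If this proves stable it can be enabled for all users, since it has no performance problems.
  • OOMDetector can capture the stacks of memory allocations, which is more effective for pinpointing the problem at the code level. It can be rolled out to a gray-release subset of users.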
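The trigger in the first bullet can be sketched in C as a one-shot threshold check. The names `footprint_monitor_t` and `monitor_should_dump` are hypothetical, and the 0.9 ratio used in the usage note is an assumed value, not one given in the text. The footprint argument is meant to be the phys_footprint reading described earlier.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t limit_bytes;    /* high water mark measured offline */
    double   trigger_ratio;  /* fraction of the limit that triggers a dump */
    bool     dumped;         /* latch: only one dump per process lifetime */
} footprint_monitor_t;

/* Returns true exactly once: the first time the footprint reaches
 * trigger_ratio * limit_bytes. Later calls return false. */
bool monitor_should_dump(footprint_monitor_t *m, uint64_t footprint_bytes)
{
    if (m->dumped) {
        return false;
    }
    if ((double)footprint_bytes >= (double)m->limit_bytes * m->trigger_ratio) {
        m->dumped = true;
        return true;
    }
    return false;
}
```

With a 650 MiB limit and a ratio of 0.9, a reading of 500 MiB does not trigger, the first reading at or above 585 MiB does, and later readings are ignored. The latch keeps the dump cost to a single occurrence, which is what makes enabling it for all users plausible.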

Source: https://juejin.im/post/5c28646f5188257abf1d947d
