iOS App Launch: What Happens Before the main Function

Posted by LeeWong0632 on 2020-11-28

Original article: https://blog.csdn.net/weixin_43180925/article/details/110261166

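In the previous article we looked at the call stack of an app launch up to the point where objc_init runs. That stack shows that the kernel and dyld do a great deal of work before main is reached. This article walks through that work in detail.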

Let's look at that figure again:

The figure shows that the real entry point of the launch is the _dyld_start function. It is defined in dyldStartup.s in the dyld source:

__dyld_start

__dyld_start:
	// ... assembly omitted ...
	// call dyldbootstrap::start(app_mh, argc, argv, slide, dyld_mh, &startGlue)
	bl	__ZN13dyldbootstrap5startEPK12macho_headeriPPKclS2_Pm
	// ... assembly omitted ...

__dyld_start is written in assembly, so it is hard to read, but we can still see that it calls dyldbootstrap::start, which matches the call stack in the screenshot.

dyldbootstrap::start

dyldbootstrap::start(...) first bootstraps dyld itself and then calls the core function dyld::_main.

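// appsMachHeader: the header of the app's Mach-O file
// argc: argument count, the number of arguments the program was started with
// argv[]: argument values, an array of pointers to the argument strings, one element per argument
// slide: dyld's own slide, i.e. the offset between where dyld was loaded and its preferred address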
uintptr_t start(const struct macho_header* appsMachHeader, int argc, const char* argv[], 
				intptr_t slide, const struct macho_header* dyldsMachHeader,
				uintptr_t* startGlue)
{
	// if kernel had to slide dyld, we need to fix up load sensitive locations
	// we have to do this before using any global variables
	// if dyld was slid, its load-address-sensitive data must be fixed up first
	if ( slide != 0 ) {
		// rebase dyld itself
		rebaseDyld(dyldsMachHeader, slide);
	}

	// allow dyld to use mach messaging
	mach_init();

	// kernel sets up env pointer to be just past end of argv array
	// envp = environment pointer. argv holds argc entries followed by a NULL
	// terminator at argv[argc], so envp starts right behind it at &argv[argc+1]
	const char** envp = &argv[argc+1];
	
	// kernel sets up apple pointer to be just past end of envp array
	const char** apple = envp;
	while(*apple != NULL) { ++apple; }
	++apple;

	// set up random value for stack canary (stack overflow protection)
	__guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
	// run all C++ initializers inside dyld
	runDyldInitializers(dyldsMachHeader, slide, argc, argv, envp, apple);
#endif

	// now that we are done bootstrapping dyld, call dyld's main
	uintptr_t appsSlide = slideOfMainExecutable(appsMachHeader);
	return dyld::_main(appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}

Here is the structure behind macho_header, the key parameter type of start (on 64-bit targets macho_header is a mach_header_64):

struct mach_header_64 {
    uint32_t    magic;      /* magic number: identifies a 64-bit Mach-O file and its byte order */
    cpu_type_t  cputype;    /* CPU type */
    cpu_subtype_t   cpusubtype; /* CPU subtype (specific model) */
    uint32_t    filetype;   /* file type */
    uint32_t    ncmds;      /* number of load commands */
    uint32_t    sizeofcmds; /* total size of all load commands, in bytes */
    uint32_t    flags;      /* flags */
    uint32_t    reserved;   /* reserved, currently unused */
};

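In short, start does the following: if the kernel slid dyld, it rebases dyld with that slide (the slide is the offset used for relocation); it then initializes Mach messaging, locates envp and apple behind argv, sets up the stack canary, runs dyld's own C++ initializers, and finally enters dyld's main function, dyld::_main.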
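The pointer arithmetic that start uses to find envp and apple can be tried in isolation. The following is a minimal sketch, assuming the three arrays sit back to back and each ends with a NULL entry; KernelArgs, splitKernelArgs, and countEntries are hypothetical names for illustration and do not exist in dyld.

```cpp
#include <cstddef>

// Hypothetical helper type (not part of dyld): the three arrays the kernel
// places back to back on the initial stack.
struct KernelArgs {
    const char** argv;
    const char** envp;
    const char** apple;
};

// Mirrors the pointer arithmetic in dyldbootstrap::start.
// Assumed layout: argv[0..argc-1], NULL, envp entries, NULL, apple entries, NULL
inline KernelArgs splitKernelArgs(int argc, const char** argv) {
    const char** envp = &argv[argc + 1];    // skip argv's NULL terminator
    const char** apple = envp;
    while (*apple != nullptr) { ++apple; }  // walk to envp's NULL terminator
    ++apple;                                // apple starts right after it
    return KernelArgs{argv, envp, apple};
}

// Counts the entries of a NULL-terminated string array.
inline std::size_t countEntries(const char** array) {
    std::size_t n = 0;
    while (array[n] != nullptr) { ++n; }
    return n;
}
```

Given an array such as {"app", "-v", NULL, "HOME=/root", NULL, "executable_path=/bin/app", NULL} and argc = 2, envp points at the "HOME=/root" slot and apple at the "executable_path=/bin/app" slot.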

dyld::_main

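// Entry point for dyld. The kernel loads dyld and jumps to __dyld_start, which sets up
// some registers and calls this function. The return value is the address of main()
// in the target program, which __dyld_start then jumps to.
// mainExecutableMH: Mach-O header of the main executable
// mainExecutableSlide: slide of the main executable, used for rebasing; start computes it with slideOfMainExecutable
// argc: number of arguments passed to main
// argv: argument values, argv[0] through argv[argc-1]
// envp[]: the environment variables that have been set up
// apple[]: extra strings from the kernel; the array starts right after the NULL entry that ends envp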
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
		int argc, const char* argv[], const char* envp[], const char* apple[], 
		uintptr_t* startGlue)
{
	uintptr_t result = 0;
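	// 1. set up the runtime environment and process environment variables
	// mainExecutableMH (a macho_header) is the Mach-O header of the main executable;
	// starting from the header, the loader can walk the whole Mach-O file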
	sMainExecutableMachHeader = mainExecutableMH;

	CRSetCrashLogMessage("dyld: launch started");

	// set up the link context: callbacks, arguments, and flags
	setContext(mainExecutableMH, argc, argv, envp, apple);

	// Pickup the pointer to the exec path.
	// apple is an array of "key=value" strings; _simple_getenv returns
	// the value stored under the key "executable_path"
	sExecPath = _simple_getenv(apple, "executable_path");

	// <rdar://problem/13868260> Remove interim apple[0] transition code from dyld
	if (!sExecPath) sExecPath = apple[0];

	// convert the executable path from relative to absolute
	bool ignoreEnvironmentVariables = false;
	// a path that does not start with '/' is relative
	if ( sExecPath[0] != '/' ) {
		// have relative path, use cwd to make absolute
		char cwdbuff[MAXPATHLEN];
	    if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
			// maybe use static buffer to avoid calling malloc so early...
			char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
			strcpy(s, cwdbuff);
			strcat(s, "/");
			strcat(s, sExecPath);
			sExecPath = s;
		}
	}
	// Remember short name of process for later logging
	// strip the directory part to get the file name:
	// strrchr returns a pointer to the last '/' in the string sExecPath points to
	sExecShortName = ::strrchr(sExecPath, '/');
	// if a '/' was found
	if ( sExecShortName != NULL )
		// the name starts right after it
		++sExecShortName;
	else
		// no '/' at all: the whole path is the name
		sExecShortName = sExecPath;

	// determine whether the process is restricted
    sProcessIsRestricted = processRestricted(mainExecutableMH, &ignoreEnvironmentVariables, &sProcessRequiresLibraryValidation);
	// if the process is restricted
    if ( sProcessIsRestricted ) {
#if SUPPORT_LC_DYLD_ENVIRONMENT
		// check environment variables set through load commands:
		// walk every LC_DYLD_ENVIRONMENT load command in the Mach-O and call processDyldEnvironmentVariable() for each variable
		checkLoadCommandEnvironmentVariables();
#endif
		// remove LD_LIBRARY_PATH and every DYLD_* variable from the environment so that child processes do not inherit them
		pruneEnvironmentVariables(envp, &apple);
		// set again because envp and apple may have changed or moved
		setContext(mainExecutableMH, argc, argv, envp, apple);
	}
	else {
		if ( !ignoreEnvironmentVariables )
			// check the environment variables
			checkEnvironmentVariables(envp);
		defaultUninitializedFallbackPaths(envp);
	}

	// debug printing, can be ignored
	if ( sEnv.DYLD_PRINT_OPTS )
		printOptions(argv);
	if ( sEnv.DYLD_PRINT_ENV ) 
		printEnvironmentVariables(envp);

	// get the CPU architecture of the current device
	getHostInfo(mainExecutableMH, mainExecutableSlide);

	// install gdb notifier (used for debugging)
	stateToHandlers(dyld_image_state_dependents_mapped, sBatchHandlers)->push_back(notifyGDB);
	stateToHandlers(dyld_image_state_mapped, sSingleHandlers)->push_back(updateAllImages);
	// make initial allocations large enough that it is unlikely to need to be re-alloced
	sAllImages.reserve(INITIAL_IMAGE_COUNT);
	sImageRoots.reserve(16);
	sAddImageCallbacks.reserve(4);
	sRemoveImageCallbacks.reserve(4);
	sImageFilesNeedingTermination.reserve(16);
	sImageFilesNeedingDOFUnregistration.reserve(8);
	
	
#ifdef WAIT_FOR_SYSTEM_ORDER_HANDSHAKE
	// <rdar://problem/6849505> Add gating mechanism to dyld support system order file generation process
	WAIT_FOR_SYSTEM_ORDER_HANDSHAKE(dyld::gProcessInfo->systemOrderFlag);
#endif
	
	// 2. instantiate the main executable
	try {
		// add dyld itself to UUID list
		addDyldImageToUUIDList();

		CRSetCrashLogMessage(sLoadingCrashMessage);
		// instantiate ImageLoader for main executable
		// (the kernel has already mapped the executable found at sExecPath)
		sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
		// store the main executable's ImageLoader and the related flags in the link context
		gLinkContext.mainExecutable = sMainExecutable;
		gLinkContext.processIsRestricted = sProcessIsRestricted;
		gLinkContext.processRequiresLibraryValidation = sProcessRequiresLibraryValidation;
		gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

		// 3. load shared cache
		checkSharedRegionDisable();
	#if DYLD_SHARED_CACHE_SUPPORT
		if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion )
			// map the shared cache
			mapSharedCache();
	#endif

		// Now that shared cache is loaded, setup an versioned dylib overrides
	#if SUPPORT_VERSIONED_PATHS
		checkVersionedPaths();
	#endif

		// 4. load any inserted libraries
		// walk the DYLD_INSERT_LIBRARIES environment variable and call loadInsertedDylib for each entry;
		// the loaded libraries are added to sAllImages
		if	( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
			for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
				loadInsertedDylib(*lib);
		}
		// record count of inserted libraries so that a flat search will look at 
		// inserted libraries, then main, then others.
		sInsertedDylibCount = sAllImages.size()-1;

		// 5. link main executable
		// the main executable is already stored in gLinkContext.mainExecutable;
		// link() ends up calling ImageLoader::link
		gLinkContext.linkingMainExecutable = true;
		link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL));
		// never unload the main executable or the images it depends on
		sMainExecutable->setNeverUnloadRecursive();
		// MH_FORCE_FLAT flag in the mach-o header
		if ( sMainExecutable->forceFlat() ) {
			gLinkContext.bindFlat = true;
			gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
		}

		// 6. link any inserted libraries
		// do this after linking main executable so that any dylibs pulled in by inserted 
		// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
		// every inserted image in sAllImages (index 1 onward, skipping the main executable) is linked,
		// then registerInterposing is called on each of them
		if ( sInsertedDylibCount > 0 ) {
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				// link
				link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL));
				image->setNeverUnloadRecursive();
			}
			// only INSERTED libraries can interpose
			// register interposing info after all inserted libraries are bound so chaining works
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				// register interposing info; interposition replaces a library function with a function the inserted image supplies
				image->registerInterposing();
			}
		}

		// <rdar://problem/19315404> dyld should support interposition even without DYLD_INSERT_LIBRARIES
		for (int i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
			ImageLoader* image = sAllImages[i];
			if ( image->inSharedCache() )
				continue;
			image->registerInterposing();
		}

		// apply interposing to initial set of images
		for(int i=0; i < sImageRoots.size(); ++i) {
			sImageRoots[i]->applyInterposing(gLinkContext);
		}
		gLinkContext.linkingMainExecutable = false;
		
		// <rdar://problem/12186933> do weak binding only after all inserted images linked
		// 7. perform weak symbol binding
		sMainExecutable->weakBind(gLinkContext);
		
		CRSetCrashLogMessage("dyld: launch, running initializers");
	#if SUPPORT_OLD_CRT_INITIALIZATION
		// Old way is to run initializers via a callback from crt1.o
		if ( ! gRunInitializersOldWay ) 
			initializeMainExecutable(); 
	#else
		// 8. run all initializers
		// +load methods and constructor functions run here;
		// initializeMainExecutable first runs the initializers of the dylibs, then those of the main executable
		initializeMainExecutable(); 
	#endif
		// 9. find entry point for main executable and return it
		result = (uintptr_t)sMainExecutable->getThreadPC();
		if ( result != 0 ) {
			// main executable uses LC_MAIN, needs to return to glue in libdyld.dylib
			if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
				*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
			else
				halt("libdyld.dylib support not present for LC_MAIN");
		}
		else {
			// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
			result = (uintptr_t)sMainExecutable->getMain();
			*startGlue = 0;
		}
	}
	catch(const char* message) {
		syncAllImages();
		halt(message);
	}
	catch(...) {
		dyld::log("dyld: launch failed\n");
	}

	CRSetCrashLogMessage(NULL);
	
	return result;
}

Next we break _main down step by step.

1. Set up the runtime environment and process environment variables

In this step we focus on sExecPath, processRestricted, and getHostInfo.

sExecPath

sExecPath = _simple_getenv(apple, "executable_path");

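As noted in the parameter descriptions, apple is an array of key=value strings that the kernel places after envp. Its executable_path entry holds the path the program was launched with, and _simple_getenv fetches the value stored under that key. The result may be a relative path; dyld treats it as relative when it does not start with '/':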

if ( sExecPath[0] != '/' ) {
	// relative path: prefix it with the current working directory
	char cwdbuff[MAXPATHLEN];
	if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
		// maybe use static buffer to avoid calling malloc so early...
		char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
		// copy the working directory
		strcpy(s, cwdbuff);
		// append the separator and the relative path
		strcat(s, "/");
		strcat(s, sExecPath);
		// reassign
		sExecPath = s;
	}
}

This gives us the absolute path of the executable. From the absolute path we can then get the executable's file name:

sExecShortName = ::strrchr(sExecPath, '/');

strrchr returns a pointer to the last occurrence of '/' in the string sExecPath points to; the short name starts one character after it.
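The two string operations above can be sketched with std::string instead of raw buffers. This is an illustration only: makeAbsolute and shortName are hypothetical names, and dyld itself works on C strings as shown in the listing.

```cpp
#include <string>

// Hypothetical helper: join a working directory and a path the way _main does.
inline std::string makeAbsolute(const std::string& cwd, const std::string& execPath) {
    if (!execPath.empty() && execPath[0] == '/') {
        return execPath;              // already absolute, nothing to do
    }
    return cwd + "/" + execPath;      // relative: prefix with the working directory
}

// Hypothetical helper: the part after the last '/', or the whole string
// when there is no '/' (the same rule as the strrchr code).
inline std::string shortName(const std::string& execPath) {
    const std::string::size_type slash = execPath.rfind('/');
    if (slash == std::string::npos) {
        return execPath;
    }
    return execPath.substr(slash + 1);
}
```

For example, makeAbsolute("/private/var", "Demo.app/Demo") yields "/private/var/Demo.app/Demo", and shortName of that result is "Demo".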

processRestricted

processRestricted decides whether the process is restricted. Here we focus on the Mach-O related check:

// is the process restricted?
static bool processRestricted(const macho_header* mainExecutableMH, bool* ignoreEnvVars, bool* processRequiresLibraryValidation)
{			
	// <rdar://problem/13158444&13245742> Respect __RESTRICT,__restrict section for root processes
	// restricted by segment: a Mach-O that contains a __RESTRICT,__restrict section makes the process restricted
	if ( hasRestrictedSegment(mainExecutableMH) ) {
		// existence of __RESTRICT/__restrict section make process restricted
		sRestrictedReason = restrictedBySegment;
		return true;
	}
    return false;
}

The core of hasRestrictedSegment looks like this (excerpt):

	//dyld::log("seg name: %s\n", seg->segname);
	if (strcmp(seg->segname, "__RESTRICT") == 0) {
		const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
		const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
		for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
			if (strcmp(sect->sectname, "__restrict") == 0) 
				return true;
		}
	}

The function walks every segment in the Mach-O file; the process is restricted when a segment named exactly __RESTRICT contains a section named __restrict.
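The two-level check (segment name first, then section name) can be modeled without any Mach-O headers. The sketch below uses simplified stand-in types; Section, Segment, and hasRestrictedSegmentSketch are hypothetical and only illustrate the rule.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for macho_section / macho_segment_command.
struct Section { std::string sectname; };
struct Segment { std::string segname; std::vector<Section> sections; };

// True only when a segment named "__RESTRICT" holds a section named "__restrict".
inline bool hasRestrictedSegmentSketch(const std::vector<Segment>& segments) {
    for (const Segment& seg : segments) {
        if (seg.segname != "__RESTRICT") {
            continue;                 // exact match on the segment name
        }
        for (const Section& sect : seg.sections) {
            if (sect.sectname == "__restrict") {
                return true;          // exact match on the section name
            }
        }
    }
    return false;
}
```

Note that a __RESTRICT segment without a __restrict section does not count, which is why the section loop is needed.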

getHostInfo

getHostInfo gets the CPU architecture of the current device.

Here is a quick look at its implementation:

static void getHostInfo(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide)
{
#if CPU_SUBTYPES_SUPPORTED
#if __ARM_ARCH_7K__
	sHostCPU		= CPU_TYPE_ARM;
	sHostCPUsubtype = CPU_SUBTYPE_ARM_V7K;
#elif __ARM_ARCH_7A__
	sHostCPU		= CPU_TYPE_ARM;
	sHostCPUsubtype = CPU_SUBTYPE_ARM_V7;
#elif __ARM_ARCH_6K__
	sHostCPU		= CPU_TYPE_ARM;
	sHostCPUsubtype = CPU_SUBTYPE_ARM_V6;
#elif __ARM_ARCH_7F__
	sHostCPU		= CPU_TYPE_ARM;
	sHostCPUsubtype = CPU_SUBTYPE_ARM_V7F;
#elif __ARM_ARCH_7S__
	sHostCPU		= CPU_TYPE_ARM;
	sHostCPUsubtype = CPU_SUBTYPE_ARM_V7S;
#else
	struct host_basic_info info;
	mach_msg_type_number_t count = HOST_BASIC_INFO_COUNT;
	mach_port_t hostPort = mach_host_self();
	kern_return_t result = host_info(hostPort, HOST_BASIC_INFO, (host_info_t)&info, &count);
	if ( result != KERN_SUCCESS )
		throw "host_info() failed";
	sHostCPU		= info.cpu_type;
	sHostCPUsubtype = info.cpu_subtype;
	mach_port_deallocate(mach_task_self(), hostPort);
  #if __x86_64__
	#if TARGET_IPHONE_SIMULATOR
	  sHaswell = false;
	#else
	  sHaswell = (sHostCPUsubtype == CPU_SUBTYPE_X86_64_H);
	  // <rdar://problem/18528074> x86_64h: Fall back to the x86_64 slice if an app requires GC.
	  if ( sHaswell ) {
		if ( isGCProgram(mainExecutableMH, mainExecutableSlide) ) {
			// When running a GC program on a haswell machine, don't use and 'h slices
			sHostCPUsubtype = CPU_SUBTYPE_X86_64_ALL;
			sHaswell = false;
			gLinkContext.sharedRegionMode = ImageLoader::kDontUseSharedRegion;
		}
	  }
	#endif
  #endif
#endif
#endif
}

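Once the environment variables are processed and the CPU information is known, dyld is ready to instantiate the main executable, which is the next step of _main.

2. Instantiate the main executable

This step does two things:

Add dyld to the UUID list
Instantiate an ImageLoader object for the executable

Let's look at each of them in detail.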

addDyldImageToUUIDList

addDyldImageToUUIDList adds dyld itself to the UUID list. Here is its implementation:

// <rdar://problem/10583252> Add dyld to uuidArray to enable symbolication of stackshots
static void addDyldImageToUUIDList()
{
	const struct macho_header* mh = (macho_header*)&__dso_handle;
	const uint32_t cmd_count = mh->ncmds;
	const struct load_command* const cmds = (struct load_command*)((char*)mh + sizeof(macho_header));
	const struct load_command* cmd = cmds;
	for (uint32_t i = 0; i < cmd_count; ++i) {
		switch (cmd->cmd) {
			case LC_UUID: {
				uuid_command* uc = (uuid_command*)cmd;
				// create a dyld_uuid_info
				dyld_uuid_info info;
				// set its imageLoadAddress field
				info.imageLoadAddress = (mach_header*)mh;
				// copy the 16 bytes of uc->uuid into info.imageUUID
				memcpy(info.imageUUID, uc->uuid, 16);
				// append info and update uuidArray and uuidArrayCount in dyld's gProcessInfo
				addNonSharedCacheImageUUID(info);
				return;
			}
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
}

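The code walks the load commands of dyld's own Mach-O header (located through __dso_handle). When it finds the command whose cmd is LC_UUID, it appends that UUID to dyld::gProcessInfo->uuidArray and updates the count.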
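Walking variable-sized load commands is a pattern that recurs throughout dyld, so here is a minimal sketch of it on a plain byte buffer. LoadCommand and findUUID are hypothetical stand-ins (the real types live in <mach-o/loader.h>); the constant 0x1b is the value of LC_UUID, and the 16 UUID bytes are assumed to follow the 8-byte command header directly, as in uuid_command.

```cpp
#include <cstdint>
#include <cstring>

// Simplified stand-in for struct load_command.
struct LoadCommand { uint32_t cmd; uint32_t cmdsize; };
const uint32_t kLC_UUID = 0x1b;  // value of LC_UUID

// Walks `count` load commands stored back to back in `cmds` and copies the
// 16-byte payload of the first LC_UUID command into `out`.
inline bool findUUID(const uint8_t* cmds, uint32_t count, uint8_t out[16]) {
    const uint8_t* p = cmds;
    for (uint32_t i = 0; i < count; ++i) {
        LoadCommand lc;
        std::memcpy(&lc, p, sizeof(lc));       // copy out to avoid unaligned reads
        if (lc.cmd == kLC_UUID) {
            std::memcpy(out, p + sizeof(lc), 16);
            return true;
        }
        p += lc.cmdsize;                        // each command states its own size
    }
    return false;
}
```

The key point is the last line of the loop: commands have different sizes, so the walk advances by cmdsize rather than by sizeof a fixed struct.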

The implementation of addNonSharedCacheImageUUID shows how the UUID is appended:

// append info to sImageUUIDs and publish the array through dyld::gProcessInfo
void addNonSharedCacheImageUUID(const dyld_uuid_info& info)
{
	// set uuidArray to NULL to denote it is in-use
	dyld::gProcessInfo->uuidArray = NULL;
	
	// append all new images
	sImageUUIDs.push_back(info);
	// update uuidArrayCount after the append
	dyld::gProcessInfo->uuidArrayCount = sImageUUIDs.size();
	
	// set uuidArray back to base address of vector (other process can now read)
	dyld::gProcessInfo->uuidArray = &sImageUUIDs[0];
}

instantiateFromLoadedImage

As the name suggests, this function instantiates an ImageLoader. Let's look at it in detail:

// mh: header of the Mach-O file
// slide: the slide (load offset)
// path: path of the executable
static ImageLoader* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
	// try mach-o loader
	// check that the current CPU supports the mach-o's cputype and subtype
	if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
		// instantiate an ImageLoaderMachO for the main executable from the arguments
		ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
		// add the main executable to the global image list sAllImages;
		// addImage also calls addMappedRange() to record the memory ranges the image is mapped into
		addImage(image);
		return image;
	}
	
	throw "main executable not a known format";
}

After the ImageLoaderMachO is instantiated, it is stored in the link context together with some of the values obtained in step 1:

gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.processIsRestricted = sProcessIsRestricted;
gLinkContext.processRequiresLibraryValidation = sProcessRequiresLibraryValidation;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

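3. Load the shared cache

What is the shared cache? Every iOS app depends on system libraries such as UIKit and Foundation. With many apps installed, does each one load its own private copy of UIKit? Of course not: all apps share a single copy of UIKit, and that copy lives in the shared cache.

In this step we focus on checkSharedRegionDisable, mapSharedCache, and checkVersionedPaths: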

checkSharedRegionDisable

static void checkSharedRegionDisable()
{
	// iPhoneOS cannot run without shared region
}

The function contains some macOS-only checks and ends with the comment that iPhoneOS cannot run without the shared region, so there is nothing more to examine here.

mapSharedCache

static void mapSharedCache() {
	// quick check whether the cache is already mapped into the shared region; the call returns -1 if it is not
	if ( _shared_region_check_np(&cacheBaseAddress) == 0 ) {
		if ( (header->mappingOffset >= 0x48) && (header->slideInfoSize != 0) ) {
			// solve for slide by comparing loaded address to address of first region
			const uint8_t* loadedAddress = (uint8_t*)sSharedCache;
			const dyld_cache_mapping_info* const mappings = (dyld_cache_mapping_info*)(loadedAddress+header->mappingOffset);
			const uint8_t* preferedLoadAddress = (uint8_t*)(long)(mappings[0].address);
			// slide = loaded address - preferred address of the first region
			sSharedCacheSlide = loadedAddress - preferedLoadAddress;
			dyld::gProcessInfo->sharedCacheSlide = sSharedCacheSlide;
		}
		// if cache has a uuid, copy it
		if ( header->mappingOffset >= 0x68 ) {
			memcpy(dyld::gProcessInfo->sharedCacheUUID, header->uuid, 16);
		}
	} else {
		if ( (sysctlbyname("kern.safeboot", &safeBootValue, &safeBootValueSize, NULL, 0) == 0) && (safeBootValue != 0) ) {
			// safe-boot mode: delete the cache file
			::unlink(MACOSX_DYLD_SHARED_CACHE_DIR DYLD_SHARED_CACHE_BASE_NAME ARCH_NAME);
			// and do not use the shared region
			gLinkContext.sharedRegionMode = ImageLoader::kDontUseSharedRegion;
			return;
		} else {
			// map in shared cache to shared region
			int fd = openSharedCacheFile();
			if ( fd != -1 ) {
				uint8_t firstPages[8192];
				if ( ::read(fd, firstPages, 8192) == 8192 ) {
					dyld_cache_header* header = (dyld_cache_header*)firstPages;
					for (const dyld_cache_mapping_info* p = fileMappingsStart; p < fileMappingsEnd; ++p, ++i) {
						mappings[i].sfm_address		= p->address;
						mappings[i].sfm_size		= p->size;
						mappings[i].sfm_file_offset	= p->fileOffset;
						mappings[i].sfm_max_prot	= p->maxProt;
						mappings[i].sfm_init_prot	= p->initProt;
						// rdar://problem/5694507 old update_dyld_shared_cache tool could make a cache file
						// that is not page aligned, but otherwise ok.
						if ( p->fileOffset+p->size > (uint64_t)(stat_buf.st_size+4095 & (-4096)) ) {
							dyld::log("dyld: shared cached file is corrupt: %s" DYLD_SHARED_CACHE_BASE_NAME ARCH_NAME "\n", sSharedCacheDir);
							goodCache = false;
						}
						if ( (mappings[i].sfm_init_prot & (VM_PROT_READ|VM_PROT_WRITE)) == (VM_PROT_READ|VM_PROT_WRITE) ) {
							readWriteMappingIndex = i;
						}
						if ( mappings[i].sfm_init_prot == VM_PROT_READ ) {
							readOnlyMappingIndex = i;
						}
						if ( gLinkContext.verboseMapping ) {
							dyld::log("dyld: calling _shared_region_map_and_slide_np() with regions:\n");
							for (int i=0; i < mappingCount; ++i) {
								dyld::log("   address=0x%08llX, size=0x%08llX, fileOffset=0x%08llX\n", mappings[i].sfm_address, mappings[i].sfm_size, mappings[i].sfm_file_offset);
							}
						}
						if (_shared_region_map_and_slide_np(fd, mappingCount, mappings, codeSignatureMappingIndex, cacheSlide, slideInfo, slideInfoSize) == 0) {
							// successfully mapped cache into shared region
							sSharedCache = (dyld_cache_header*)mappings[0].sfm_address;
							sSharedCacheSlide = cacheSlide;
							dyld::gProcessInfo->sharedCacheSlide = cacheSlide;
							//dyld::log("sSharedCache=%p sSharedCacheSlide=0x%08lX\n", sSharedCache, sSharedCacheSlide);
							// if cache has a uuid, copy it
							if ( header->mappingOffset >= 0x68 ) {
								memcpy(dyld::gProcessInfo->sharedCacheUUID, header->uuid, 16);
							}
						}
					}
				}
			}

		}

	}
}

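mapSharedCache() first calls _shared_region_check_np() to check whether the cache is already mapped into the shared region. If it is, the function updates the cache slide and UUID and returns.
If it is not mapped yet, the function checks whether the system is in safe-boot mode; if so, it deletes the cache file and returns. Otherwise it calls openSharedCacheFile(), which opens the cache file under sSharedCacheDir that matches the current CPU architecture (on macOS, for example, /var/db/dyld/dyld_shared_cache_x86_64h). It then reads the first 8192 bytes of the file, parses the dyld_cache_header, stores the parsed mapping information in the mappings variable, and finally calls _shared_region_map_and_slide_np() to do the actual mapping.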
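Two small pieces of arithmetic in mapSharedCache are easy to check by hand: the slide calculation and the page-rounded size test used to detect a corrupt cache file. The sketch below restates both; cacheSlide, roundUpToPage, and mappingFitsInFile are hypothetical names, and the 4096-byte page size is taken from the constants in the listing.

```cpp
#include <cstdint>

// slide = where the cache actually landed - where it preferred to be
inline intptr_t cacheSlide(uintptr_t loadedAddress, uintptr_t preferredAddress) {
    return static_cast<intptr_t>(loadedAddress - preferredAddress);
}

// Round a size up to the next multiple of 4096, the same computation as
// (st_size+4095) & (-4096) in the listing.
inline uint64_t roundUpToPage(uint64_t size) {
    return (size + 4095) & ~static_cast<uint64_t>(4095);
}

// A mapping is acceptable when it ends within the page-rounded file size;
// the listing logs "shared cached file is corrupt" in the opposite case.
inline bool mappingFitsInFile(uint64_t fileOffset, uint64_t size, uint64_t fileSize) {
    return fileOffset + size <= roundUpToPage(fileSize);
}
```

Rounding before comparing is what lets a cache file that is not page aligned, but otherwise fine, pass the check.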

checkVersionedPaths

static void checkVersionedPaths()
{
	// search DYLD_VERSIONED_LIBRARY_PATH directories for dylibs and check if they are newer
	if ( sEnv.DYLD_VERSIONED_LIBRARY_PATH != NULL ) {
		for(const char* const* lp = sEnv.DYLD_VERSIONED_LIBRARY_PATH; *lp != NULL; ++lp) {
			// check whether a dylib in this directory should override the current one
			checkDylibOverridesInDir(*lp);
		}
	}
	// search DYLD_VERSIONED_FRAMEWORK_PATH directories for dylibs and check if they are newer
	if ( sEnv.DYLD_VERSIONED_FRAMEWORK_PATH != NULL ) {
		for(const char* const* fp = sEnv.DYLD_VERSIONED_FRAMEWORK_PATH; *fp != NULL; ++fp) {
			// check whether a framework in this directory should override the current one
			checkFrameworkOverridesInDir(*fp);
		}
	}
}

4. Load inserted dynamic libraries

		// walk the `DYLD_INSERT_LIBRARIES` environment variable and call `loadInsertedDylib` for each library to insert;
		// the loaded libraries are added to the `sAllImages` array
		if	( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
			for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
				// each entry is the path of one library
				loadInsertedDylib(*lib);
		}

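The code above walks sEnv.DYLD_INSERT_LIBRARIES, a NULL-terminated array of library paths (the entries are contiguous, so ++ moves to the next one), and calls loadInsertedDylib to load each library.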
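The environment variable itself is a colon-separated list of paths, while the loop sees it as a NULL-terminated array. The sketch below shows both views; splitInsertList and collect are hypothetical helpers, and skipping empty entries is an assumption made for the example rather than a statement about dyld's parser.

```cpp
#include <string>
#include <vector>

// Splits "a.dylib:b.dylib" into {"a.dylib", "b.dylib"}.
// Assumption for this sketch: empty entries are skipped.
inline std::vector<std::string> splitInsertList(const std::string& value) {
    std::vector<std::string> paths;
    std::string::size_type start = 0;
    while (start <= value.size()) {
        std::string::size_type end = value.find(':', start);
        if (end == std::string::npos) end = value.size();
        if (end > start) paths.push_back(value.substr(start, end - start));
        start = end + 1;
    }
    return paths;
}

// Walks a NULL-terminated array the same way the loop in _main does.
inline std::vector<std::string> collect(const char* const* libs) {
    std::vector<std::string> seen;
    for (const char* const* lib = libs; *lib != nullptr; ++lib) {
        seen.push_back(*lib);   // dyld calls loadInsertedDylib(*lib) at this point
    }
    return seen;
}
```

For instance, splitInsertList("/tmp/a.dylib:/tmp/b.dylib") produces two paths, and collect on the array {"/tmp/a.dylib", "/tmp/b.dylib", NULL} visits the same two entries in order.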

Let's look at loadInsertedDylib in detail (excerpt, error handling omitted):

static void loadInsertedDylib(const char* path)
{
	// the ImageLoader that will be created
	ImageLoader* image = NULL;
	try {
		LoadContext context;
		context.useSearchPaths		= false;
		context.useFallbackPaths	= false;
		context.useLdLibraryPath	= false;
		context.implicitRPath		= false;
		context.matchByInstallName	= false;
		context.dontLoad			= false;
		context.mustBeBundle		= false;
		context.mustBeDylib			= true;
		context.canBePIE			= false;
		context.origin				= NULL;	// can't use @loader_path with DYLD_INSERT_LIBRARIES
		context.rpath				= NULL;
		// build an ImageLoader from the given path and the context set up above
		image = load(path, context);
	}
}

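load first calls loadPhase0 to try loading from a file, and loadPhase0 in turn calls loadPhase1 or loadPhase2. Each additional level refines the path argument a little further, so the call chain is a process of progressively completing the path.

After the inserted libraries are loaded, sInsertedDylibCount is updated: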

		sInsertedDylibCount = sAllImages.size()-1;

The -1 excludes the main executable from the count.

5. Link the main executable

		// the main executable is already stored in gLinkContext.mainExecutable;
		// link() ends up calling ImageLoader::link
		gLinkContext.linkingMainExecutable = true;
		link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL));
		// never unload the main executable or the images it depends on
		sMainExecutable->setNeverUnloadRecursive();
		// MH_FORCE_FLAT flag in the mach-o header
		if ( sMainExecutable->forceFlat() ) {
			gLinkContext.bindFlat = true;
			gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
		}

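This step turns the loaded binary into a usable one: rebase => binding

rebase corrects the binary's internal pointers, because a Mach-O is not loaded at a fixed base address.
binding resolves the external symbols that the binary references.
lazy binding means a symbol is not bound when the dylib is loaded; it is bound the first time the function is called.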

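For example, our Objective-C code uses NSObject, i.e. the symbol _OBJC_CLASS_$_NSObject. That symbol is not defined in our own binary but in a system library (libobjc), so binding is needed to connect the reference to its definition.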
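Both ideas can be shown in a few lines. The following is a conceptual sketch only: rebaseAll and LazySymbol are hypothetical names, and real lazy binding goes through stubs and dyld's binder rather than a C++ object, but the "adjust by slide" and "resolve on first use, then cache" behavior is the same.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Rebase: every stored internal pointer was computed for the preferred base
// address, so each one is adjusted by the slide.
inline void rebaseAll(std::vector<uintptr_t>& pointers, intptr_t slide) {
    for (uintptr_t& p : pointers) {
        p += static_cast<uintptr_t>(slide);
    }
}

// Lazy binding: the symbol is resolved the first time it is needed and the
// result is cached for every later use.
class LazySymbol {
public:
    explicit LazySymbol(std::function<uintptr_t()> resolver)
        : resolver_(std::move(resolver)) {}

    uintptr_t address() {
        if (!bound_) {
            address_ = resolver_();   // the expensive lookup happens only once
            bound_ = true;
        }
        return address_;
    }

    bool isBound() const { return bound_; }

private:
    std::function<uintptr_t()> resolver_;
    uintptr_t address_ = 0;
    bool bound_ = false;
};
```

With a slide of 0x4000, a stored pointer 0x1000 becomes 0x5000 after rebaseAll. A LazySymbol reports isBound() == false until address() is called for the first time, and the resolver runs exactly once however often address() is called afterwards.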

Link

Here we mainly look at the link method (simplified):

void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths)
{
	// recursively load the dependent libraries
	this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths);
	context.notifyBatch(dyld_image_state_dependents_mapped);
	// recursive rebase
	this->recursiveRebase(context);
	context.notifyBatch(dyld_image_state_rebased);
	// recursive bind
	this->recursiveBind(context, forceLazysBound, neverUnload);

	if ( !context.linkingMainExecutable )
		// weak bind
		this->weakBind(context);

	context.notifyBatch(dyld_image_state_bound);

	std::vector<DOFInfo> dofs;
	// recursively collect the DOF sections
	this->recursiveGetDOFSections(context, dofs);
	context.registerDOFs(dofs);
}

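After link the main executable is in a usable state.

6. Link inserted dynamic libraries

Inserted libraries are linked after the main executable, so that any dylibs pulled in by the inserted libraries (e.g. libSystem) do not end up in front of the dylibs the program itself uses.

Like the main executable, inserted libraries are linked by calling link: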

link

// sInsertedDylibCount: number of inserted dynamic libraries
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
	ImageLoader* image = sAllImages[i+1];
	// link
	link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL));
	// never unload the inserted image or the images it depends on
	image->setNeverUnloadRecursive();
}

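sInsertedDylibCount is the number of inserted libraries that addImage placed in sAllImages earlier. The loop visits each of them. Note that the sAllImages index starts at 1 here, because slot 0 holds the main executable.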

registerInterposing

void ImageLoaderMachO::registerInterposing()
{
	// mach-o files advertise interposing by having a __DATA __interpose section
	// so this function works on the __DATA segment of the Mach-O file
	const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
	const struct load_command* cmd = cmds;
	for (uint32_t i = 0; i < cmd_count; ++i) {
		switch (cmd->cmd) {
			// find the LC_SEGMENT_COMMAND entries among the load commands
			case LC_SEGMENT_COMMAND:
				{

					for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
						// look for the __interpose section of the __DATA segment
						if ( ((sect->flags & SECTION_TYPE) == S_INTERPOSING) || ((strcmp(sect->sectname, "__interpose") == 0) && (strcmp(seg->segname, "__DATA") == 0)) ) {

							for (size_t i=0; i < count; ++i) {

								// only tuples whose replacement lives in this image take part in interposing (symbol address replacement)
								if ( this->containsAddress((void*)tuple.replacement) ) {

									// record the replacing and replaced symbols in fgInterposingTuples for lookup when the replacement is applied
									ImageLoader::fgInterposingTuples.push_back(tuple);
								}
							}
						}
					}
				}
				break;
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
}

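registerInterposing() looks for the __interpose section of the __DATA segment and finds the entries that need interposing (also called symbol address replacement). After some checks it stores the replacing and replaced symbol information in the fgInterposingTuples list, which is consulted later when the replacement is actually applied (applyInterposing uses it).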

applyInterposing

applyInterposing() -> recursiveApplyInterposing() -> doInterpose() -> eachBind() -> interposeAt()

Now let's look at interposeAt:

uintptr_t ImageLoaderMachOCompressed::interposeAt(const LinkContext& context, uintptr_t addr, uint8_t type, const char*, 
												uint8_t, intptr_t, long, const char*, LastLookup*, bool runResolver)
{
	if ( type == BIND_TYPE_POINTER ) {
		uintptr_t* fixupLocation = (uintptr_t*)addr;
		uintptr_t curValue = *fixupLocation;
		uintptr_t newValue = interposedAddress(context, curValue, this);
		if ( newValue != curValue)
			*fixupLocation = newValue;
	}
	return 0;
}

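The implementation is simple: it looks up the interposed address for the value currently stored at the location and, if the two differ, writes the new value to that location.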
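The lookup-and-patch step can be sketched with a plain vector of tuples. InterposeTuple, interposedAddressSketch, and interposeAtSketch are hypothetical simplifications; the real interposedAddress takes more context into account (for example which image is asking), which this sketch leaves out.

```cpp
#include <cstdint>
#include <vector>

// Simplified tuple: calls to `replacee` should go to `replacement` instead.
struct InterposeTuple { uintptr_t replacement; uintptr_t replacee; };

// Returns the replacement when some tuple names `address` as its replacee,
// otherwise returns `address` unchanged.
inline uintptr_t interposedAddressSketch(const std::vector<InterposeTuple>& tuples,
                                         uintptr_t address) {
    for (const InterposeTuple& t : tuples) {
        if (t.replacee == address) {
            return t.replacement;
        }
    }
    return address;
}

// Same shape as interposeAt: read the pointer, look it up, write back only on change.
inline void interposeAtSketch(const std::vector<InterposeTuple>& tuples,
                              uintptr_t* fixupLocation) {
    const uintptr_t curValue = *fixupLocation;
    const uintptr_t newValue = interposedAddressSketch(tuples, curValue);
    if (newValue != curValue) {
        *fixupLocation = newValue;
    }
}
```

A pointer slot holding the replacee's address is rewritten to the replacement's address; slots that hold any other address are left untouched.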

7. Perform weak symbol binding


void ImageLoader::weakBind(const LinkContext& context)
{
	ImageLoader* imagesNeedingCoalescing[fgImagesRequiringCoalescing];
	// collect every image in sAllImages that has weak symbols into one list
	int count = context.getCoalescedImages(imagesNeedingCoalescing);
	// don't need to do any coalescing if only one image has overrides, or all have already been done
	if ( (countOfImagesWithWeakDefinitionsNotInSharedCache > 0) && (countNotYetWeakBound > 0) ) {
		// make symbol iterators for each
		ImageLoader::CoalIterator iterators[count];
		ImageLoader::CoalIterator* sortedIts[count];
		for(int i=0; i < count; ++i) {
			// set up a symbol iterator for each image
			imagesNeedingCoalescing[i]->initializeCoalIterator(iterators[i], i);
			sortedIts[i] = &iterators[i];
		}

		int doneCount = 0;
		while ( doneCount != count ) {
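			// collect the weak symbols that need binding:
			// incrementCoalIterator reads weak_bind_off and weak_bind_size from the image's dyld info to locate the weak-binding data, then steps through the symbols one by one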
			if ( sortedIts[0]->image->incrementCoalIterator(*sortedIts[0]) )
				++doneCount;
			// process all matching symbols just before incrementing the lowest one that matches
			if ( sortedIts[0]->symbolMatches && !sortedIts[0]->done ) {

				ImageLoader* targetImage = NULL;
				for(int i=0; i < count; ++i) {
					if ( strcmp(iterators[i].symbolName, nameToCoalesce) == 0 ) {
						if ( iterators[i].weakSymbol ) {
							if ( targetAddr == 0 ) {
								// look up the symbol's address in the export table, following the image load order
								targetAddr = iterators[i].image->getAddressCoalIterator(iterators[i], context);
								if ( targetAddr != 0 )
									targetImage = iterators[i].image;
							}
						}
						else {
							targetAddr = iterators[i].image->getAddressCoalIterator(iterators[i], context);
							if ( targetAddr != 0 ) {
								targetImage = iterators[i].image;
								// strong implementation found, stop searching
								break;
							}
						}
					}
				}

				// tell each to bind to this symbol (unless already bound)
				if ( targetAddr != 0 ) {
					for(int i=0; i < count; ++i) {
						if ( strcmp(iterators[i].symbolName, nameToCoalesce) == 0 ) {
							// do the binding; bindLocation() performs the actual write internally
							iterators[i].image->updateUsesCoalIterator(iterators[i], targetAddr, targetImage, context);
					}
				}
				
			}
		}
}
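The selection rule inside the inner loop of weakBind (the first weak definition is provisional, a strong definition wins and ends the search) can be isolated as follows. Definition and chooseCoalescedAddress are hypothetical names; an address of 0 stands for "this image could not provide an address", as in the listing.

```cpp
#include <cstdint>
#include <vector>

// One candidate definition per image, in image load order.
struct Definition { uintptr_t address; bool weak; };

// Returns the address every image should bind to, or 0 if none was found.
inline uintptr_t chooseCoalescedAddress(const std::vector<Definition>& defs) {
    uintptr_t target = 0;
    for (const Definition& d : defs) {
        if (d.weak) {
            if (target == 0) {
                target = d.address;   // first weak definition is kept provisionally
            }
        } else {
            target = d.address;
            if (target != 0) {
                return target;        // strong implementation found, stop searching
            }
        }
    }
    return target;
}
```

With the candidates (0x100 weak, 0x200 strong, 0x300 weak) the result is 0x200; with only the two weak candidates it is 0x100, the first one in load order.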

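8. Run initializers

This step runs the initializers; +load methods and constructor functions run here.

// initializeMainExecutable runs the initializers; +load methods and constructor functions run here.
// It first runs the initializers of the inserted dylibs, then those of the main executable.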
void initializeMainExecutable()
{
	// record that we've reached this step
	gLinkContext.startedInitializingMainExecutable = true;

	// run initializers for any inserted dylibs
	ImageLoader::InitializerTimingList initializerTimes[sAllImages.size()];

	initializerTimes[0].count = 0;

	const size_t rootCount = sImageRoots.size();
	if ( rootCount > 1 ) {
		// start at index 1: sImageRoots[0] is the main executable, which is
		// initialized separately below; each inserted dylib after it is
		// initialized through ImageLoader::runInitializers
		for(size_t i=1; i < rootCount; ++i) {
			sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
		}
	}
	
	// run initializers for main executable and everything it brings up
	sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
	
	// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
	if ( gLibSystemHelpers != NULL ) 
		(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

	// dump info if requested
	if ( sEnv.DYLD_PRINT_STATISTICS )
		ImageLoaderMachO::printStatistics((unsigned int)sAllImages.size(), initializerTimes[0]);
}

This function mainly calls ImageLoader::runInitializers. Here is its implementation:

runInitializers

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
	// the actual initialization of this ImageLoader's image is done by ImageLoader::processInitializers
	processInitializers(context, thisThread, timingInfo, up);
	context.notifyBatch(dyld_image_state_initialized);
}

processInitializers

void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
									 InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
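	// ups (declared in code omitted from this excerpt) collects the upward dependencies that are still uninitialized.
	// Call recursiveInitialization on every image in the list, which also initializes the dylibs it depends on.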
	for (uintptr_t i=0; i < images.count; ++i) {
		images.images[i]->recursiveInitialization(context, thisThread, timingInfo, ups);
	}
	// If any upward dependencies remain, init them.
	if ( ups.count > 0 )
		processInitializers(context, thisThread, timingInfo, ups);
}

recursiveInitialization

// Initialize an image recursively: the images it depends on are initialized first, then the image itself.
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread,
										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
	// Enter the if only when this image's dependencies have not finished initializing; otherwise return right away.
	if ( fState < dyld_image_state_dependents_initialized-1 ) {
		uint8_t oldState = fState;
		// break cycles
		// Setting the state to just below "dependents initialized" makes a cyclic call back into this image return immediately.
		fState = dyld_image_state_dependents_initialized-1;
		try {
			// initialize lower level libraries first
			// Initialize the lower-level libraries this image depends on first.
			for(unsigned int i=0; i < libraryCount(); ++i) {
				ImageLoader* dependentImage = libImage(i);
				if ( dependentImage != NULL ) {
					// don't try to initialize stuff "above" me yet
					if ( libIsUpward(i) ) {
						uninitUps.images[uninitUps.count] = dependentImage;
						uninitUps.count++;
					}
					else if ( dependentImage->fDepth >= fDepth ) {
						// recursive call
						dependentImage->recursiveInitialization(context, this_thread, timingInfo, uninitUps);
					}
                }
			}
			
			// At this point all lower-level dependencies have been initialized recursively.
			
			// let objc know we are about to initialize this image
			uint64_t t1 = mach_absolute_time();
			fState = dyld_image_state_dependents_initialized;
			oldState = fState;
			// Notify the runtime that the state changed: all of this image's dependencies are initialized.
			// The runtime may have registered a callback for state changes;
			// if so, that callback is triggered here.
			context.notifySingle(dyld_image_state_dependents_initialized, this);
			
			// initialize this image
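			// ImageLoaderMachO::doInitialization calls the image's initializers, which are
			// function pointers to the image's own initialization routines. libSystem.dylib
			// is a special case: as init.c in the libSystem source shows, its
			// libSystem_initializer function is what actually runs at this point,
			// and that is the path that eventually reaches _objc_init.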
			bool hasInitializers = this->doInitialization(context);

			// let anyone know we finished initializing this image
			fState = dyld_image_state_initialized;
			oldState = fState;
			// Notify the runtime that the state changed: this image itself has finished initializing.
			context.notifySingle(dyld_image_state_initialized, this);
			
			if ( hasInitializers ) {
				uint64_t t2 = mach_absolute_time();
				timingInfo.images[timingInfo.count].image = this;
				timingInfo.images[timingInfo.count].initTime = (t2-t1);
				timingInfo.count++;
			}

		}
		catch (const char* msg) {
			// this image is not initialized
			fState = oldState;
			recursiveSpinUnLock();
			throw;
		}
	}
}
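
The traversal order is easier to see in isolation. The sketch below is a simplified model and not dyld's implementation: `Image`, `recursiveInit`, and `demoInitOrder` are hypothetical names, the library graph is purely illustrative, and upward dependencies, locking, and notifications are left out. It keeps only the two ideas from the listing above: dependencies are initialized before the image itself, and the state is changed before recursing so that cycles terminate.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for ImageLoader: a name, a state, and dependencies.
struct Image {
    enum State { Loaded, Initializing, Initialized };
    std::string name;
    std::vector<Image*> deps;
    State state = Loaded;
};

// Depth-first: initialize everything this image depends on, then the image.
// Marking the image as Initializing before recursing breaks dependency cycles.
void recursiveInit(Image& image, std::vector<std::string>& order) {
    if (image.state != Image::Loaded) return;    // already done or in progress
    image.state = Image::Initializing;
    for (Image* dep : image.deps) recursiveInit(*dep, order);
    order.push_back(image.name);                 // "run" this image's initializers
    image.state = Image::Initialized;
}

// Builds a small illustrative graph that contains a cycle and returns the
// order in which the images would be initialized, starting from the app.
std::vector<std::string> demoInitOrder() {
    Image libSystem{"libSystem"}, libobjc{"libobjc"}, foundation{"Foundation"}, app{"app"};
    libSystem.deps = {&libobjc};                 // cycle: libSystem <-> libobjc
    libobjc.deps = {&libSystem};
    foundation.deps = {&libobjc, &libSystem};
    app.deps = {&foundation, &libobjc};

    std::vector<std::string> order;
    recursiveInit(app, order);
    return order;
}
```

In this model the app comes last and the lowest-level library comes first, which is the property the comments in recursiveInitialization describe.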

doInitialization

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
	CRSetCrashLogMessage2(this->getPath());

	// mach-o has -init and static initializers
	// call the Mach-O image's -init function and its static initializers
	doImageInit(context);
	doModInitFunctions(context);
	
	CRSetCrashLogMessage2(NULL);
	
	return (fHasDashInit || fHasInitializers);
}

doImageInit

Gets the address of the Mach-O image's -init function and calls it.

void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
	if ( fHasDashInit ) {
		// number of load commands in the Mach-O file
		const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
		const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
		const struct load_command* cmd = cmds;
		// iterate over the load commands
		for (uint32_t i = 0; i < cmd_count; ++i) {
			switch (cmd->cmd) {
				case LC_ROUTINES_COMMAND:
					// take init_address from the macho_routines_command and apply the slide
					Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
					// <rdar://problem/8543820&9228031> verify initializers are in image
					if ( ! this->containsAddress((void*)func) ) {
						dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
					}
					if ( context.verboseInit )
						dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
					// call the -init function
					func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
					break;
			}
			// advance to the next load command: ((char*)cmd)+cmd->cmdsize
			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
		}
	}
}

doModInitFunctions

Gets the addresses of the Mach-O image's static initializers and calls them.

void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
	if ( fHasInitializers ) {
		// number of load commands in the Mach-O file
		const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
		const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
		const struct load_command* cmd = cmds;
		// iterate over all load commands
		for (uint32_t i = 0; i < cmd_count; ++i) {
			 // only segment commands (LC_SEGMENT_COMMAND) are of interest
			if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
				const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
				const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
				const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
				// iterate over every macho_section from sectionsStart to sectionsEnd
				for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
					const uint8_t type = sect->flags & SECTION_TYPE;
					// only sections that hold initializer function pointers are handled
					if ( type == S_MOD_INIT_FUNC_POINTERS ) {
						Initializer* inits = (Initializer*)(sect->addr + fSlide);
						const size_t count = sect->size / sizeof(uintptr_t);
						for (size_t i=0; i < count; ++i) {
							// fetch the initializer function pointer
							Initializer func = inits[i];
							// <rdar://problem/8543820&9228031> verify initializers are in image
							if ( ! this->containsAddress((void*)func) ) {
								dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
							}
							if ( context.verboseInit )
								dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
							// call the initializer
							func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
						}
					}
				}
			}
			// 根據指令的地址+指令大小獲取到下一個指令
			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
		}
	}
}
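
Stripped of the Mach-O parsing, the inner loop is a walk over a table of function pointers, with a sanity check before each call. The following is a minimal sketch under stated assumptions: `Initializer` here drops the ProgramVars argument, `isTrusted` is a hypothetical stand-in for containsAddress, and `initA`, `initB`, `runInitializerTable`, and `runDemoInitializers` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Same argument shape dyld passes to initializers, minus ProgramVars.
using Initializer = void (*)(int argc, const char* argv[],
                             const char* envp[], const char* apple[]);

static int gInitCalls = 0;
static void initA(int, const char*[], const char*[], const char*[]) { ++gInitCalls; }
static void initB(int, const char*[], const char*[], const char*[]) { ++gInitCalls; }

// Hypothetical stand-in for containsAddress(): accept only known functions.
static bool isTrusted(Initializer f) { return f == initA || f == initB; }

// Walks a table of function pointers, verifies each entry, and calls it.
// Returns the number of initializers that were called.
std::size_t runInitializerTable(const Initializer* table, std::size_t count) {
    const char* argv[] = {"demo", nullptr};
    const char* envp[] = {nullptr};
    const char* apple[] = {nullptr};
    std::size_t called = 0;
    for (std::size_t i = 0; i < count; ++i) {
        Initializer func = table[i];
        if (!isTrusted(func))
            throw std::runtime_error("initializer function not in mapped image");
        func(1, argv, envp, apple);
        ++called;
    }
    return called;
}

// Runs a two-entry table and reports how many initializers actually executed.
int runDemoInitializers() {
    gInitCalls = 0;
    const Initializer table[] = {initA, initB};
    runInitializerTable(table, 2);
    return gInitCalls;
}
```

The check before the call mirrors the intent of the original code: a pointer that does not belong to the image is rejected instead of being executed.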

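9. Find the app entry point and return

This is the last step. Its job is to find the address of the main function and return it.

// 9 find the app entry point and return it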
		result = (uintptr_t)sMainExecutable->getThreadPC();
		if ( result != 0 ) {
			// main executable uses LC_MAIN, needs to return to glue in libdyld.dylib
			if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
				*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
			else
				halt("libdyld.dylib support not present for LC_MAIN");
		}
		else {
			// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
			result = (uintptr_t)sMainExecutable->getMain();
			*startGlue = 0;
		}

getThreadPC

// Look up the main executable's LC_MAIN load command to get the program entry point.
void* ImageLoaderMachO::getThreadPC() const
{
	const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
	const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
	const struct load_command* cmd = cmds;
	for (uint32_t i = 0; i < cmd_count; ++i) {
		if ( cmd->cmd == LC_MAIN ) {
			entry_point_command* mainCmd = (entry_point_command*)cmd;
			void* entry = (void*)(mainCmd->entryoff + (char*)fMachOData);
			// <rdar://problem/8543820&9228031> verify entry point is in image
			if ( this->containsAddress(entry) )
				return entry;
			else
				throw "LC_MAIN entryoff is out of range";
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
	return NULL;
}

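This method iterates over the load commands, finds LC_MAIN, and returns the entry point it describes (entryoff added to the address of the Mach-O header). That address is the address of the main function.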
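
The same walk can be reproduced on a hand-built buffer. The following is a minimal sketch, not dyld code: `LoadCommand` and `EntryPointCommand` are local mirrors of the structures that normally come from `<mach-o/loader.h>`, the 16-byte segment command is a dummy placeholder (a real one is much larger), the offset 0x3F50 is arbitrary, and `findEntryOffset` and `demoFindEntry` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Local mirrors of the Mach-O structures used by the walk.
struct LoadCommand { uint32_t cmd; uint32_t cmdsize; };
struct EntryPointCommand { uint32_t cmd; uint32_t cmdsize; uint64_t entryoff; uint64_t stacksize; };

constexpr uint32_t kLC_SEGMENT_64 = 0x19;
constexpr uint32_t kLC_MAIN = 0x80000028u;   // 0x28 | LC_REQ_DYLD

// Steps through ncmds load commands using cmdsize, as getThreadPC does, and
// returns entryoff of the first LC_MAIN command, or -1 if there is none.
int64_t findEntryOffset(const uint8_t* cmds, uint32_t ncmds) {
    const uint8_t* p = cmds;
    for (uint32_t i = 0; i < ncmds; ++i) {
        LoadCommand lc;
        std::memcpy(&lc, p, sizeof lc);
        if (lc.cmd == kLC_MAIN) {
            EntryPointCommand ep;
            std::memcpy(&ep, p, sizeof ep);
            return static_cast<int64_t>(ep.entryoff);
        }
        p += lc.cmdsize;                     // next command starts right after this one
    }
    return -1;
}

// Lays out a dummy 16-byte command followed by an LC_MAIN command.
int64_t demoFindEntry() {
    std::vector<uint8_t> buf(16 + sizeof(EntryPointCommand), 0);
    LoadCommand other{kLC_SEGMENT_64, 16};
    EntryPointCommand mainCmd{kLC_MAIN, static_cast<uint32_t>(sizeof(EntryPointCommand)), 0x3F50, 0};
    std::memcpy(buf.data(), &other, sizeof other);
    std::memcpy(buf.data() + 16, &mainCmd, sizeof mainCmd);
    return findEntryOffset(buf.data(), 2);
}
```

In the real method the returned offset is added to the address of the Mach-O header, which is why the result is already a usable pointer and needs no separate slide adjustment.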

getMain

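If getThreadPC does not find an LC_MAIN entry point, getMain is used instead.

// Search the LC_UNIXTHREAD load command and return the entry address taken from its saved register state.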
void* ImageLoaderMachO::getMain() const
{
	const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
	const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
	const struct load_command* cmd = cmds;
	for (uint32_t i = 0; i < cmd_count; ++i) {
		switch (cmd->cmd) {
			case LC_UNIXTHREAD:
			{
			#if __i386__
				const i386_thread_state_t* registers = (i386_thread_state_t*)(((char*)cmd) + 16);
				void* entry = (void*)(registers->eip + fSlide);
			#elif __x86_64__
				const x86_thread_state64_t* registers = (x86_thread_state64_t*)(((char*)cmd) + 16);
				void* entry = (void*)(registers->rip + fSlide);
			#elif __arm__
				const arm_thread_state_t* registers = (arm_thread_state_t*)(((char*)cmd) + 16);
				void* entry = (void*)(registers->__pc + fSlide);
			#elif __arm64__
				const arm_thread_state64_t* registers = (arm_thread_state64_t*)(((char*)cmd) + 16);
				void* entry = (void*)(registers->__pc + fSlide);
			#else
				#warning need processor specific code
			#endif
				// <rdar://problem/8543820&9228031> verify entry point is in image
				if ( this->containsAddress(entry) ) {
					return entry;
				}
			}
			break;
		}
		cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
	}
	throw "no valid entry point";
}

objc_init

From the steps above we know that _objc_init is actually reached while ImageLoaderMachO::doModInitFunctions is running. Let's look at the code first:

doModInitFunctions

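// Excerpt from doModInitFunctions; the load-command and section loops that set up inits and count are omitted.
void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
	for (size_t i=0; i < count; ++i) {
		// fetch the initializer function pointer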
		Initializer func = inits[i];
		// <rdar://problem/8543820&9228031> verify initializers are in image
		if ( ! this->containsAddress((void*)func) ) {
			dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
		}
		if ( context.verboseInit )
			dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
		// call the initializer
		func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
	}
}

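doModInitFunctions simply calls each Initializer function. For libSystem.dylib that initializer is the libSystem_initializer function mentioned earlier, which in turn calls libdispatch_init. Let's look at that function:

libdispatch_init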

void
libdispatch_init(void)
{

	_dispatch_hw_config_init();
	_dispatch_time_init();
	_dispatch_vtable_init();
	_os_object_init();
	_voucher_init();
	_dispatch_introspection_init();
}

The function above calls _os_object_init, so let's look at its implementation next:

_os_object_init

void
_os_object_init(void)
{
   // omitted...
	_objc_init();
	// omitted...
}

The code above shows that _os_object_init calls the familiar _objc_init, and with that the Objective-C runtime starts up.

_objc_init

void _objc_init(void)
{
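  // code omitted...
	/*
	 For use by the objc runtime only. Registers handlers to be called when objc images are mapped, unmapped, and initialized.
	 dyld calls the `mapped` function with an array of the images that contain an objc-image-info section.
	 Images that are dylibs have their reference counts bumped automatically, so objc no longer needs to call dlopen() on them to keep them from being unloaded.
	 During the call to _dyld_objc_notify_register(), dyld calls `mapped` with the objc images that are already loaded; it also calls `mapped` during any later dlopen().
	 dyld calls the `init` function when it is about to run the initializers of an image.
	 That is when objc calls any +load methods in that image.
	 */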
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

The main thing this function does is register the map_images and load_images callbacks (plus unmap_image) with dyld.
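
The registration contract described in the comment can be modeled in a few lines. The following is a simplified sketch, not dyld's implementation: `MiniLoader` and `demoNotifyOrder` are hypothetical, the unmapped callback is left out, and images are represented by plain strings. It shows the two behaviors the comment describes: registering replays `mapped` for images that are already loaded, and `init` fires when an image is about to be initialized.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the loader side of the registration.
class MiniLoader {
public:
    using Callback = std::function<void(const std::string& imagePath)>;

    // Loads an image; if a handler is registered (the dlopen-like case after
    // startup), it is notified right away.
    void loadImage(const std::string& path) {
        images_.push_back(path);
        if (mapped_) mapped_(path);
    }

    // Registering replays "mapped" for every image that is already loaded.
    void registerNotify(Callback mapped, Callback init) {
        mapped_ = std::move(mapped);
        init_ = std::move(init);
        for (const auto& path : images_) mapped_(path);
    }

    // Called when the loader is about to run an image's initializers.
    void initializeImage(const std::string& path) {
        if (init_) init_(path);
    }

private:
    std::vector<std::string> images_;
    Callback mapped_;
    Callback init_;
};

// Returns the sequence of notifications for a small illustrative scenario.
std::vector<std::string> demoNotifyOrder() {
    std::vector<std::string> events;
    MiniLoader loader;
    loader.loadImage("libobjc");
    loader.loadImage("Foundation");
    loader.registerNotify(
        [&](const std::string& p) { events.push_back("map " + p); },
        [&](const std::string& p) { events.push_back("load " + p); });
    loader.initializeImage("Foundation");
    loader.loadImage("Plugin");
    return events;
}
```

In this model "map" plays the role of map_images and "load" the role of load_images; the image names are illustrative only.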

map_images

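// map_images takes the runtime lock and forwards to map_images_nolock, shown here in abridged form.
void 
map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    // hList and hCount are built from mhdrs in code omitted from this excerpt:
    // they hold the headers of the images that contain Objective-C metadata.
    if (hCount > 0) {
        // read the classes, categories, protocols and selectors of those images
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);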
    }

}

load_images

void
load_images(const char *path __unused, const struct mach_header *mh)
{

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        // collect the +load methods of the classes and categories in this image
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    // call the collected +load methods: classes first, then categories
    call_load_methods();
}
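
The split into a prepare phase and a call phase can be modeled as two queues. The following is a minimal sketch, not the runtime's code: `LoadScheduler` and `demoLoadOrder` are hypothetical names, superclass ordering and re-entrancy are ignored, and the class and category names are made up. It illustrates only that every pending class +load is called before any category +load.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of the +load bookkeeping.
class LoadScheduler {
public:
    // Prepare phase: only record what has to be called later.
    void addClass(const std::string& cls) { classes_.push_back(cls); }
    void addCategory(const std::string& cls, const std::string& category) {
        categories_.push_back(cls + "(" + category + ")");
    }

    // Call phase: every pending class +load runs before any category +load.
    std::vector<std::string> callLoadMethods() {
        std::vector<std::string> called;
        for (const auto& c : classes_) called.push_back("+[" + c + " load]");
        for (const auto& c : categories_) called.push_back("+[" + c + " load]");
        classes_.clear();
        categories_.clear();
        return called;
    }

private:
    std::vector<std::string> classes_;
    std::vector<std::string> categories_;
};

// The category is discovered first but its +load is still called last.
std::vector<std::string> demoLoadOrder() {
    LoadScheduler scheduler;
    scheduler.addCategory("Person", "Logging");
    scheduler.addClass("Person");
    return scheduler.callLoadMethods();
}
```

Separating the phases is what allows the real load_images to hold runtimeLock only while collecting and to release it before running user code in +load.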

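Summary

The process described above covers everything the system does before the main function of the app starts executing. Some of the steps are still not entirely clear to me, so this write-up needs further refinement.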

