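Here is a short walk-through of how the ARM kernel sets the boundary between low memory and high memory at boot. That boundary is also where kernel logical addresses end and kernel virtual addresses begin (expressed as a virtual address; subtract the fixed offset to get the physical one). The analysis uses Linux-3.0.
First, locate the file that sets the start of the kernel virtual address range (that is, the last kernel logical address + 1): init.c (arch\arm\mm). The function void __init bootmem_init(void) in that file reads as follows: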
void __init bootmem_init(void)
{
	unsigned long min, max_low, max_high;

	max_low = max_high = 0;

	find_limits(&min, &max_low, &max_high);

	arm_bootmem_init(min, max_low);

	/*
	 * Sparsemem tries to allocate bootmem in memory_present(),
	 * so must be done after the fixed reservations
	 */
	arm_memory_present();

	/*
	 * sparse_init() needs the bootmem allocator up and running.
	 */
	sparse_init();

	/*
	 * Now free the memory - free_area_init_node needs
	 * the sparse mem_map arrays initialized by sparse_init()
	 * for memmap_init_zone(), otherwise all PFNs are invalid.
	 */
	arm_bootmem_free(min, max_low, max_high);

	high_memory = __va(((phys_addr_t)max_low << PAGE_SHIFT) - 1) + 1;

	/*
	 * This doesn't seem to be used by the Linux memory manager any
	 * more, but is used by ll_rw_block. If we can get rid of it, we
	 * also get rid of some of the stuff above as well.
	 *
	 * Note: max_low_pfn and max_pfn reflect the number of _pages_ in
	 * the system, not the maximum PFN.
	 */
	max_low_pfn = max_low - PHYS_PFN_OFFSET;
	max_pfn = max_high - PHYS_PFN_OFFSET;
}
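To see what the high_memory assignment near the end of this function computes, here is a small userspace sketch. It is not kernel code: model_va() is a hypothetical stand-in for __va() on the linear mapping, and the values of PAGE_SHIFT, PAGE_OFFSET and PHYS_OFFSET are assumptions picked purely for illustration.

```c
#include <stdint.h>

/* Illustrative values only (assumptions, not taken from a real board). */
#define PAGE_SHIFT   12            /* 4 KiB pages */
#define PAGE_OFFSET  0xC0000000u   /* start of the kernel linear mapping */
#define PHYS_OFFSET  0x30000000u   /* physical address of the first RAM byte */

/* Hypothetical model of __va(): virt = phys - PHYS_OFFSET + PAGE_OFFSET. */
static uint32_t model_va(uint32_t phys)
{
	return phys - PHYS_OFFSET + PAGE_OFFSET;
}

/* Mirrors: high_memory = __va(((phys_addr_t)max_low << PAGE_SHIFT) - 1) + 1; */
static uint32_t model_high_memory(uint32_t max_low_pfn)
{
	/* Physical address of the last byte of low memory. */
	uint32_t last_low_byte = (max_low_pfn << PAGE_SHIFT) - 1;

	/* Translate that byte, then step one byte past it. */
	return model_va(last_low_byte) + 1;
}
```

With these assumed values, a max_low of 0x34000 (64 MiB of low memory starting at 0x30000000) gives 0xC4000000. Translating the last low-memory byte and only then adding 1 presumably keeps __va() from ever being handed an address that lies just outside low memory.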
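The key statement is high_memory = __va(((phys_addr_t)max_low << PAGE_SHIFT) - 1) + 1;. It tells us that max_low marks where high memory starts: it is a page frame number, and shifting it left by PAGE_SHIFT gives the physical address. So how is max_low obtained? From the code above you can guess that it is set in find_limits(&min, &max_low, &max_high); (in the same file):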
static void __init find_limits(unsigned long *min, unsigned long *max_low,
	unsigned long *max_high)
{
	struct meminfo *mi = &meminfo;
	int i;

	*min = -1UL;
	*max_low = *max_high = 0;

	for_each_bank (i, mi) {
		struct membank *bank = &mi->bank[i];
		unsigned long start, end;

		start = bank_pfn_start(bank);
		end = bank_pfn_end(bank);

		if (*min > start)
			*min = start;
		if (*max_high < end)
			*max_high = end;
		if (bank->highmem)
			continue;
		if (*max_low < end)
			*max_low = end;
	}
}
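What this function does is straightforward: it scans every bank recorded in struct meminfo *mi = &meminfo; (a structure holding an array of memory banks) and fills in the three variables the pointers refer to, each as a page frame number (PFN):
min : start of physical memory
max_low : end of the low memory region
max_high : end of the high memory region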
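As you can see, max_low and max_high end up different only because of bank->highmem, the flag that marks a memory bank as high memory:
“If a bank is high memory (bank->highmem != 0), the max_low = end; statement is skipped, so max_low and max_high will differ (in practice max_high > max_low);
otherwise, if no bank is high memory (every bank->highmem == 0), max_low and max_high are necessarily equal (the high memory size is 0).”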
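As I read it, all of this relies on the banks in the meminfo array being sorted by ascending address. No need to worry about that, though; we will see where it happens later :)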
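The scan described above can be tried outside the kernel. The following is a minimal sketch, assuming a hypothetical struct bank_model that already stores page frame numbers (the real code derives them with bank_pfn_start() and bank_pfn_end()); the loop body follows find_limits() step by step.

```c
#include <stddef.h>

/* Hypothetical, simplified bank record that holds PFNs directly. */
struct bank_model {
	unsigned long start_pfn;  /* first page frame of the bank */
	unsigned long end_pfn;    /* one past the last page frame */
	int highmem;              /* non-zero if the bank is high memory */
};

/* Userspace model of find_limits(): same comparisons, same order. */
static void find_limits_model(const struct bank_model *banks, size_t n,
			      unsigned long *min, unsigned long *max_low,
			      unsigned long *max_high)
{
	*min = -1UL;
	*max_low = *max_high = 0;

	for (size_t i = 0; i < n; i++) {
		if (*min > banks[i].start_pfn)
			*min = banks[i].start_pfn;
		if (*max_high < banks[i].end_pfn)
			*max_high = banks[i].end_pfn;
		if (banks[i].highmem)
			continue;  /* high memory never raises max_low */
		if (*max_low < banks[i].end_pfn)
			*max_low = banks[i].end_pfn;
	}
}
```

Feeding it a low bank covering PFNs 0x30000 to 0x60000 and a high bank covering 0x60000 to 0x70000 yields min = 0x30000, max_low = 0x60000 and max_high = 0x70000, which is exactly the "max_high > max_low" case.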
The trail above leads to a global variable (in the same file):
/*
* This keeps memory configuration data used by a couple memory
* initialization functions, as well as show_mem() for the skipping
* of holes in the memory map. It is populated by arm_add_memory().
*/
struct meminfo meminfo;
The structure is defined in setup.h (arch\arm\include\asm):
/*
 * Memory map description
 */
#define NR_BANKS 8 /* ARM currently supports at most 8 banks */

struct membank {
	phys_addr_t start;
	unsigned long size;
	unsigned int highmem; /* the field we care about */
};

struct meminfo {
	int nr_banks;
	struct membank bank[NR_BANKS]; /* the array we care about */
};

extern struct meminfo meminfo;

#define for_each_bank(iter,mi) \
	for (iter = 0; iter < (mi)->nr_banks; iter++)

#define bank_pfn_start(bank) __phys_to_pfn((bank)->start)
#define bank_pfn_end(bank) __phys_to_pfn((bank)->start + (bank)->size)
#define bank_pfn_size(bank) ((bank)->size >> PAGE_SHIFT)
#define bank_phys_start(bank) (bank)->start
#define bank_phys_end(bank) ((bank)->start + (bank)->size)
#define bank_phys_size(bank) (bank)->size
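Once we find where this global variable is initialized and sorted, we will know how high memory gets configured. With the goal clear, let's go on.
Searching the source turns up the relevant code in setup.c (arch\arm\kernel). Among the functions run early in boot (you can work out the exact order by tracing the ARM kernel boot flow yourself; I plan to write that up later) is this one: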
void __init setup_arch(char **cmdline_p)
{
	struct machine_desc *mdesc;

	unwind_init();

	setup_processor();
	mdesc = setup_machine_fdt(__atags_pointer);
	if (!mdesc)
		mdesc = setup_machine_tags(machine_arch_type);
	machine_desc = mdesc;
	machine_name = mdesc->name;

	if (mdesc->soft_reboot)
		reboot_setup("s");

	init_mm.start_code = (unsigned long) _text;
	init_mm.end_code = (unsigned long) _etext;
	init_mm.end_data = (unsigned long) _edata;
	init_mm.brk = (unsigned long) _end;

	/* Fill cmd_line for later use while keeping boot_command_line intact */
	strlcpy(cmd_line, boot_command_line, COMMAND_LINE_SIZE); /* copy the data in boot_command_line into cmd_line */
	*cmdline_p = cmd_line;

	parse_early_param(); /* Parses boot_command_line (the kernel boot parameter string).
	                      * This is where mem=size@start is parsed to initialize struct meminfo meminfo;
	                      * a vmalloc=size parameter, if present, also initializes vmalloc_min */

	sanity_check_meminfo(); /* Sets the highmem field of every bank in struct meminfo meminfo here,
	                         * using vmalloc_min to decide whether each bank's memory is high memory */
	arm_memblock_init(&meminfo, mdesc); /* The banks are sorted by ascending address here */

	paging_init(mdesc);
	request_standard_resources(mdesc);

	unflatten_device_tree();

#ifdef CONFIG_SMP
	if (is_smp())
		smp_init_cpus();
#endif
	reserve_crashkernel();

	cpu_init();
	tcm_init();

#ifdef CONFIG_MULTI_IRQ_HANDLER
	handle_arch_irq = mdesc->handle_irq;
#endif

#ifdef CONFIG_VT
#if defined(CONFIG_VGA_CONSOLE)
	conswitchp = &vga_con;
#elif defined(CONFIG_DUMMY_CONSOLE)
	conswitchp = &dummy_con;
#endif
#endif
	early_trap_init();

	if (mdesc->init_early)
		mdesc->init_early();
}
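I have marked the key points in the comments above; here they are in more detail:
(1) Getting the parameters
parse_early_param(); parses many strings from the kernel command line, but for this memory analysis two parameters matter:
The mem=size@start parameter supplies the information used to initialize struct meminfo meminfo; (the memory information we have been following). The functions that collect it (also in setup.c (arch\arm\kernel)):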
int __init arm_add_memory(phys_addr_t start, unsigned long size)
{
	struct membank *bank = &meminfo.bank[meminfo.nr_banks];

	if (meminfo.nr_banks >= NR_BANKS) {
		printk(KERN_CRIT "NR_BANKS too low, "
			"ignoring memory at 0x%08llx\n", (long long)start);
		return -EINVAL;
	}

	/*
	 * Ensure that start/size are aligned to a page boundary.
	 * Size is appropriately rounded down, start is rounded up.
	 */
	size -= start & ~PAGE_MASK;
	bank->start = PAGE_ALIGN(start);
	bank->size = size & PAGE_MASK;

	/*
	 * Check whether this memory region has non-zero size or
	 * invalid node number.
	 */
	if (bank->size == 0)
		return -EINVAL;

	meminfo.nr_banks++;
	return 0;
}

/*
 * Pick out the memory size. We look for mem=size@start,
 * where start and size are "size[KkMm]"
 */
static int __init early_mem(char *p)
{
	static int usermem __initdata = 0;
	unsigned long size;
	phys_addr_t start;
	char *endp;

	/*
	 * If the user specifies memory size, we
	 * blow away any automatically generated
	 * size.
	 */
	if (usermem == 0) {
		usermem = 1;
		meminfo.nr_banks = 0;
	}

	start = PHYS_OFFSET;
	size = memparse(p, &endp);
	if (*endp == '@')
		start = memparse(endp + 1, NULL);

	arm_add_memory(start, size);

	return 0;
}
early_param("mem", early_mem);
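To make the parsing and page rounding concrete, here is a userspace sketch of the same two steps. Everything named *_model is hypothetical: memparse_model() is a simplified stand-in for the kernel's memparse() built on strtoull(), the page size is assumed to be 4 KiB, and PHYS_OFFSET is an assumed value, since the real one is platform specific.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed values for illustration only. */
#define PAGE_SIZE     4096u
#define PAGE_MASK     (~(PAGE_SIZE - 1))
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
#define PHYS_OFFSET   0x30000000u

struct bank_model {
	uint32_t start;
	uint32_t size;
};

/* Simplified stand-in for memparse(): a number plus optional K/M/G suffix. */
static unsigned long long memparse_model(const char *s, char **retptr)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	if (retptr)
		*retptr = end;
	return v;
}

/* Same rounding as arm_add_memory(); returns -1 for an empty bank. */
static int add_memory_model(uint32_t start, uint32_t size,
			    struct bank_model *out)
{
	size -= start & ~PAGE_MASK;      /* drop the partial first page */
	out->start = PAGE_ALIGN(start);  /* start is rounded up */
	out->size = size & PAGE_MASK;    /* size is rounded down */
	return out->size == 0 ? -1 : 0;
}

/* Same flow as early_mem(): parse "size[@start]" and record one bank. */
static int parse_mem_model(const char *arg, struct bank_model *out)
{
	char *endp;
	uint32_t start = PHYS_OFFSET;
	uint32_t size = (uint32_t)memparse_model(arg, &endp);

	if (*endp == '@')
		start = (uint32_t)memparse_model(endp + 1, NULL);
	return add_memory_model(start, size, out);
}
```

For "64M@0x30000000" the bank comes out as start 0x30000000, size 0x04000000. For the deliberately misaligned "1M@0x30000400" the start is rounded up to 0x30001000 and the size shrinks to 0x000FF000, so the bank never extends past the memory that was actually described.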
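The vmalloc=size parameter supplies the information used to initialize vmalloc_min. vmalloc_min is the lowest virtual address at which the vmalloc area may begin; VMALLOC_END - vmalloc_min is therefore the minimum amount of kernel virtual address space kept in reserve, that is, what remains after the logical address space and the guard hole that catches out-of-bounds accesses. The implementation (in mmu.c (arch\arm\mm)):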
static void * __initdata vmalloc_min = (void *)(VMALLOC_END - SZ_128M);
/* Default: keeps 128MB of kernel virtual address space in reserve */
/* A side note on VMALLOC_END: it depends on the chip architecture and is
 * not necessarily 0xffffffff; for example it is 0xF6000000UL on the 2410
 * and 0xe8000000UL on pxa */

/*
 * vmalloc=size forces the vmalloc area to be exactly 'size'
 * bytes. This can be used to increase (or decrease) the vmalloc
 * area - the default is 128m.
 */
static int __init early_vmalloc(char *arg)
{
	unsigned long vmalloc_reserve = memparse(arg, NULL);

	if (vmalloc_reserve < SZ_16M) {
		vmalloc_reserve = SZ_16M;
		printk(KERN_WARNING
			"vmalloc area too small, limiting to %luMB\n",
			vmalloc_reserve >> 20);
	} /* Sanity check: the minimum is 16MB */

	if (vmalloc_reserve > VMALLOC_END - (PAGE_OFFSET + SZ_32M)) {
		vmalloc_reserve = VMALLOC_END - (PAGE_OFFSET + SZ_32M);
		printk(KERN_WARNING
			"vmalloc area is too big, limiting to %luMB\n",
			vmalloc_reserve >> 20);
	} /* Sanity check: the maximum is the kernel virtual address space minus 32MB,
	   * i.e. only 32MB of logical address space is left and everything else is kept in reserve */

	vmalloc_min = (void *)(VMALLOC_END - vmalloc_reserve);
	/* vmalloc_min is really the lowest address of the kernel virtual address space
	 * (not counting logical addresses) that can later be used for mapping allocations */
	return 0;
}
early_param("vmalloc", early_vmalloc);
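The two clamps in early_vmalloc() are easy to check with plain integers. This is a rough sketch rather than the kernel function: vmalloc_min_model() is a hypothetical name, PAGE_OFFSET is assumed to be 0xC0000000, and VMALLOC_END uses the 2410 value quoted in the comment above.

```c
#include <stdint.h>

/* Assumed layout for illustration. */
#define SZ_16M      0x01000000u
#define SZ_32M      0x02000000u
#define SZ_128M     0x08000000u
#define PAGE_OFFSET 0xC0000000u
#define VMALLOC_END 0xF6000000u  /* the 2410 value mentioned above */

/* Returns the model's vmalloc_min for a requested vmalloc=size (in bytes). */
static uint32_t vmalloc_min_model(uint32_t requested)
{
	uint32_t reserve = requested;

	/* Lower bound: never reserve less than 16 MiB. */
	if (reserve < SZ_16M)
		reserve = SZ_16M;

	/* Upper bound: always leave 32 MiB of logical address space. */
	if (reserve > VMALLOC_END - (PAGE_OFFSET + SZ_32M))
		reserve = VMALLOC_END - (PAGE_OFFSET + SZ_32M);

	return VMALLOC_END - reserve;
}
```

Under these assumptions the default of 128 MiB puts vmalloc_min at 0xEE000000, which leaves 0xEE000000 - 0xC0000000 = 0x2E000000 (736 MiB) of virtual space for directly mapped low memory. A request of 1 MiB is raised to 16 MiB (vmalloc_min = 0xF5000000), and an oversized request is capped so that vmalloc_min never drops below 0xC2000000.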
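(2) With the necessary information in hand (struct meminfo meminfo and vmalloc_min both initialized), the kernel calls sanity_check_meminfo, which uses vmalloc_min to set the highmem member of every meminfo.bank[] entry. Where necessary this step also modifies the bank array in meminfo, for example by splitting a bank that straddles the boundary. The function is in mmu.c (arch\arm\mm):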
static phys_addr_t lowmem_limit __initdata = 0;

void __init sanity_check_meminfo(void)
{
	int i, j, highmem = 0;

	for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
		struct membank *bank = &meminfo.bank[j];
		*bank = meminfo.bank[i];

#ifdef CONFIG_HIGHMEM
		if (__va(bank->start) >= vmalloc_min ||
		    __va(bank->start) < (void *)PAGE_OFFSET)
			highmem = 1;

		bank->highmem = highmem;

		/*
		 * Split those memory banks which are partially overlapping
		 * the vmalloc area greatly simplifying things later.
		 */
		if (__va(bank->start) < vmalloc_min &&
		    bank->size > vmalloc_min - __va(bank->start)) {
			if (meminfo.nr_banks >= NR_BANKS) {
				printk(KERN_CRIT "NR_BANKS too low, "
						 "ignoring high memory\n");
			} else {
				memmove(bank + 1, bank,
					(meminfo.nr_banks - i) * sizeof(*bank));
				meminfo.nr_banks++;
				i++;
				bank[1].size -= vmalloc_min - __va(bank->start);
				bank[1].start = __pa(vmalloc_min - 1) + 1;
				bank[1].highmem = highmem = 1;
				j++;
			}
			bank->size = vmalloc_min - __va(bank->start);
		}
#else
		bank->highmem = highmem;

		/*
		 * Check whether this memory bank would entirely overlap
		 * the vmalloc area.
		 */
		if (__va(bank->start) >= vmalloc_min ||
		    __va(bank->start) < (void *)PAGE_OFFSET) {
			printk(KERN_NOTICE "Ignoring RAM at %.8llx-%.8llx "
			       "(vmalloc region overlap).\n",
			       (unsigned long long)bank->start,
			       (unsigned long long)bank->start + bank->size - 1);
			continue;
		}

		/*
		 * Check whether this memory bank would partially overlap
		 * the vmalloc area.
		 */
		if (__va(bank->start + bank->size) > vmalloc_min ||
		    __va(bank->start + bank->size) < __va(bank->start)) {
			unsigned long newsize = vmalloc_min - __va(bank->start);
			printk(KERN_NOTICE "Truncating RAM at %.8llx-%.8llx "
			       "to -%.8llx (vmalloc region overlap).\n",
			       (unsigned long long)bank->start,
			       (unsigned long long)bank->start + bank->size - 1,
			       (unsigned long long)bank->start + newsize - 1);
			bank->size = newsize;
		}
#endif
		if (!bank->highmem && bank->start + bank->size > lowmem_limit)
			lowmem_limit = bank->start + bank->size;

		j++;
	}
#ifdef CONFIG_HIGHMEM
	if (highmem) {
		const char *reason = NULL;

		if (cache_is_vipt_aliasing()) {
			/*
			 * Interactions between kmap and other mappings
			 * make highmem support with aliasing VIPT caches
			 * rather difficult.
			 */
			reason = "with VIPT aliasing cache";
		}
		if (reason) {
			printk(KERN_CRIT "HIGHMEM is not supported %s, ignoring high memory\n",
				reason);
			while (j > 0 && meminfo.bank[j - 1].highmem)
				j--;
		}
	}
#endif
	meminfo.nr_banks = j;
	memblock_set_current_limit(lowmem_limit);
}
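The heart of the CONFIG_HIGHMEM branch is the split of a bank that straddles vmalloc_min. Below is a deliberately simplified sketch of just that step for a single bank. It works on physical addresses, takes the boundary as a parameter named limit (an assumption standing in for __pa(vmalloc_min)), and leaves out the sticky highmem flag, the PAGE_OFFSET wrap-around check and the NR_BANKS overflow handling of the real function.

```c
#include <stdint.h>

/* Hypothetical, simplified bank record. */
struct bank_model {
	uint32_t start;   /* physical start address */
	uint32_t size;    /* size in bytes */
	int highmem;      /* set by the split below */
};

/*
 * Classify one bank against 'limit', the physical address that maps to
 * vmalloc_min in this model. Writes one or two banks to out[] and
 * returns how many were written.
 */
static int split_bank_model(struct bank_model in, uint32_t limit,
			    struct bank_model out[2])
{
	if (in.start >= limit) {
		/* The whole bank lies above the boundary: all high memory. */
		in.highmem = 1;
		out[0] = in;
		return 1;
	}

	in.highmem = 0;
	if (in.size > limit - in.start) {
		/* The bank straddles the boundary: cut it in two. */
		out[1].start = limit;
		out[1].size = in.size - (limit - in.start);
		out[1].highmem = 1;
		in.size = limit - in.start;
		out[0] = in;
		return 2;
	}

	/* The bank fits entirely below the boundary: all low memory. */
	out[0] = in;
	return 1;
}
```

As an example under assumed values, a 1 GiB bank at 0x30000000 with limit = 0x5E000000 becomes a low bank of 0x2E000000 bytes at 0x30000000 plus a high bank of 0x12000000 bytes at 0x5E000000. Without CONFIG_HIGHMEM the real function would truncate the bank at the boundary instead of keeping the upper part.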
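(3) The last required step is sorting; once it is done, the data is fully ready for the find_limits function mentioned above. This work happens right at the start of the next call, arm_memblock_init(&meminfo, mdesc);: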
static int __init meminfo_cmp(const void *_a, const void *_b)
{
	const struct membank *a = _a, *b = _b;
	long cmp = bank_pfn_start(a) - bank_pfn_start(b);
	return cmp < 0 ? -1 : cmp > 0 ? 1 : 0;
}

void __init arm_memblock_init(struct meminfo *mi, struct machine_desc *mdesc)
{
	int i;

	sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]), meminfo_cmp, NULL); /* a very handy sort function */

	memblock_init();
	for (i = 0; i < mi->nr_banks; i++)
		memblock_add(mi->bank[i].start, mi->bank[i].size);

	/* Register the kernel text, kernel data and initrd with memblock. */
#ifdef CONFIG_XIP_KERNEL
	memblock_reserve(__pa(_sdata), _end - _sdata);
#else
	memblock_reserve(__pa(_stext), _end - _stext);
#endif
#ifdef CONFIG_BLK_DEV_INITRD
	if (phys_initrd_size &&
	    !memblock_is_region_memory(phys_initrd_start, phys_initrd_size)) {
		pr_err("INITRD: 0x%08lx+0x%08lx is not a memory region - disabling initrd\n",
		       phys_initrd_start, phys_initrd_size);
		phys_initrd_start = phys_initrd_size = 0;
	}
	if (phys_initrd_size &&
	    memblock_is_region_reserved(phys_initrd_start, phys_initrd_size)) {
		pr_err("INITRD: 0x%08lx+0x%08lx overlaps in-use memory region - disabling initrd\n",
		       phys_initrd_start, phys_initrd_size);
		phys_initrd_start = phys_initrd_size = 0;
	}
	if (phys_initrd_size) {
		memblock_reserve(phys_initrd_start, phys_initrd_size);

		/* Now convert initrd to virtual addresses */
		initrd_start = __phys_to_virt(phys_initrd_start);
		initrd_end = initrd_start + phys_initrd_size;
	}
#endif

	arm_mm_memblock_reserve();
	arm_dt_memblock_reserve();

	/* reserve any platform specific memblock areas */
	if (mdesc->reserve)
		mdesc->reserve();

	memblock_analyze();
	memblock_dump_all();
}
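The sort at the top of arm_memblock_init() has a direct userspace analogue in qsort(). The sketch below assumes a hypothetical struct bank_model holding PFNs. Note one deliberate difference: the comparator compares the two values directly rather than subtracting them as meminfo_cmp() does, which is a swapped-in variant that sidesteps any overflow concerns.

```c
#include <stdlib.h>

/* Hypothetical, simplified bank record. */
struct bank_model {
	unsigned long start_pfn;  /* first page frame of the bank */
	unsigned long size_pfn;   /* number of page frames */
};

/* Orders banks by ascending start PFN; returns -1, 0 or 1. */
static int bank_cmp_model(const void *a, const void *b)
{
	const struct bank_model *x = a, *y = b;

	return (x->start_pfn > y->start_pfn) - (x->start_pfn < y->start_pfn);
}

/* Userspace counterpart of the sort() call in arm_memblock_init(). */
static void sort_banks_model(struct bank_model *banks, size_t n)
{
	qsort(banks, n, sizeof(banks[0]), bank_cmp_model);
}
```

Banks given in the order 0x60000, 0x30000, 0x50000 come back as 0x30000, 0x50000, 0x60000, each still paired with its own size.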
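With the analysis above, how the low/high memory split is determined should be fairly clear. To summarize:
On the ARM architecture, the kernel configures low and high memory automatically from the kernel boot parameters (mem=size@start and vmalloc=size). Unless you configure them specially, an ordinary ARM system has no high memory at all; high memory is only likely to appear when the system has a lot of RAM or vmalloc is set very large.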
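That conclusion can be turned into a quick back-of-the-envelope calculation. The sketch below is only a model: highmem_bytes_model() is a hypothetical helper, it assumes a single contiguous RAM bank, CONFIG_HIGHMEM enabled, a vmalloc reserve already clamped the way early_vmalloc() does it, PAGE_OFFSET = 0xC0000000, and the 2410 value of VMALLOC_END.

```c
#include <stdint.h>

/* Assumed layout for illustration. */
#define PAGE_OFFSET 0xC0000000u
#define VMALLOC_END 0xF6000000u
#define SZ_128M     0x08000000u  /* default vmalloc reserve */

/*
 * How many bytes of RAM end up as high memory in this model:
 * whatever does not fit between PAGE_OFFSET and vmalloc_min.
 */
static uint32_t highmem_bytes_model(uint32_t ram_bytes,
				    uint32_t vmalloc_reserve)
{
	uint32_t lowmem_max = (VMALLOC_END - vmalloc_reserve) - PAGE_OFFSET;

	return ram_bytes > lowmem_max ? ram_bytes - lowmem_max : 0;
}
```

With these numbers, 256 MiB of RAM and the default reserve give no high memory, 1 GiB of RAM gives 0x12000000 bytes (288 MiB) of it, and 256 MiB of RAM combined with vmalloc=768M gives 0x0A000000 bytes (160 MiB). Both triggers named in the summary, large RAM and a large vmalloc area, show up directly.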
These are my notes from tracing the code while studying low and high memory. If you spot anything wrong, criticism and corrections are welcome. Thanks!