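I: Background
1. The story
Last month a friend reached me through a private message on cnblogs and said that his program was running out of memory; he wanted to know how to fix it.
To get to the bottom of it, we have to analyze a dump with WinDbg.
II: WinDbg analysis
1. Why did the program run out of memory?
As most of you know, running out of memory in .NET surfaces as an OutOfMemoryException. The exception may be thrown explicitly by managed code, or it may be raised at the CLR level, which means there are two angles to investigate.
- Is an exception attached to any managed thread?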
0:000> !t
ThreadCount: 23
UnstartedThread: 0
BackgroundThread: 5
PendingThread: 0
DeadThread: 17
Hosted Runtime: no
Lock
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 362c 00fac868 26020 Preemptive 7ED701A0:00000000 00fa6b60 0 STA
5 2 2d70 00fbeba0 2b220 Preemptive 7EBA7AC0:00000000 00fa6b60 0 MTA (Finalizer)
7 3 3264 061c8890 102a220 Preemptive 00000000:00000000 00fa6b60 0 MTA (Threadpool Worker)
17 15 3f98 19682b90 202b220 Preemptive 7EBB0830:00000000 00fa6b60 0 MTA
XXXX 16 0 2845fb00 35820 Preemptive 00000000:00000000 00fa6b60 0 Ukn
18 14 a7c 2842b1c8 202b220 Preemptive 00000000:00000000 00fa6b60 0 MTA
XXXX 6 0 2c9b3778 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 18 0 288a1318 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 23 0 288a22f0 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 10 0 2ccf3550 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 21 0 288a1860 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 12 0 288a1da8 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 11 0 2c993640 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 8 0 2ccf3a98 35820 Preemptive 00000000:00000000 00fa6b60 0 Ukn
XXXX 9 0 2ccf2030 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 7 0 2c9aed88 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 26 0 28898308 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 25 0 2c492c68 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 4 0 2c993b88 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 20 0 2c9af2d0 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 17 0 2c9afd60 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
XXXX 24 0 2c9b1280 1039820 Preemptive 00000000:00000000 00fa6b60 0 Ukn (Threadpool Worker)
23 22 2658 2c9b02a8 1029220 Preemptive 7ED5BFF8:00000000 00fa6b60 0 MTA (Threadpool Worker)
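Judging from the output, the Exception column is empty for every thread, so no managed exception is attached to any of them. Hmm...
- Was it thrown by the CLR?
This case covers running out of memory because of allocations on the managed heap or because of a GC. The !ao command (short for !analyzeoom) checks for it.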
0:000> !ao
There was no managed OOM due to allocations on the GC heap
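This output shows nothing abnormal either, which is awkward. So what is actually causing the problem?
2. Tracking down the cause
In a spot like this I can only suspect that the dump was not captured at the moment of failure, or that I have hit the edge of my own knowledge. Still, there is always a way out: even if the dump was not taken at that exact moment, it was surely taken close to it. So next, let's use !address -summary to see how memory usage breaks down by category.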
0:000> !address -summary
--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
<unknown> 1520 4c185000 ( 1.189 GB) 65.57% 59.45%
Image 4306 1f140000 ( 497.250 MB) 26.78% 24.28%
Free 1133 bf17000 ( 191.090 MB) 9.33%
Heap 617 7626000 ( 118.148 MB) 6.36% 5.77%
Stack 72 1740000 ( 23.250 MB) 1.25% 1.14%
Other 34 7b000 ( 492.000 kB) 0.03% 0.02%
TEB 24 30000 ( 192.000 kB) 0.01% 0.01%
PEB 1 3000 ( 12.000 kB) 0.00% 0.00%
--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_MAPPED 549 34b60000 ( 843.375 MB) 45.42% 41.18%
MEM_PRIVATE 1718 20424000 ( 516.141 MB) 27.80% 25.20%
MEM_IMAGE 4307 1f155000 ( 497.332 MB) 26.78% 24.28%
--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_COMMIT 4904 66ddd000 ( 1.607 GB) 88.64% 80.37%
MEM_RESERVE 1670 d2fc000 ( 210.984 MB) 11.36% 10.30%
MEM_FREE 1133 bf17000 ( 191.090 MB) 9.33%
--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READONLY 2272 382cf000 ( 898.809 MB) 48.41% 43.89%
PAGE_READWRITE 1572 1eead000 ( 494.676 MB) 26.64% 24.15%
PAGE_EXECUTE_READ 218 dd59000 ( 221.348 MB) 11.92% 10.81%
PAGE_WRITECOPY 449 133e000 ( 19.242 MB) 1.04% 0.94%
PAGE_EXECUTE_READWRITE 188 ab4000 ( 10.703 MB) 0.58% 0.52%
PAGE_NOACCESS 156 9c000 ( 624.000 kB) 0.03% 0.03%
PAGE_READWRITE | PAGE_GUARD 48 78000 ( 480.000 kB) 0.03% 0.02%
PAGE_READWRITE | PAGE_WRITECOMBINE 1 2000 ( 8.000 kB) 0.00% 0.00%
--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
<unknown> 1d200000 a001000 ( 160.004 MB)
Image fed1000 36e4000 ( 54.891 MB)
Free 33dfe000 1082000 ( 16.508 MB)
Heap 3da84000 a1b000 ( 10.105 MB)
Stack 1a10000 fd000 (1012.000 kB)
Other 7fa40000 33000 ( 204.000 kB)
TEB a4c000 3000 ( 12.000 kB)
PEB a3d000 3000 ( 12.000 kB)
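The line MEM_COMMIT=1.607 GB 80.37% shows that committed memory is currently about 1.6 GB, which is 80.37% of the total address space. Since 1.607 GB / 80.37% ≈ 2 GB, the process is evidently capped at 2 GB. The 8-digit addresses in the !t output also show that this is a 32-bit program, so this is the classic problem: a 32-bit program running on a 64-bit system and limited to 2 GB of memory.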
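To double-check the 2 GB conclusion, one can add up the three sizes in the State Summary, since committed, reserved and free regions together should cover the whole user-mode address space. The sketch below does that with the hex values copied from the !address -summary output above. AddressSpaceMath and its methods are illustrative names only, not part of WinDbg or SOS.

```csharp
using System;
using System.Globalization;

// Illustrative helper: sums region sizes given as the hex strings
// shown in the "Total Size" column of !address -summary.
public static class AddressSpaceMath
{
    public static long SumHexSizes(params string[] hexSizes)
    {
        long total = 0;
        foreach (var hex in hexSizes)
        {
            total += long.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        }
        return total;
    }

    public static double ToGiB(long bytes)
    {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void Main()
    {
        // MEM_COMMIT, MEM_RESERVE and MEM_FREE from the State Summary.
        long total = SumHexSizes("66ddd000", "d2fc000", "bf17000");
        Console.WriteLine("total = 0x{0:x} bytes, about {1:F3} GiB", total, ToGiB(total));
    }
}
```

The three values add up to 0x7fff0000 bytes, which is just 64 KB short of 2 GB, consistent with a 32-bit process that does not have the large-address-aware flag set.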
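3. How to break through the 2 GB limit
For the answer, go to the most authoritative source, MSDN: https://docs.microsoft.com/en-us/windows/win32/memory/memory-limits-for-windows-releases?redirectedfrom=MSDN
The way out is to set the IMAGE_FILE_LARGE_ADDRESS_AWARE flag on the program. According to that page, a 32-bit process with this flag set gets a 4 GB user-mode address space on 64-bit Windows instead of the default 2 GB.
As for how to set it, I found three approaches.
- Use the LargeAddressAware NuGet package
See GitHub: https://github.com/KirillOsenkov/LargeAddressAware
- Use editbin
In Visual Studio you can add editbin /largeaddressaware "$(TargetPath)" as a post-build event. The quotes protect output paths that contain spaces, and editbin has to be reachable from the build environment.
- Use code
This approach sets the LargeAddressAware flag directly on an exe that has already been built. Besides setting the flag, the code can also check whether it is set.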
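using System;
using System.IO;

namespace PEFile
{
    public class LargeAddressAware
    {
        // IMAGE_FILE_LARGE_ADDRESS_AWARE bit in the COFF Characteristics field.
        private const short LargeAddressAwareFlag = 0x20;

        // Returns true if the flag is set; returns false if it is not set
        // or if the file is not a valid PE image.
        public static bool IsLargeAddressAware(string filePath)
        {
            bool isLargeAddressAware = false;
            PrepareStream(filePath, (stream, binaryReader) =>
                isLargeAddressAware = (binaryReader.ReadInt16() & LargeAddressAwareFlag) != 0);
            return isLargeAddressAware;
        }

        // Sets the flag in place if it is not set already.
        public static void SetLargeAddressAware(string filePath)
        {
            PrepareStream(filePath, (stream, binaryReader) =>
            {
                var value = binaryReader.ReadInt16();
                if ((value & LargeAddressAwareFlag) == 0)
                {
                    value = (short)(value | LargeAddressAwareFlag);
                    // Step back over the two bytes just read and overwrite them.
                    stream.Position -= 2;
                    var binaryWriter = new BinaryWriter(stream);
                    binaryWriter.Write(value);
                    binaryWriter.Flush();
                }
            });
        }

        // Opens the file, validates the MZ and PE signatures, and invokes the
        // action with the stream positioned at the Characteristics field.
        private static void PrepareStream(string filePath, Action<Stream, BinaryReader> action)
        {
            using (var stream = new FileStream(filePath, FileMode.Open, FileAccess.ReadWrite, FileShare.Read))
            {
                // The 4-byte PE header offset lives at 0x3C, so at least 0x40 bytes are needed.
                if (stream.Length < 0x40)
                {
                    return;
                }

                var binaryReader = new BinaryReader(stream);

                // MZ header
                if (binaryReader.ReadInt16() != 0x5A4D)
                {
                    return;
                }

                stream.Position = 0x3C;
                var peHeaderLocation = binaryReader.ReadInt32();
                stream.Position = peHeaderLocation;

                // PE header
                if (binaryReader.ReadInt32() != 0x4550)
                {
                    return;
                }

                // Skip to the Characteristics field of the COFF file header.
                stream.Position += 0x12;
                action(stream, binaryReader);
            }
        }
    }
}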
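The magic numbers in the code above follow the PE file layout: the 4-byte field at 0x3C of the DOS header holds the offset of the "PE\0\0" signature, the COFF file header comes right after that signature, and its Characteristics field sits 0x12 bytes into the header, where bit 0x20 is IMAGE_FILE_LARGE_ADDRESS_AWARE. As a minimal sketch of that walk, the following builds a tiny synthetic header in memory (not a valid executable, just enough bytes to follow the offsets) and reads the flag back the same way. PeLayoutDemo, BuildFakeHeader and ReadCharacteristics are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;

public static class PeLayoutDemo
{
    const int PeOffsetField = 0x3C;          // DOS header field holding the PE header offset
    const int CharacteristicsOffset = 0x12;  // offset of Characteristics inside the COFF header
    public const ushort LargeAddressAware = 0x20; // IMAGE_FILE_LARGE_ADDRESS_AWARE

    // Builds a synthetic buffer containing only the fields the walk needs.
    public static byte[] BuildFakeHeader(ushort characteristics)
    {
        const int peOffset = 0x80;               // arbitrary choice for this demo
        var bytes = new byte[peOffset + 4 + 20]; // signature + 20-byte COFF header
        using (var writer = new BinaryWriter(new MemoryStream(bytes)))
        {
            writer.Write((ushort)0x5A4D);                  // "MZ"
            writer.Seek(PeOffsetField, SeekOrigin.Begin);
            writer.Write(peOffset);                        // where the PE signature starts
            writer.Seek(peOffset, SeekOrigin.Begin);
            writer.Write(0x00004550);                      // "PE\0\0"
            writer.Seek(CharacteristicsOffset, SeekOrigin.Current);
            writer.Write(characteristics);
        }
        return bytes;
    }

    // Walks MZ -> PE offset -> PE signature -> Characteristics.
    public static ushort ReadCharacteristics(byte[] image)
    {
        using (var reader = new BinaryReader(new MemoryStream(image)))
        {
            if (reader.ReadUInt16() != 0x5A4D) throw new InvalidDataException("missing MZ signature");
            reader.BaseStream.Position = PeOffsetField;
            reader.BaseStream.Position = reader.ReadInt32();
            if (reader.ReadUInt32() != 0x00004550) throw new InvalidDataException("missing PE signature");
            reader.BaseStream.Position += CharacteristicsOffset;
            return reader.ReadUInt16();
        }
    }

    public static void Main()
    {
        // 0x0102 = executable image + 32-bit machine, without the large-address-aware bit.
        var plain = BuildFakeHeader(0x0102);
        var aware = BuildFakeHeader(0x0102 | LargeAddressAware);
        Console.WriteLine((ReadCharacteristics(plain) & LargeAddressAware) != 0); // False
        Console.WriteLine((ReadCharacteristics(aware) & LargeAddressAware) != 0); // True
    }
}
```

Working on an in-memory buffer keeps the demo safe to run anywhere. On a real exe the same offsets apply, but it is wise to patch a copy first, because rewriting the header changes the file contents and may invalidate a digital signature.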
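III: Summary
All in all, the 2 GB memory limit is something every 32-bit program has to face, and once you know about it, it is easy to solve. One last point needs explaining: why is committed memory as high as 1.6 GB? This is medical software, and such software mostly relies on the classic heavyweight combination of FastReport + DevExpress, which, together with a large number of image resources, takes up a great deal of native memory.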
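One rough way to see from inside the process how much of the memory is native rather than managed is to compare the managed heap size with the private bytes of the process. The sketch below is only an approximation under that assumption: private bytes also include the managed heap and runtime overhead, so the gap between the two numbers is an upper bound on native usage. MemorySnapshot is a hypothetical name.

```csharp
using System;
using System.Diagnostics;

public static class MemorySnapshot
{
    // Returns (managed heap bytes, private bytes) for the current process.
    public static Tuple<long, long> Capture()
    {
        // Approximate number of bytes currently allocated on the managed heap.
        long managed = GC.GetTotalMemory(false);
        using (var process = Process.GetCurrentProcess())
        {
            return Tuple.Create(managed, process.PrivateMemorySize64);
        }
    }

    public static void Main()
    {
        var snapshot = Capture();
        const double MB = 1024.0 * 1024.0;
        Console.WriteLine("64-bit process: {0}", Environment.Is64BitProcess);
        Console.WriteLine("managed heap  : {0:F1} MB", snapshot.Item1 / MB);
        Console.WriteLine("private bytes : {0:F1} MB", snapshot.Item2 / MB);
    }
}
```

If the managed heap is a small fraction of private bytes, as the !ao result and the large <unknown> and Image regions suggest here, the pressure is coming from native allocations and loaded modules rather than from .NET objects.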