Memory Leak (Process Heap)
If we want to dump all heap entries with their corresponding stack traces we can
use the !heap -k -h <heap address> command.
Note: sometimes none of these commands work. In such cases we can use the old
Windows 2000 extension (page 182).
Some prefer to use umdh.exe and get text file logs, but the advantage of
embedding heap allocation stack traces in a crash dump is that we are not concerned with
sending and configuring symbol files at a customer site.
When analyzing the heap, various pageheap options of !heap -p are useful, such as
(taken from WinDbg help):
-t[c|s] [Traces]
“Causes the debugger to display the collected traces of the heavy heap users.
Traces specifies the number of traces to display; the default is four. If there are more
traces than the specified number, the earliest traces are displayed. If -t or -tc is used, the
traces are sorted by count usage. If -ts is used, the traces are sorted by size.”
We can also use the Microsoft Debug Diagnostics tool:
http://blogs.msdn.com/debugdiag/
MISSING THREAD
Sometimes a process crash dump doesn’t have all the usual threads
inside. For example, we expect at least 4 threads including the main process thread but
in the dump we see only 3. If we know that some access violations were reported in the
event log before (not necessarily for the same PID) we might suspect that one of the threads
had been terminated for some reason. I call this pattern Missing Thread.
In order to simulate this problem I created a small multithreaded program in C++.
If there is a command line argument then the main thread simulates an access
violation and finishes in the exception handler. In order to use SEH exceptions with C++
try/catch blocks you have to enable the /EHa option in C++ Code Generation properties.
If we run the program without a command line parameter and take a manual
dump from it we would see 2 threads:
0 Id: 1208.fdc Suspend: 1 Teb: 7efdd000 Unfrozen
1 Id: 1208.102c Suspend: 1 Teb: 7efda000 Unfrozen
0:000> dd 7efdd000 l4
7efdd000 0012ff64 00130000 0012e000 00000000
I also dumped the TEB of the main thread. However, if we run the program with any
command line parameter and look at its manual dump we would see only one thread,
with the main thread missing:
0 Id: 1004.12e8 Suspend: 1 Teb: 7efda000 Unfrozen
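When many dumps have to be checked, the comparison of thread lists shown above can be scripted. The following Python sketch is only an illustration: it assumes the thread list lines were copied from the debugger into strings, it assumes that TEB addresses are comparable between the two runs (as they happen to be in this example), and the helper names parse_threads and missing_tebs are hypothetical, not part of any debugging tool.

```python
import re

# One line of the debugger thread list, for example:
#   0 Id: 1208.fdc Suspend: 1 Teb: 7efdd000 Unfrozen
THREAD_LINE = re.compile(
    r"^\s*[.#]?\s*(?P<index>\d+)\s+Id:\s*(?P<pid>[0-9a-f]+)\.(?P<tid>[0-9a-f]+)"
    r"\s+Suspend:\s*\d+\s+Teb:\s*(?P<teb>[0-9a-f]+)",
    re.IGNORECASE,
)


def parse_threads(listing):
    """Return a dict mapping TEB address -> thread id for each thread line."""
    threads = {}
    for line in listing.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            threads[int(match.group("teb"), 16)] = int(match.group("tid"), 16)
    return threads


def missing_tebs(baseline, suspect):
    """TEB addresses seen in the baseline listing but not in the suspect one."""
    return sorted(set(parse_threads(baseline)) - set(parse_threads(suspect)))


BASELINE = """\
0 Id: 1208.fdc Suspend: 1 Teb: 7efdd000 Unfrozen
1 Id: 1208.102c Suspend: 1 Teb: 7efda000 Unfrozen
"""
SUSPECT = "0 Id: 1004.12e8 Suspend: 1 Teb: 7efda000 Unfrozen\n"

print([hex(teb) for teb in missing_tebs(BASELINE, SUSPECT)])  # ['0x7efdd000']
```

The reported address is the TEB of the thread that is no longer present, which is the address worth inspecting next in the dump.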
If we try to dump the TEB and stack data of the missing main thread we
would see that the memory was already decommitted.
The same effect can be achieved in a similar program that exits the thread in
a custom unhandled exception filter.
The solution to catch an exception that results in a thread termination would be
to run the program under WinDbg or any other debugger:
(df0.12f0): Break instruction exception - code 80000003 (first chance)
eax=7d600000 ebx=7efde000 ecx=00000005 edx=00000020 esi=7d6a01f4
ModLoad: 77ba0000 77bfa000 C:\W2K3\syswow64\msvcrt.dll
ModLoad: 00410000 004ab000 C:\W2K3\syswow64\ADVAPI32.dll
ModLoad: 7da20000 7db00000 C:\W2K3\syswow64\RPCRT4.dll
ModLoad: 7d8d0000 7d920000 C:\W2K3\syswow64\Secur32.dll
(df0.12f0): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=000007a0 ebx=7d4d8df9 ecx=78b842d9 edx=00000000 esi=00000002
If live debugging is not possible and we are interested in crash dumps saved
upon a first chance exception before it is processed in an exception handler we can also
use the MS userdump tool after we install it and enable All Exceptions in the Process
Monitoring Rules dialog box. Another tool that can be used is ADPlus in crash mode from
Debugging Tools for Windows.
UNKNOWN COMPONENT
Sometimes we suspect that a problem was caused by some module but WinDbg lmv
command doesn’t show the company name and other verbose information for it and Google
search has no results for the file name. I call this pattern Unknown Component. In such
cases additional information can be obtained by dumping the module resource section or
the whole module address range and looking for ASCII and UNICODE strings. For example
(byte values in db output are omitted for clarity):
2: kd> lmv m driver
start end module name
f5022000 f503e400 driver (deferred)
Image path: \SystemRoot\System32\drivers\driver.sys
Image name: driver.sys
Timestamp: Tue Jun 12 11:33:16 2007 (466E766C)
CheckSum: 00021A2C
ImageSize: 0001C400
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0
2: kd> db f5022000 f503e400
f5022000 MZ
f5022010 .@
f5022020
f5022030
f5022040 .! L.!Th
f5022050 is program canno
f5022060 t be run in DOS
f5022070 mode $
f5022080 g,._.B._.B._.B
f5022090 _.C.=.B %Q.X.B
f50220a0 _.B.].B.Y%H.|.B
f50220b0 D.^.B.Rich_.B
f50220c0 .PE L
f50220d0 lvnF
f503ce30
f503ce40
f503ce50
f503ce60 .0
f503ce70
f503ce80 H
f503ce90 .4 V
f503cea0 S._.V.E.R.S.I.O
f503ceb0 N._.I.N.F.O
f503cec0
We see that CompanyName is MyComp AG, FileDescription is My Big Product
Hook and FileVersion is 5.0.1.
In our example the same information can be retrieved by dumping the image file
header and then finding and dumping the resource section:
2: kd> lmv m driver
start end module name
f5022000 f503e400 driver (deferred)
Image path: \SystemRoot\System32\drivers\driver.sys
Image name: driver.sys
Timestamp: Tue Jun 12 11:33:16 2007 (466E766C)
CheckSum: 00021A2C
ImageSize: 0001C400
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0
2: kd> !dh f5022000 -f
File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
14C machine (i386)
6 number of sections
466E766C time date stamp Tue Jun 12 11:33:16 2007
0 file pointer to symbol table
32 bit word machine
OPTIONAL HEADER VALUES
10B magic #
6.00 linker version
190A0 size of code
30A0 size of initialized data
0 size of uninitialized data
1A340 address of entry point
00100000 size of stack reserve
00001000 size of stack commit
00100000 size of heap reserve
00001000 size of heap commit
0 [ 0] address [size] of Export Directory
1A580 [ 50] address [size] of Import Directory
1AE40 [ 348] address [size] of Resource Directory
0 [ 0] address [size] of Exception Directory
0 [ 0] address [size] of Security Directory
1B1A0 [ 1084] address [size] of Base Relocation Directory
420 [ 1C] address [size] of Debug Directory
0 [ 0] address [size] of Description Directory
0 [ 0] address [size] of Special Directory
0 [ 0] address [size] of Thread Storage Directory
0 [ 0] address [size] of Load Configuration Directory
0 [ 0] address [size] of Bound Import Directory
2C0 [ 15C] address [size] of Import Address Table Directory
0 [ 0] address [size] of Delay Import Directory
0 [ 0] address [size] of COR20 Header Directory
0 [ 0] address [size] of Reserved Directory
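The address range of the resource section follows from the header values above: the module start address plus the Resource Directory RVA gives the start of the range, and adding the directory size gives its end. A minimal Python sketch of that arithmetic, where the function name directory_range is hypothetical and chosen only for illustration:

```python
def directory_range(module_base, rva, size):
    """Return (start, end) virtual addresses of a data directory in a loaded module."""
    start = module_base + rva
    return start, start + size


# Values taken from the lmv and !dh output: module start f5022000,
# Resource Directory at RVA 1AE40 with size 348.
start, end = directory_range(0xF5022000, 0x1AE40, 0x348)
print(f"db {start:08x} {end:08x}")  # db f503ce40 f503d188
```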
2: kd> db f5022000+1AE40 f5022000+1AE40+348
f503ce40
f503ce50
f503ce60 .0
f503ce70
f503ce80 H
f503ce90 .4 V
f503cea0 S._.V.E.R.S.I.O
f503ceb0 N._.I.N.F.O
f503cec0
f503ced0 .?
f503cee0
f503cef0 P S.t.r
f503cf00 i.n.g.F.i.l.e.I
f503cf10 n.f.o , 0
f503cf20 4.0.9.0.4.b.0
f503cf30 4 C.o.m.p.a
f503cf40 n.y.N.a.m.e
f503cf50 M.y.C.o.m.p .A
f503cf60 G p.$ F.i.l
f503cf70 e.D.e.s.c.r.i.p
f503cf80 t.i.o.n M.y
f503cf90 B.i.g .P.r.o
f503cfa0 d.u.c.t .H.o.o
f503cfb0 k
f503cfc0
f503cfd0 4 F.i.l
f503cfe0 e.V.e.r.s.i.o.n
f503cff0 5 1 0
f503d000 ????????????????
f503d010 ????????????????
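Reading strings off a db listing by eye becomes tedious for large modules. If the module bytes have been saved to a file, the search for ASCII and UNICODE strings can be scripted. The following Python sketch is an assumption-laden illustration, not a WinDbg feature: the function names and the minimum length of 4 characters are arbitrary choices, and only printable ASCII characters stored as UTF-16LE are recognized as UNICODE strings.

```python
import re


def ascii_strings(data, min_chars=4):
    """Return runs of at least min_chars printable ASCII bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_chars)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def utf16_strings(data, min_chars=4):
    """Return runs of at least min_chars printable characters stored as UTF-16LE."""
    # A printable ASCII byte followed by a zero byte is one UTF-16LE character.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


# A small hand-made buffer imitating a fragment of a version resource.
sample = (
    b"\x00\x01"
    + "CompanyName".encode("utf-16-le")
    + b"\x00\x00"
    + "MyComp AG".encode("utf-16-le")
    + b"\x00\x00"
)
print(utf16_strings(sample))  # ['CompanyName', 'MyComp AG']
```

Length words that precede a string in the resource can be picked up as an extra leading character when they happen to be printable, so the results still need to be read with the resource layout in mind.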
MEMORY LEAK (.NET HEAP)
Sometimes the process size constantly grows but there is no difference in the
process heap size. In such cases we need to check whether the process uses the Microsoft
.NET runtime (CLR). If one of the loaded modules is mscorwks.dll or mscorsvr.dll then it
most likely does. Then we should check CLR heap statistics.
In the .NET world dynamically allocated objects are garbage collected (GC) and
therefore simple allocate-and-forget memory leaks are not possible. To simulate that I
created the following C# program:
If we run it the process size will never grow. The GC thread will collect and free
unreferenced instances of the Leak class. This can be seen from inspecting memory dumps taken with
userdump.exe after the start and after 2, 6 and 12 minutes. The GC heap never grows higher than
1 MB and the number of CLRHeapLeak.Leak and System.Byte[] objects always fluctuates
between 100 and 500. For example, at the 12th minute we have the following statistics:
0:000> .loadby sos mscorwks
ephemeral segment allocation context: (0x014dc53c, 0x014dd618)
segment begin allocated size
004aedb8 790d7ae4 790f7064 0x0001f580(128384)
01470000 01471000 014dd618 0x0006c618(443928)
Large object heap starts at 0x02471000
segment begin allocated size
However, we can make Leak objects always referenced by introducing the
following changes into the program:
private byte[] m_data;
private Leak m_prevLeak;
Then, if we run the program, we would see in Task Manager that it grows over
time. Taking consecutive memory dumps after the start and after 10 and 16 minutes shows that
Win32 heap segments always have the same size:
Segment at 013b0000 to 013c0000 (00003000 bytes committed)
but the GC heap and the number of Leak and System.Byte[] objects in it were growing:
ephemeral segment allocation context: (0x013cd804, 0x013cdff4)
segment begin allocated size
0055ee08 790d7ae4 790f7064 0x0001f580(128384)
013c0000 013c1000 013cdff4 0x0000cff4(53236)
Large object heap starts at 0x023c1000
segment begin allocated size
generation 2 starts at 0x013c1000
ephemeral segment allocation context: (0x0192d668, 0x0192ddc8)
segment begin allocated size
0055ee08 790d7ae4 790f7064 0x0001f580(128384)
013c0000 013c1000 0192ddc8 0x0056cdc8(5688776)
Large object heap starts at 0x023c1000
segment begin allocated size
ephemeral segment allocation context: (0x01cd3050, 0x01cd3cc0)
segment begin allocated size
0055ee08 790d7ae4 790f7064 0x0001f580(128384)
013c0000 013c1000 01cd3cc0 0x00912cc0(9514176)
Large object heap starts at 0x023c1000
segment begin allocated size
This is not a traditional memory leak because we have the reference chain.
However, uncontrolled memory growth can be considered a memory leak too,
caused by poor application design, bad input validation or error handling, etc.
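The growth between dumps can be quantified by summing the segment sizes from each GC heap listing. The Python sketch below is one possible way to do it, assuming the listings were saved as text. The function names are hypothetical, and only the segment lines that survive in the listings above are used, so the totals exclude the large object heap segments.

```python
import re

# One segment line: segment, begin, allocated, then size as 0xHEX(decimal).
SEGMENT_LINE = re.compile(
    r"^\s*[0-9a-f]{8}\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s+0x([0-9a-f]+)\((\d+)\)\s*$",
    re.IGNORECASE,
)


def listed_segment_bytes(listing):
    """Sum the sizes of all segment lines found in one GC heap listing."""
    total = 0
    for line in listing.splitlines():
        match = SEGMENT_LINE.match(line)
        if match is None:
            continue  # column headers, generation lines and similar
        begin = int(match.group(1), 16)
        allocated = int(match.group(2), 16)
        size = int(match.group(3), 16)
        # Sanity check: the size column should equal allocated - begin.
        if size != allocated - begin or size != int(match.group(4)):
            raise ValueError(f"inconsistent segment line: {line!r}")
        total += size
    return total


def grows_steadily(totals):
    """True if every total is larger than the one before it."""
    return all(later > earlier for earlier, later in zip(totals, totals[1:]))


# Segment lines from the three consecutive dumps shown above.
DUMPS = [
    "0055ee08 790d7ae4 790f7064 0x0001f580(128384)\n"
    "013c0000 013c1000 013cdff4 0x0000cff4(53236)\n",
    "0055ee08 790d7ae4 790f7064 0x0001f580(128384)\n"
    "013c0000 013c1000 0192ddc8 0x0056cdc8(5688776)\n",
    "0055ee08 790d7ae4 790f7064 0x0001f580(128384)\n"
    "013c0000 013c1000 01cd3cc0 0x00912cc0(9514176)\n",
]

totals = [listed_segment_bytes(dump) for dump in DUMPS]
print(totals)                  # [181620, 5817160, 9642560]
print(grows_steadily(totals))  # True
```

Steady growth across dumps is only a hint. Whether it is a leak still depends on whether the process size levels off later.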
There are situations when customers think there is a memory leak but there is none.
One of them is an unusually big process size when running on a multi-processor
server. If dllhost.exe hosting a typical .NET assembly DLL occupies less than 100 MB on a
local workstation but starts consuming more than 300 MB on a 4-processor server then it can
be the case that the server version of CLR uses per-processor GC heaps:
0:000> .loadby sos mscorsvr
0:000> !EEHeap -gc
generation 0 starts at 0x05c80154
generation 1 starts at 0x05c7720c
generation 2 starts at 0x102d0030
generation 0 starts at 0x179a0444
generation 1 starts at 0x1799b7a4
GC Heap Size 0x109702ec(278332140)
or if this is CLR 1.x the old extension will tell us the same too:
generation 0 starts at 0x179a0444
generation 1 starts at 0x1799b7a4
GC Heap Size 0x1136ecf4(288,812,276)
The more processors we have, the more heaps contribute to the overall VM
size. Although the process occupies almost 400 MB, if it doesn’t grow constantly over
time beyond that value then it is normal.
DOUBLE FREE (PROCESS HEAP)
Double-free bugs lead to the Dynamic Memory Corruption pattern (page 257). The
reason why Double Free deserves its own pattern name is the fact that either debug
runtime libraries or even the OS itself detect such bugs and save crash dumps immediately.
For some heap implementations a double free doesn’t lead to an immediate heap
corruption and subsequent crash. For example, if we allocate 3 blocks in a row and then
free the middle one twice there will be no crash as the second free call is able to detect
that the block was already freed and does nothing. The following program loops forever
and never crashes:
The output of the program:
However, if a free call triggered heap coalescence (adjacent free blocks form a
bigger free block) then we have a heap corruption crash on the next double-free call
because the coalescence triggered by the previous free call erased the free block information:
0012fe8c 76ee1c21 ntdll!RtlpFreeHeap+0x1e2
0012fea8 758d7a7e ntdll!RtlFreeHeap+0x14e
This is illustrated in the following picture where free calls result in heap
coalescence and the subsequent double-free call corrupts the heap:
The problem here is that heap coalescence can be triggered some time after the
double free, so we need some solution to diagnose double-free bugs earlier, ideally at
the first double-free call. For example, the following program crashes during a normal
free operation long after the first double free happened:
If we enable full page heap using gflags.exe from Debugging Tools for Windows
the program crashes immediately on the double-free call:
0012f9ac 76ee0e97 ntdll!ExecuteHandler+0x24
0012f9ac 71aaa3ad ntdll!KiUserExceptionDispatcher+0xf
0012fcf0 71aaa920 verifier!AVrfpDphCheckNormalHeapBlock+0x5d
0012fd0c 71aa879b verifier!AVrfpDphNormalHeapFree+0x20
0012fd60 76f31c8f verifier!AVrfDebugPageHeapFree+0x1cb
0012fda8 76efd9fa ntdll!RtlDebugFreeHeap+0x2f
0012fe9c 76ee1c21 ntdll!RtlpFreeHeap+0x5f
0012feb8 758d7a7e ntdll!RtlFreeHeap+0x14e
Current NtGlobalFlag contents: 0x02000000
hpa - Place heap allocations at ends of pages
If we enable heap free checking instead of page heap we get our crash on the
first double free call immediately too:
0:000> kL
ChildEBP RetAddr
0012fe9c 76ee18c3 ntdll!RtlpLowFragHeapFree+0x31
0012feb0 758d7a7e ntdll!RtlFreeHeap+0x101
Current NtGlobalFlag contents: 0x00000020
hfc - Enable heap free checking
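The difference between a heap that silently ignores a repeated free and a heap that validates every free call can be shown with a toy model. The Python sketch below is not the Windows heap and does not model coalescence. The ToyHeap class and all its names are hypothetical, and blocks are plain integer handles. It only illustrates why validating at free time reports the bug at the first double free.

```python
class DoubleFreeError(Exception):
    """Raised by the checking heap when a block is freed twice."""


class ToyHeap:
    """A toy allocator model: it only tracks which handles are busy."""

    def __init__(self, check_frees):
        self.check_frees = check_frees
        self._next_handle = 1
        self._busy = set()
        self.ignored_double_frees = 0

    def alloc(self):
        handle = self._next_handle
        self._next_handle += 1
        self._busy.add(handle)
        return handle

    def free(self, handle):
        if handle in self._busy:
            self._busy.remove(handle)
            return
        # The block is not busy: a double free (or a wild handle).
        if self.check_frees:
            raise DoubleFreeError(f"block {handle} freed twice")
        self.ignored_double_frees += 1  # lenient mode: do nothing


# Lenient heap: the second free of the middle block goes unnoticed.
lenient = ToyHeap(check_frees=False)
first, middle, last = lenient.alloc(), lenient.alloc(), lenient.alloc()
lenient.free(middle)
lenient.free(middle)
print(lenient.ignored_double_frees)  # 1

# Checking heap: the same sequence fails at the first double free.
strict = ToyHeap(check_frees=True)
first, middle, last = strict.alloc(), strict.alloc(), strict.alloc()
strict.free(middle)
try:
    strict.free(middle)
except DoubleFreeError as error:
    print(error)  # block 2 freed twice
```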
DOUBLE FREE (KERNEL POOL)
In contrast to the Double Free pattern (page 378) in a user mode process heap, a
double free in a kernel mode pool results in an immediate bugcheck in order to
identify the driver causing the problem (BAD_POOL_CALLER bugcheck with Arg1 == 7):
The current thread is making a bad pool request. Typically this is at a
bad IRQL level or double freeing the same allocation, etc.
Arguments:
Arg1: 00000007, Attempt to free pool which was already freed
Arg2: 0000121a, (reserved)
Arg3: 02140001, Memory contents of the pool block
Arg4: 89ba74f0, Address of the block of pool being deallocated
If we look at the block being deallocated we would see that it was marked as
a “Free” block:
2: kd> !pool 89ba74f0
Pool page 89ba74f0 region is Nonpaged pool
89ba7000 size: 270 previous size: 0 (Allocated) Thre (Protected)
89ba7270 size: 8 previous size: 270 (Free)
89ba7278 size: 18 previous size: 8 (Allocated) ReEv
89ba7290 size: 80 previous size: 18 (Allocated) Mdl
89ba7310 size: 80 previous size: 80 (Allocated) Mdl
89ba7390 size: 30 previous size: 80 (Allocated) Vad
89ba73c0 size: 98 previous size: 30 (Allocated) File (Protected)
89ba7458 size: 8 previous size: 98 (Free) Wait
89ba7460 size: 28 previous size: 8 (Allocated) FSfm
89ba74a0 size: 40 previous size: 18 (Allocated) Ntfr
89ba74e0 size: 8 previous size: 40 (Free) File
*89ba74e8 size: a0 previous size: 8 (Free ) *ABCD
Owning component : Unknown (update pooltag.txt)
89ba7588 size: 38 previous size: a0 (Allocated) Sema (Protected)
89ba75c0 size: 38 previous size: 38 (Allocated) Sema (Protected)
89ba75f8 size: 10 previous size: 38 (Free) Nbtl
89ba7608 size: 98 previous size: 10 (Allocated) File (Protected)
89ba76a0 size: 28 previous size: 98 (Allocated) Ntfn
89ba76c8 size: 40 previous size: 28 (Allocated) Ntfr
89ba7708 size: 28 previous size: 40 (Allocated) NtFs
89ba7730 size: 40 previous size: 28 (Allocated) Ntfr
89ba7770 size: 40 previous size: 40 (Allocated) Ntfr
89ba7a10 size: 270 previous size: 260 (Allocated) Thre (Protected)
89ba7c80 size: 20 previous size: 270 (Allocated) VadS
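For long pool pages it can be convenient to locate the block that contains the address from Arg4 programmatically. The Python sketch below parses saved !pool output and is only an illustration: the function names are hypothetical, and the regular expression is an assumption based on the line format shown in the listing above.

```python
import re

# One block line of !pool output, for example:
#   *89ba74e8 size: a0 previous size: 8 (Free ) *ABCD
POOL_LINE = re.compile(
    r"^\s*(?P<current>\*?)(?P<addr>[0-9a-f]+)\s+size:\s*(?P<size>[0-9a-f]+)"
    r"\s+previous size:\s*(?P<prev>[0-9a-f]+)\s+\((?P<state>[A-Za-z]+)\s*\)",
    re.IGNORECASE,
)


def parse_pool_blocks(listing):
    """Return a list of dicts describing each pool block line in the listing."""
    blocks = []
    for line in listing.splitlines():
        match = POOL_LINE.match(line)
        if match:
            blocks.append({
                "start": int(match.group("addr"), 16),
                "size": int(match.group("size"), 16),
                "state": match.group("state"),
                "current": match.group("current") == "*",
            })
    return blocks


def block_containing(blocks, address):
    """Return the block whose address range covers address, or None."""
    for block in blocks:
        if block["start"] <= address < block["start"] + block["size"]:
            return block
    return None


LISTING = """\
Pool page 89ba74f0 region is Nonpaged pool
89ba74e0 size: 8 previous size: 40 (Free) File
*89ba74e8 size: a0 previous size: 8 (Free ) *ABCD
Owning component : Unknown (update pooltag.txt)
89ba7588 size: 38 previous size: a0 (Allocated) Sema (Protected)
"""

block = block_containing(parse_pool_blocks(LISTING), 0x89BA74F0)
print(hex(block["start"]), block["state"])  # 0x89ba74e8 Free
```

The block found for the Arg4 address is the one the debugger marks with an asterisk, and its state confirms that it had already been freed.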