kd> kL
Child-SP RetAddr Call Site
fffffadf`dfcf19b8 fffffadf`dfee38c4 nt!KeBugCheck
fffffadf`dfcf19c0 fffff800`012ce9cf userdump!UdIoctl+0x104
fffffadf`dfcf1a70 fffff800`012df026 nt!IopXxxControlFile+0xa5a
fffffadf`dfcf1b90 fffff800`010410fd nt!NtDeviceIoControlFile+0x56
fffffadf`dfcf1c00 00000000`77ef0a5a nt!KiSystemServiceCopyEnd+0x3
00000000`01eadd58 00000001`0000a755 ntdll!NtDeviceIoControlFile+0xa
00000000`01eadd60 00000000`77ef30a5 userdump_100000000!UdServiceWorkerAPC+0x1005
00000000`01eaf970 00000000`77ef0a2a ntdll!KiUserApcDispatcher+0x15
00000000`01eafe68 00000001`00007fe2 ntdll!NtWaitForSingleObject+0xa
This might be useful if we want to see the kernel data as it was at the time of
the exception. In this case we can avoid requesting a complete dump of physical
memory and ask for a kernel memory dump only, together with a user dump.
Note: do not set this option if you are unsure. It can cause your production
servers to bluescreen in the case of false positive dumps.
CF
Bugcheck CF name is the second longest one:
TERMINAL_SERVER_DRIVER_MADE_INCORRECT_MEMORY_REFERENCE (cf)
Arguments:
Arg1: a020b1d4, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: a020b1d4, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 00000000, Mm internal code.
A driver has been incorrectly ported to Terminal Server. It is
referencing session space addresses from the system process
context. Probably from queueing an item to a system worker thread. The
broken driver's name is displayed on the screen.
Although the bugcheck explanation mentions only the system process context, it can
also happen in an arbitrary process context. Recall that kernel space address mapping
is usually considered persistent: the virtual-to-physical mapping doesn’t change when
switching between threads that belong to different processes. However, there is the
so-called session space in multi-user terminal services environments, where different
users can use different display drivers, for example:
MS RDP users - RDPDD.DLL
Citrix ICA users - VDTW30.DLL
Vista users - TSDDD.DLL
Console user - Some H/W related video driver like ATI or NVIDIA
These drivers are not all committed at the same time and persistently since OS boot,
although their module addresses might remain fixed. Therefore, when a new user session
is created, the display driver corresponding to the terminal services protocol is
loaded and mapped into the so-called session space, starting from A0000000 (x86) or
FFFFF90000000000 (x64) after the win32k.sys address range (on first usage), and then
committed to physical memory by the proper PTEs in the process page tables. During a
thread switch, if the new process context belongs to a different session with a
different display driver, the current display driver is decommitted by clearing its
PTEs and the new driver is committed by setting its PTEs.
Therefore, in the system process context (for example, in worker threads) virtual
addresses corresponding to display driver code and data might be unknown. This can
also happen in an arbitrary process context if we access code that belongs to a
display driver that doesn’t correspond to the current session protocol. This can be
illustrated with the following example, where TSDD can be either an RDP or an ICA
display driver.
In the list of loaded modules we can see that the ATI and TSDD drivers are loaded:
0: kd> lm
start end module name
77d30000 77d9f000 RPCRT4 (deferred)
77e10000 77e6f000 USER32 (deferred)
77f40000 77f7c000 GDI32 (deferred)
77f80000 77ffc000 ntdll (pdb symbols)
78000000 78045000 MSVCRT (deferred)
7c2d0000 7c335000 ADVAPI32 (deferred)
7c340000 7c34f000 Secur32 (deferred)
7c570000 7c624000 KERNEL32 (deferred)
7cc30000 7cc70000 winsrv (deferred)
80062000 80076f80 hal (deferred)
80400000 805a2940 nt (pdb symbols)
a0000000 a0190ce0 win32k (pdb symbols)
a0191000 a01e8000 ati2drad (deferred)
a01f0000 a0296000 tsdd (no symbols)
b4a60000 b4a72320 naveng (deferred)
b4a73000 b4b44c40 navex15 (deferred)
… … …
The bugcheck happens in the 3rd-partyApp process context running inside some terminal
session:
PROCESS_NAME: 3rd-partyApp.exe
TRAP_FRAME: b475f84c (.trap 0xffffffffb475f84c)
ErrCode = 00000000
eax=a020b1d4 ebx=00000000 ecx=04e0443b edx=ffffffff esi=a21b6778 edi=a201b018
eip=a020b1d4 esp=b475f8c0 ebp=b475f900 iopl=3 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00013246
TSDD+0x1b1d4:
a020b1d4 ?? ???
Examining the display driver virtual address shows that it is unknown (PTE is NULL):
0: kd> !pte a020b1d4
A020B1D4 - PDE at C0300A00 PTE at C028082C
contains 14AB6863 contains 00000000
pfn 14ab6 –DA–KWV not valid
ATI display driver address is unknown too:
0: kd> !pte a0191000
A0191000 - PDE at C0300A00 PTE at C0280644
contains 3E301863 contains 00000000
pfn 3e301 –DA–KWV not valid
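The PDE and PTE addresses printed by !pte above are plain arithmetic on the virtual address. The following is a minimal sketch of that arithmetic, assuming 32-bit x86 without PAE, where the page tables are self-mapped at C0000000 and the page directory at C0300000. The function names are hypothetical and are not WinDbg APIs.

```python
# Assumption: 32-bit x86 without PAE. Page tables are self-mapped at
# 0xC0000000 and the page directory at 0xC0300000; entries are 4 bytes.
PTE_BASE = 0xC0000000
PDE_BASE = 0xC0300000


def pte_address(va: int) -> int:
    """Virtual address of the PTE describing the 4 KB page that holds va."""
    return PTE_BASE + (va >> 12) * 4


def pde_address(va: int) -> int:
    """Virtual address of the PDE describing the 4 MB region that holds va."""
    return PDE_BASE + (va >> 22) * 4


def is_valid(entry: int) -> bool:
    """Bit 0 of a PDE or PTE is the hardware Valid (present) bit."""
    return bool(entry & 1)


# Addresses taken from the two !pte commands above.
for va in (0xA020B1D4, 0xA0191000):
    print(f"{va:08X} - PDE at {pde_address(va):08X} PTE at {pte_address(va):08X}")

# The PDE value 14AB6863 has bit 0 set, the PTE value 00000000 does not.
print(is_valid(0x14AB6863), is_valid(0x00000000))
```

Both drivers share the same PDE (C0300A00) because their addresses fall into the same 4 MB region; only the PTEs differ.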
Let’s switch to another terminal session:
PROCESS 87540a60 SessionId: 45 Cid: 3954 Peb: 7ffdf000 ParentCid: 0164
DirBase: 2473d000 ObjectTable: 889b2c48 TableSize: 182
Image: csrss.exe
0: kd> .process /r /p 87540a60
Implicit process is now 87540a60
Loading User Symbols
In this session the TSDD driver address is valid, but the ATI driver address is not.
It is unknown, and this is expected because no real display hardware is used.
Let’s switch to session 0 where the display is “physical”:
PROCESS 8898a5e0 SessionId: 0 Cid: 0180 Peb: 7ffdf000 ParentCid: 0164
DirBase: 14c58000 ObjectTable: 8898b948 TableSize: 1322
Image: csrss.exe
0: kd> .process /r /p 8898a5e0
Implicit process is now 8898a5e0
Loading User Symbols
TSDD driver address is unknown and this is expected too because we no longer
use terminal services protocol:
0: kd> !pte a020b1d4
A020B1D4 - PDE at C0300A00 PTE at C028082C
contains 14AB6863 contains 00000000
pfn 14ab6 –DA–KWV not valid
However, ATI display driver addresses are not unknown (their PTEs are not NULL), and
the 2 selected pages are either in transition or in a page file:
0: kd> !pte a0191000
A0191000 - PDE at C0300A00 PTE at C0280644
contains 14AB6863 contains 156DD882
pfn 14ab6 –DA–KWV not valid
Transition: 156dd
Protect: 4
0: kd> !pte a0198000
A0198000 - PDE at C0300A00 PTE at C0280660
contains 14AB6863 contains 000B9060
pfn 14ab6 –DA–KWV not valid
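The three kinds of invalid PTE seen in this example (zero, transition, page file) can be told apart from the raw PTE value alone. The sketch below assumes the commonly documented x86 non-PAE software PTE layout: bit 0 valid, bits 5-9 protection, bit 10 prototype, bit 11 transition, upper 20 bits a page frame number or page file offset. That layout is an assumption and is not stated by the dump itself; the function name is hypothetical.

```python
def classify_pte(pte: int) -> dict:
    """Classify a 32-bit non-PAE PTE value (bit layout is an assumption)."""
    if pte & 0x001:                  # bit 0: hardware Valid bit
        return {"kind": "valid", "pfn": pte >> 12}
    if pte == 0:                     # nothing mapped: the "unknown" case
        return {"kind": "zero"}
    if pte & 0x400:                  # bit 10: prototype PTE
        return {"kind": "prototype"}
    protect = (pte >> 5) & 0x1F      # bits 5-9: software protection code
    if pte & 0x800:                  # bit 11: page is in transition
        return {"kind": "transition", "pfn": pte >> 12, "protect": protect}
    if pte >> 12 == 0:               # no page file offset: demand-zero page
        return {"kind": "demand-zero", "protect": protect}
    return {
        "kind": "pagefile",
        "pagefile": (pte >> 1) & 0xF,  # bits 1-4: page file number
        "offset": pte >> 12,           # upper 20 bits: offset in the page file
        "protect": protect,
    }


# PTE values taken from the !pte outputs in this section.
for value in (0x00000000, 0x156DD882, 0x000B9060):
    print(f"{value:08X}", classify_pte(value))
```

For 156DD882 this yields a transition page with frame 156dd and protection 4, which agrees with the "Transition: 156dd" and "Protect: 4" lines printed by !pte.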
MANUAL STACK TRACE RECONSTRUCTION
This is a small case study to complement the Incorrect Stack Trace pattern (page
288) and show how to reconstruct a stack trace manually, based on an example
with complete source code.
For it I created a small working multithreaded program:
#include "stdafx.h"
#include <stdio.h>
#include <process.h>
typedef void (*REQ_JUMP)();
typedef void (*REQ_RETURN)();
// Uncomment memcpy to crash the program
// Overwrite f_jmp and f_ret with NULL
void internal_func_1(void *param)
I had to disable optimizations in the Visual C++ compiler, otherwise most of the code
would have been eliminated because the program is very small and easy for the code
optimizer. If we run the program it displays the following output:
internal_func_2 gets two parameters: the function address to jump to and the
function address to call upon return. The latter sets the loop variable to false in
order to break the infinite main thread loop and calls _endthread. Why do we need
this complexity in such a small sample? Because I wanted to simulate FPO optimization
in an inner function call and also gain control over the return address. This is why
I set EBP to zero before jumping and pushed a custom return address, which I can
change at any time. If I had used the call instruction, the processor would have
pushed the address of the next instruction as the return address.
The code also copies the two internal_func_2 parameters into local variables f_jmp
and f_ret, because the commented-out memcpy call is crafted to overwrite them with
zeroes without touching the saved EBP, the return address and the function
arguments. This is all to make the stack trace incorrect but at the same time to make
manual stack reconstruction as easy as possible in this example.
Let’s suppose that the memcpy call is a bug that overwrites local variables. Then we
obviously have a crash because EAX is zero and a jump to the zero address causes an
access violation. EBP is also 0 because we assigned 0 to it explicitly. Let’s pretend
that we wanted to pass some constant via EBP and it is zero.
What we have now:
EBP is 0
EIP is 0
Return address is 0
When we load the crash dump, WinDbg is utterly confused because it has no clue
how to reconstruct the stack trace:
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(bd0.ec8): Access violation – code c0000005 (first/second chance not
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module Following frames may be wrong
This is based on the fact that a function call saves its return address, and the
standard function prolog saves the previous EBP value and then makes EBP point to
the saved value:
push ebp
mov ebp, esp
Therefore our stack looks like this:
We also double check the return addresses to see whether they point to valid code.
The best way is to disassemble backwards from them. This should show the call
instructions that resulted in the saved return addresses:
0:001> ub WrongIP!internal_func_1+0x1f
WrongIP!internal_func_1+0x1:
00401871 mov ebp,esp
00401873 push offset WrongIP!GS_ExceptionPointers+0x38 (00402124)
00401878 call dword ptr [WrongIP!_imp__puts (004020ac)]
0040187e add esp,4
00401881 push offset WrongIP!return_func (00401850)
00401886 mov eax,dword ptr [ebp+8]
00401889 push eax
0040188a call WrongIP!internal_func_2 (004017e0)
004018a0 push ebp
004018a1 mov ebp,esp
004018a3 mov eax,dword ptr [ebp+8]
004018a6 push eax
004018a7 call WrongIP!internal_func_1 (00401870)
78132839 call msvcr80!_getptd (78132e29)
7813283e and dword ptr [ebp-4],0
78132842 push dword ptr [eax+58h]
78132845 call dword ptr [eax+54h]
0:001> ub msvcr80!_endthread+0xcb
msvcr80!_endthread+0xaf:
781328ac mov edx,dword ptr [ecx+58h]
781328af mov dword ptr [eax+58h],edx
781328b2 mov edx,dword ptr [ecx+4]
781328b5 push ecx
781328b6 mov dword ptr [eax+4],edx
781328b9 call msvcr80!_freefls (78132e41)
781328be call msvcr80!_initp_misc_winxfltr (781493c1)
781328c3 call msvcr80!_endthread+0x30 (7813282d)
0:001> ub BaseThreadStart+0x34
kernel32!BaseThreadStart+0x10:
7d4dfdfd mov eax,dword ptr fs:[00000018h]
7d4dfe03 cmp dword ptr [eax+10h],1E00h
7d4dfe0a jne kernel32!BaseThreadStart+0x2e (7d4dfe1b)
7d4dfe0c cmp byte ptr [kernel32!BaseRunningInServerProcess (7d560008)],0
7d4dfe13 jne kernel32!BaseThreadStart+0x2e (7d4dfe1b)
7d4dfe15 call dword ptr [kernel32!_imp__CsrNewThread (7d4d0310)]
7d4dfe1b push dword ptr [ebp+0Ch]
7d4dfe1e call dword ptr [ebp+8]
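The frame-chain idea behind this reconstruction can be sketched in a few lines: each saved EBP points to the caller's saved EBP, and the return address sits one slot (4 bytes) above it. In the sketch below the return addresses are the ones that follow the call instructions in the disassembly above. The stack addresses and their contents are hypothetical illustration values and are not taken from the dump; `walk_ebp_chain` is a hypothetical helper name.

```python
def walk_ebp_chain(memory: dict, ebp: int, max_frames: int = 16) -> list:
    """Follow saved-EBP links; return (frame address, return address) pairs."""
    frames = []
    while ebp in memory and len(frames) < max_frames:
        saved_ebp = memory[ebp]           # [EBP]   = previous EBP
        ret_addr = memory.get(ebp + 4)    # [EBP+4] = return address
        if ret_addr is None:
            break
        frames.append((ebp, ret_addr))
        # The stack grows down, so a caller's frame must be at a higher
        # address. Anything else (including 0) ends the chain.
        if saved_ebp <= ebp:
            break
        ebp = saved_ebp
    return frames


# Hypothetical stack contents: dword address -> dword value.
memory = {
    0x0069FF60: 0x0069FF6C, 0x0069FF64: 0x0040188F,  # ret into internal_func_1
    0x0069FF6C: 0x0069FFA4, 0x0069FF70: 0x004018AC,  # ret after call internal_func_1
    0x0069FFA4: 0x0069FFB8, 0x0069FFA8: 0x78132848,  # ret into msvcr80
    0x0069FFB8: 0x00000000, 0x0069FFBC: 0x7D4DFE21,  # ret into BaseThreadStart
}

for frame, ret in walk_ebp_chain(memory, 0x0069FF60):
    print(f"ChildEBP {frame:08x} RetAddr {ret:08x}")
```

A real debugger does more than this (FPO data, symbol information), but when EBP and EIP are destroyed, finding one plausible EBP:PreviousEBP pair on the raw stack and following the links by hand is the same procedure.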
Now we can use the extended version of the k command and supply custom EBP, ESP
and EIP values. We set EBP to the first found address of the EBP:PreviousEBP pair
and set EIP to 0:
The stack trace looks good because it also shows BaseThreadStart. From the
backwards disassembly of the return address WrongIP!internal_func_1+0x1f we see
that internal_func_1 calls internal_func_2, so we can disassemble the latter function:
0:001> uf internal_func_2
Flow analysis was incomplete, some code may be missing
WrongIP!internal_func_2:
28 004017e0 push ebp
28 004017e1 mov ebp,esp
28 004017e3 sub esp,8
29 004017e6 mov eax,dword ptr [ebp+8]
29 004017e9 mov dword ptr [ebp-4],eax
30 004017ec mov ecx,dword ptr [ebp+0Ch]
30 004017ef mov dword ptr [ebp-8],ecx
32 004017f2 push offset WrongIP!GS_ExceptionPointers+0x28 (00402114)
32 004017f7 call dword ptr [WrongIP!_imp__puts (004020ac)]
35 00401813 push dword ptr [ebp-8]
36 00401816 mov eax,dword ptr [ebp-4]
37 00401819 mov ebp,0
38 0040181e jmp eax
We see that it takes some value from [ebp-4], puts it into EAX and then jumps to
that address. The function uses the standard prolog, and therefore EBP-4 is a local
variable. From the code we see that its value comes from [EBP+8], which is the first
function parameter:
EBP+C: second parameter
EBP+8: first parameter
EBP+4: return address
EBP: previous EBP
EBP-4: local variable
EBP-8: local variable
If we examine the first parameter we see that it is the valid function address
that we were supposed to call:
0:001> kv L=0069ff60 0069ff60 0
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module Following frames may be wrong
00401833 push offset WrongIP!GS_ExceptionPointers+0x1c (00402108)
00401838 call dword ptr [WrongIP!_imp__puts (004020ac)]
0040183e add esp,4
00401841 pop ebp
00401842 ret
00401843 int 3
However, if we look at the code we see that we call memcpy with the EBP-8 address
and 8 as the number of bytes to copy. In pseudo-code it would look like:
Therefore memcpy overwrites our local variables, including the one that contains the
jump address, with zeroes. This explains why we jumped to address 0 and why EIP was
zero.
Finally, our reconstructed stack trace looks like this:
WrongIP!internal_func_2+offset ; here we jump
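The effect of the 8-byte memcpy at EBP-8 can be modelled with a small byte buffer laid out like the internal_func_2 frame listed earlier (EBP-8 up to EBP+C). This is a sketch under stated assumptions: the value used for the first parameter (00401830) and for the saved EBP (0069ff6c) are hypothetical, while 00401850 (return_func) and 0040188f (the return address into internal_func_1) come from the disassembly.

```python
import struct

frame = bytearray(24)   # six dwords: EBP-8, EBP-4, EBP, EBP+4, EBP+8, EBP+C
EBP_INDEX = 8           # index of the saved-EBP slot inside the buffer


def write32(ebp_offset: int, value: int) -> None:
    struct.pack_into("<I", frame, EBP_INDEX + ebp_offset, value)


def read32(ebp_offset: int) -> int:
    return struct.unpack_from("<I", frame, EBP_INDEX + ebp_offset)[0]


write32(0x0C, 0x00401850)   # second parameter: return_func
write32(0x08, 0x00401830)   # first parameter: jump target (hypothetical value)
write32(0x04, 0x0040188F)   # return address into internal_func_1
write32(0x00, 0x0069FF6C)   # previous EBP (hypothetical value)
write32(-4, read32(0x08))   # local copy of the first parameter (f_jmp)
write32(-8, read32(0x0C))   # local copy of the second parameter (f_ret)

# The simulated bug: 8 zero bytes copied to EBP-8 wipe both locals only.
start = EBP_INDEX - 8
frame[start:start + 8] = bytes(8)

print(f"jump target read from EBP-4: {read32(-4):08x}")
print(f"saved EBP still intact:      {read32(0):08x}")
print(f"return address still intact: {read32(4):08x}")
print(f"first parameter at EBP+8:    {read32(8):08x}")
```

Because the overwrite stops exactly at the saved EBP, everything from EBP upward survives, which is why the intended jump target can still be read from the first parameter at EBP+8.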
This was based on the fact that ESP was valid. If we have a zero or invalid ESP we
can look at the entire raw stack range from the thread environment block (TEB). Use
the !teb command to get the thread stack range. In my example this command doesn’t
work due to the lack of proper MS symbols, but it reports the TEB address and we can
dump it:
0:001> !teb
TEB at 7efda000
error InitTypeRead( TEB )…
0:001> dd 7efda000 l3
7efda000 0069ffa4 006a0000 0069e000
Usually the second double word is the stack base (the highest stack address) and the
third is the stack limit (the lowest committed address), so we can dump the range and
start reconstructing the stack trace for our example from the bottom of the stack
(BaseThreadStart) or look for exception handling calls.
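Turning the three double words dumped from the TEB into a raw stack range is simple enough to script. The sketch below parses one line of dd output and builds a dds command for the range; the helper names are hypothetical, and the only assumption about the TEB is that its second and third double words bound the stack, as described above.

```python
def parse_dd_line(line: str) -> tuple:
    """Split one line of dd output into (address, [dword values])."""
    parts = line.split()
    return int(parts[0], 16), [int(p, 16) for p in parts[1:]]


def stack_range_from_teb(dd_line: str) -> tuple:
    """Return (lowest, highest) stack address from the first three TEB dwords."""
    _, dwords = parse_dd_line(dd_line)
    if len(dwords) < 3:
        raise ValueError("need at least three double words from the TEB")
    # dwords[0] is the exception list head; the next two bound the stack.
    # min/max keeps the result correct whichever of the two comes first.
    return min(dwords[1], dwords[2]), max(dwords[1], dwords[2])


# Line taken from the dd output above.
low, high = stack_range_from_teb("7efda000 0069ffa4 006a0000 0069e000")
print(f"dds {low:08x} {high:08x}")
```

The printed command dumps every double word in the range with symbol resolution, which is where return addresses such as BaseThreadStart become visible.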
WINDBG TIPS AND TRICKS
LOOKING FOR STRINGS IN A DUMP
There are wonderful WinDbg commands dpu (UNICODE strings) and dpa (ASCII
strings), and other d** equivalents like dpp. For example, we can examine raw stack
data and check whether any pointers on the stack point to strings.
Of course, we can apply these commands to any memory range, not only the stack.
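What a command like dpa does conceptually is treat each pointer-sized value in a range as an address and try to read a string there. Below is a minimal sketch of that idea for ASCII strings over a simulated 32-bit memory image. The base address, the buffer contents and the function names are all hypothetical; this is not how WinDbg is implemented.

```python
import struct

BASE = 0x00100000                      # hypothetical base of the memory image
memory = bytearray(64)
memory[32:43] = b"Hello dump\x00"      # a NUL-terminated string at BASE+32
# Three "stack" slots: a pointer to the string, a wild pointer, and NULL.
struct.pack_into("<III", memory, 0, BASE + 32, 0x12345678, 0)


def read_ascii(mem, base, address, min_len=4, max_len=64):
    """Return the printable NUL-terminated string at address, or None."""
    offset = address - base
    if not 0 <= offset < len(mem):
        return None                    # pointer does not land in our image
    end = offset
    while end < len(mem) and end - offset < max_len and 0x20 <= mem[end] < 0x7F:
        end += 1
    if end - offset < min_len or end >= len(mem) or mem[end] != 0:
        return None                    # too short or not NUL-terminated
    return mem[offset:end].decode("ascii")


def pointed_strings(mem, base, start, end):
    """Scan dwords in [start, end) and report those that point to strings."""
    hits = []
    for addr in range(start, end, 4):
        (ptr,) = struct.unpack_from("<I", mem, addr - base)
        text = read_ascii(mem, base, ptr)
        if text is not None:
            hits.append((addr, ptr, text))
    return hits


for addr, ptr, text in pointed_strings(memory, BASE, BASE, BASE + 12):
    print(f'{addr:08x}  {ptr:08x}  "{text}"')
```

The minimum length filter is a design choice to cut down on false positives, since short runs of printable bytes are common in arbitrary data.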