COINCIDENTAL SYMBOLIC INFORMATION
However, this is pure coincidence. The data pattern 00NN00NN clearly belongs to a Unicode string:
0:020> du 00b1ed00
00b1ed00 "ocument Loader [UPD:PCL5c]"
It just happens that the value 00430050 can be interpreted as an address that falls into the Application module address range, specifically its code section:
0:020> lm
start end module name
00400000 0044d000 Application
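To double-check where such a value points, we can ask the debugger which module, and which nearest symbol if any, contains it. A minimal sketch using the value from this example:

0:020> lm a 00430050
0:020> ln 00430050

lm a displays only the module whose address range contains the given address, and ln lists the nearest symbols around it, if there are any.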
In the second example, the crash dump is from a 3rd-party application called AppSql for which we don't have PDB files. We also know that myhook.dll is installed as a system-wide hook and that it had some problems in the past. It is loaded into every address space but is not necessarily used. We want to see whether there are any traces of it on the problem thread stack. Dumping the raw stack contents shows only one reference to it: the value 10008e00, which resolves to myhook!notify_me+0x22c.
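Such a raw stack dump is typically produced with the dds command applied to the stack pointer; a minimal sketch (the dump range here is an arbitrary placeholder):

0:000> dds esp L100

dds displays every DWORD in the range together with the symbol it resolves to, if any, which is exactly what makes coincidental matches like this one visible.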
The address 10008e00 looks very “round”, so it might be a set of bit flags. Also, if we disassemble the code backwards from this address, we don't see the usual call instruction that would have saved it on the stack as a return address:
0:000> ub 10008e00
myhook!notify_me+0x211:
10008de5 81c180000000 add ecx,80h
10008deb 899578ffffff mov dword ptr [ebp-88h],edx
10008df1 89458c mov dword ptr [ebp-74h],eax
10008df4 894d98 mov dword ptr [ebp-68h],ecx
10008df7 6a01 push 1
10008df9 8d45ec lea eax,[ebp-14h]
10008dfc 50 push eax
10008dfd ff75e0 push dword ptr [ebp-20h]
In contrast, the other two addresses are return addresses saved on the stack:
0:000> ub 0066a831
AppSql+0x26a81e:
0066a81e 8bfb mov edi,ebx
0066a820 f3a5 rep movs dword ptr es:[edi],dword ptr [esi]
0066a822 8bca mov ecx,edx
0066a824 83e103 and ecx,3
0066a827 f3a4 rep movs byte ptr es:[edi],byte ptr [esi]
0066a829 8b00 mov eax,dword ptr [eax]
0066a82b 50 push eax
0066a82c e8affeffff call AppSql+0x26a6e0 (0066a6e0)
0:000> ub 0049e180
AppSql+0x9e16f:
0049e16f cc int 3
0049e170 55 push ebp
0049e171 8bec mov ebp,esp
0049e173 8b4510 mov eax,dword ptr [ebp+10h]
0049e176 8b4d0c mov ecx,dword ptr [ebp+0Ch]
0049e179 50 push eax
0049e17a 51 push ecx
0049e17b e840c61c00 call AppSql+0x26a7c0 (0066a7c0)
Therefore the appearance of myhook!notify_me+0x22c could be a coincidence, unless it is a pointer to a function. However, if it were a function pointer, it would not point into the middle of a call sequence that pushes arguments:
0:000> ub 10008e00
myhook!notify_me+0x211:
10008de5 81c180000000 add ecx,80h
10008deb 899578ffffff mov dword ptr [ebp-88h],edx
10008df1 89458c mov dword ptr [ebp-74h],eax
10008df4 894d98 mov dword ptr [ebp-68h],ecx
10008e00 e82ff1ffff call myhook!copy_data (10007f34)
10008e05 8b8578ffffff mov eax,dword ptr [ebp-88h]
10008e0b 3945ac cmp dword ptr [ebp-54h],eax
10008e0e 731f jae myhook!notify_me+0x25b (10008e2f)
10008e10 8b4598 mov eax,dword ptr [ebp-68h]
10008e13 0fbf00 movsx eax,word ptr [eax]
10008e16 8945a8 mov dword ptr [ebp-58h],eax
10008e19 8b45e0 mov eax,dword ptr [ebp-20h]
Also, because we have source code and private symbols for myhook.dll, we know that if it were a function pointer, it would be the myhook!notify_me address itself and not notify_me+0x22c.
All this evidence supports the hypothesis that the myhook occurrence on the problem thread stack is just a coincidence and should be ignored.
In addition, the most striking coincidental symbolic information I have found so far in a crash dump is the accidental correspondence between the exported _DebuggerHookData symbol and the location of the postmortem debugger NTSD.
STACK TRACE
The most important pattern used for problem identification and resolution is Stack Trace. Consider the following fragment of !analyze -v output from w3wp.exe:
1824ffec 00000000 791fe91b 17bb9c18 00000000 kernel32!BaseThreadStart+0x34
Ignoring the first 5 numeric columns gives us a trace consisting of module!function+offset entries.
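Such a symbol-only view can also be obtained directly: the kc command prints the call stack as module!function frame names, without the numeric columns:

0:000> kc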
In general we have something like moduleA!functionX+offsetN followed by moduleB!functionY+offsetM, and so on. Sometimes function names are not available, or offsets are very big, like 0x2380; in that case we probably don't have symbol files for moduleA and moduleB.
Usually there is some kind of database of previous issues that we can match moduleA!functionX+offsetN against. If there is no such match, we can try functionX+offsetN, moduleA!functionX, or just functionX. If there is still no match, we can try the next signature, moduleB!functionY+offsetM, then moduleB!functionY, and so on. Usually, the further down the trace, the less useful the signature is for problem resolution. For example, mscorsvr!ThreadpoolMgr::WorkerThreadStart+0x129 will probably match many issues because this signature is common to many ASP.NET applications.
If there is no match in internal databases, we can try Google. For our example, a Google search for SendResponseHeaders+0x5d returns a number of results.
Browsing the search results reveals the following discussion, which can also be found directly by searching Google Groups:

http://groups.google.com/group/microsoft.public.inetserver.iis/browse_frm/thread/34bc2be635b26531?tvc=1
Another example is from a BSOD complete memory dump. The analysis command has the following output:
KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: bff90ca3, The address that the exception occurred at
Arg3: 00000000, Parameter 0 of the exception
Arg4: 00000000, Parameter 1 of the exception
bff90ca3 8b08 mov ecx,dword ptr [eax] ds:0023:00000000=????????
Resetting default scope
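The exception address can be mapped to its containing module with lm-style commands; a minimal sketch using the address from this bugcheck:

kd> lm a bff90ca3
kd> lmvm tsmlvsa

lm a displays the module whose load range covers the address, and lmvm prints verbose module information such as the timestamp and checksum shown below.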
bff81000 bff987c0   tsmlvsa    (no symbols)
Loaded symbol image file: tsmlvsa.sys
Image path: tsmlvsa.sys
Image name: tsmlvsa.sys
Timestamp: Thu Mar 18 06:18:51 2004 (40593F4B)
CheckSum: 0002D102
ImageSize: 000177C0
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0
A Google search for tsmlvsa+0xfca3 fails, but if we search just for tsmlvsa we get the first link towards problem resolution:
http://www-1.ibm.com/support/docview.wss?uid=swg1IC40964
VIRTUALIZED PROCESS (WOW64)
Sometimes we get a process dump from x64 Windows, and when we load it into WinDbg, the output tells us that an exception or a breakpoint comes from wow64.dll. For example:
Loading Dump File [X:\ppid2088.dmp]
User Mini Dump File with Full Memory: Only application data is available
Comment: 'Userdump generated complete user-mode minidump with Exception
Monitor function on SERVER01'
Symbol search path is:
srv*c:\mss*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Server 2003 Version 3790 (Service Pack 2) MP (4 procs) Free x64
Product: Server, suite: TerminalServer
Debug session time: Tue Sep 4 13:36:14.000 2007 (GMT+2)
System Uptime: 6 days 3:32:26.081
Process Uptime: 0 days 0:01:54.000
WARNING: tsappcmp overlaps ws2_32
WARNING: msvcp60 overlaps oleacc
WARNING: tapi32 overlaps rasapi32
WARNING: rtutils overlaps rasman
WARNING: dnsapi overlaps rasapi32
WARNING: wldap32 overlaps dnsapi
WARNING: ntshrui overlaps userenv
WARNING: wtsapi32 overlaps dnsapi
WARNING: winsta overlaps setupapi
WARNING: activeds overlaps rtutils
WARNING: activeds overlaps rasman
WARNING: adsldpc overlaps activeds
WARNING: drprov overlaps apphelp
WARNING: netui1 overlaps netui0
WARNING: davclnt overlaps apphelp
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(2088.2fe4): Unknown exception - code 000006d9 (first/second chance not available)
wow64!Wow64NotifyDebugger+0x9:
00000000`6b006369 b001 mov al,1
Analysis shows that a run-time exception was raised, but the stack traces show only WOW64 CPU simulation code in all process threads:
ERROR_CODE: (NTSTATUS) 0x6d9 - There are no more endpoints available from
the endpoint mapper
0 Id: 2088.2fe4 Suspend: 1 Teb: 00000000`7efdb000 Unfrozen
Child-SP RetAddr Call Site
00000000`0016e190 00000000`6b0064f2 wow64!Wow64NotifyDebugger+0x9
00000000`0016e1c0 00000000`6b006866 wow64!Wow64KiRaiseException+0x172
00000000`0016e530 00000000`78b83c7d wow64!Wow64SystemServiceEx+0xd6
00000000`0016edf0 00000000`6b006a5a wow64cpu!ServiceNoTurbo+0x28
00000000`0016ee80 00000000`6b005e0d wow64!RunCpuSimulation+0xa
00000000`0016eeb0 00000000`77ed8030 wow64!Wow64LdrpInitialize+0x2ed
1 Id: 2088.280c Suspend: 1 Teb: 00000000`7efd8000 Unfrozen
Child-SP RetAddr Call Site
2 Id: 2088.1160 Suspend: 1 Teb: 00000000`7efd5000 Unfrozen
Child-SP RetAddr Call Site
3 Id: 2088.2d04 Suspend: 1 Teb: 00000000`7efad000 Unfrozen
Child-SP RetAddr Call Site
4 Id: 2088.15c4 Suspend: 1 Teb: 00000000`7efa4000 Unfrozen
Child-SP RetAddr Call Site
00000000`02def0a8 00000000`6b006a5a wow64cpu!RemoveIoCompletionFault+0x41
00000000`02def180 00000000`6b005e0d wow64!RunCpuSimulation+0xa
This is a clear indication that the process was in fact 32-bit but the dump is 64-bit. This situation is described in the article Dumps, Debuggers and Virtualization (page 516), and we need a debugger plug-in to understand the virtualized CPU architecture.
This crash dump pattern can be called Virtualized Process. In our case we need to load the wow64exts.dll WinDbg extension and set the effective processor mode to x86 by using the .effmach command:
0:000> .load wow64exts
0:000> .effmach x86
Effective machine: x86 compatible (x86)
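As an alternative to .effmach, the wow64exts extension itself provides a sw command that toggles between the 64-bit and 32-bit views (a sketch, assuming the extension version in use exports it):

0:000> !wow64exts.sw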
Then analysis gives us more meaningful results:
ERROR_CODE: (NTSTATUS) 0x6d9 - There are no more endpoints available from
the endpoint mapper
0012e010 76ed827f rasapi32+0x482e5
0012e044 76ed8bf0 rasapi32+0x4827f
0012e0c8 76ed844d rasapi32+0x48bf0
0012e170 76ed74b5 rasapi32+0x4844d
0012e200 76ed544f rasapi32+0x474b5
0012e22c 76ed944d rasapi32+0x4544f
0012e24c 76ed93a4 rasapi32+0x4944d
0012e298 76ed505f rasapi32+0x493a4
1 Id: 2088.280c Suspend: 1 Teb: 00000000`7efd8000 Unfrozen
STACK TRACE COLLECTION
Sometimes a problem can be identified not from a single Stack Trace pattern but from a Stack Trace Collection. Such patterns include Coupled Processes (page 419), Procedure Call Chains (page 481), and Blocked Threads (see Volume 2). Here I only discuss various methods to list stack traces.
Process dumps, including various process minidumps (example invocations are sketched after this list):

The ~*kv command lists stack traces for all process threads.

The !findstack module[!symbol] 2 command filters all stack traces to show only those containing module or module!symbol.

The !uniqstack command lists only the unique stack traces, eliminating duplicates.

Kernel minidumps:

These have only one problem thread, so the kv command or one of its variants suffices.

Kernel and complete memory dumps:

The !process 0 ff command lists all processes and their threads, including user-space thread stacks for complete memory dumps. This command is valid for Windows XP and later; for older systems we can use WinDbg scripts.

The !stacks 2 [module[!symbol]] command shows kernel-mode stack traces, and we can filter the output based on module or module!symbol. Filtering is valid only for crash dumps from Windows XP and later systems.

The ~[ProcessorN]s; .reload /user; kv command sequence shows the stack trace of the thread running on the specified processor.
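A minimal sketch of how some of these commands are typically invoked; the myhook module and notify_me symbol from the earlier example are reused here as placeholders for the filter arguments:

0:000> ~*kv
0:000> !findstack myhook!notify_me 2
0:000> !uniqstack
1: kd> !process 0 ff
1: kd> !stacks 2 myhook

The first three commands apply to process dumps; the last two apply to kernel and complete memory dumps.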
The processor change command is illustrated in this example:
be4f8c30 eb091f43 i8042prt!I8xProcessCrashDump+0x53
be4f8c8c 8046bfe2 i8042prt!I8042KeyboardInterruptService+0x15d
Example of !stacks command (kernel dump):
What if we have a list of processes from a complete memory dump, obtained with the !process 0 0 command, and we want to interrogate a specific process? In this case we need to switch to that process and reload user-space symbol files (.process /r /p address).

There is also a separate command to reload user-space symbol files at any time (.reload /user).
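A minimal sketch of the typical switching sequence, using the process address 8893d020 that appears in the listing below:

1: kd> !process 0 0
1: kd> .process /r /p 8893d020
1: kd> !process 8893d020

Here /p gives access to the user-mode address space of that process in a complete memory dump, and /r reloads user-space symbols for it.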
After switching, we can list threads (!process address) and dump or search process virtual memory. For example:
1: kd> !process 0 0
**** NT ACTIVE PROCESS DUMP ****
PROCESS 890a3320 SessionId: 0 Cid: 0008 Peb: 00000000 ParentCid:
PROCESS 890af020 SessionId: 0 Cid: 0160 Peb: 7ffdf000 ParentCid:
Implicit process is now 8893d020
Loading User Symbols
THREAD 8893dda0 Cid 178.15c Teb: 7ffde000 Win32Thread: a2034908
WAIT: (WrUserRequest) UserMode Non-Alertable
8893bee0 SynchronizationEvent
Not impersonating
Owning Process 8893d020
Wait Start TickCount 29932455 Elapsed Ticks: 7
Context Switch Count 28087 LargeStack
UserTime 0:00:00.0023
KernelTime 0:00:00.0084
Start Address winlogon!WinMainCRTStartup (0x0101cbb0)
Stack Init eb1b0000 Current eb1afcc8 Base eb1b0000 Limit eb1ac000 Call 0
Priority 15 BasePriority 15 PriorityDecrement 0 DecrementCount 0
ChildEBP RetAddr
eb1afce0 8042d893 nt!KiSwapThread+0x1b1
eb1afd08 a00019c2 nt!KeWaitForSingleObject+0x1a3
eb1afd44 a0013993 win32k!xxxSleepThread+0x18a
eb1afd54 a001399f win32k!xxxWaitMessage+0xe
eb1afd5c 80468389 win32k!NtUserWaitMessage+0xb
eb1afd5c 77e58b53 nt!KiSystemService+0xc9
0006fdd0 77e33630 USER32!NtUserWaitMessage+0xb
0006fe04 77e44327 USER32!DialogBox2+0x216
0006fe28 77e38d37 USER32!InternalDialogBox+0xd1
0006fe48 77e39eba USER32!DialogBoxIndirectParamAorW+0x34
THREAD 88980020 Cid 178.188 Teb: 7ffdc000 Win32Thread: 00000000
WAIT: (DelayExecution) UserMode Alertable
88980108 NotificationTimer
Not impersonating
Owning Process 8893d020
Wait Start TickCount 29930810 Elapsed Ticks: 1652
Context Switch Count 15638
UserTime 0:00:00.0000
KernelTime 0:00:00.0000
Start Address KERNEL32!BaseThreadStartThunk (0x7c57b740)
Win32 Start Address ntdll!RtlpTimerThread (0x77faa02d)
Stack Init bf6f7000 Current bf6f6cc4 Base bf6f7000 Limit bf6f4000 Call 0
Priority 13 BasePriority 13 PriorityDecrement 0 DecrementCount 0