OSRLogo
OSRLogoOSRLogoOSRLogo x Seminar Ad
OSRLogo
x

Everything Windows Driver Development

x
x
x
GoToHomePage xLoginx
 
 

    Thu, 14 Mar 2019     118020 members

   Login
   Join


 
 
Contents
  Online Dump Analyzer
OSR Dev Blog
The NT Insider
The Basics
File Systems
Downloads
ListServer / Forum
  Express Links
  · The NT Insider Digital Edition - May-June 2016 Now Available!
  · Windows 8.1 Update: VS Express Now Supported
  · HCK Client install on Windows N versions
  · There's a WDFSTRING?
  · When CAN You Call WdfIoQueueP...ously

Debugging A Sound Driver

This month’s random crash dump came from a developer’s system here at OSR. He was, innocently enough, programming along and listening to a CD (on his computer, of course) when his system crashed. Of course, being kernel developers we are always looking for decent crash dumps, so his system produced one (a kernel summary dump) for him. In this article we’ll analyze this, talk about the bug we found, and our resolution of it.

Of course, by now we all know the first step in analyzing any crash dump is to use the "!analyze" command. While it is not always right, it generally provides us with the right place to start (See below).

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003.  This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG.  This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG.  This will let us see why this breakpoint is
happening.
An exception code of 0x80000002 (STATUS_DATATYPE_MISALIGNMENT) indicates
that an unaligned data reference was encountered.  The trap frame will
supply additional information.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: f6c32190, The address that the exception occurred at
Arg3: ba9be69c, Trap Frame
Arg4: 00000000

Debugging Details:
------------------

Debugger Dbgportaldb Connection::Open failed 80004005

Database Dbgportaldb not connected
ADO ERROR 80004005,11: [DBNETLIB][ConnectionOpen (Connect()).]SQL Server does not exist or

access denied.

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at

"0x%08lx". The memory could not be "%s".

FAULTING_IP:
smwdm+b190
f6c32190 ff37             push    dword ptr [edi]

TRAP_FRAME:  ba9be69c -- (.trap ffffffffba9be69c)
ErrCode = 00000000
eax=8675e2ec ebx=ba9be73f ecx=00000000 edx=00000000 esi=85d1b3c8 edi=00000000
eip=f6c32190 esp=ba9be710 ebp=ba9be740 iopl=0         nv up ei pl zr na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010246
smwdm+0xb190:
f6c32190 ff37             push    dword ptr [edi]   ds:0023:00000000=????????
Resetting default scope

DEFAULT_BUCKET_ID:  DRIVER_FAULT

BUGCHECK_STR:  0x8E

LAST_CONTROL_TRANSFER:  from f6c3164c to f6c32190

STACK_TEXT: 
WARNING: Stack unwind information not available. Following frames may be wrong.
ba9be740 f6c3164c 00000002 8675e2f8 ba9be7a4 smwdm+0xb190
ba9be760 f6c0d844 85d1b3c8 00000002 00000002 smwdm+0xa64c
ba9be780 f6c21760 8675e2f8 00000002 00000003 portcls!CPortPinWavePci::SetDeviceState+0x41
ba9be7b8 f6c21e8e 85d93c08 00000003 86470e48

portcls!CPortPinWavePci::DistributeDeviceState+0x4d
ba9be7d4 f6cdf812 86c82f48 85c3d380 85c3d378 portcls!CPortPinWavePci::DeviceIoControl+0x27f
ba9be840 f6cdf7fd 86470e48 00000004 e197f05c ks!KspPropertyHandler+0x625
ba9be860 f6c14f20 86470e48 00000004 e197f048 ks!KsPropertyHandler+0x17
ba9be870 f6c21def 86470e48 00000004 e197f048 portcls!PcHandlePropertyWithTable+0x1a
ba9be8a8 f6c14ec8 8675e2e8 86c86030 86470e48 portcls!CPortPinWavePci::DeviceIoControl+0x1e0
ba9be8c0 f6ce5633 86c86030 86470e48 ba9be90c portcls!DispatchDeviceIoControl+0x44
ba9be8d0 f6c1c7c9 86c86030 86470e48 86432d38 ks!KsDispatchIrp+0xa3
ba9be8e4 f6c1ca2c 86c86030 86470e48 f6c88940 portcls!KsoDispatchIrp+0x40
ba9be8f0 f6c88940 86c86030 86470e48 86432d38 portcls!PcDispatchIrp+0x2a
ba9be90c 804eb3c1 86c86030 86470e48 86c86e90 smwdm+0x61940
ba9be91c f7b033f3 86432d38 86c86dd8 86c87c88 nt!IopfCallDriver+0x31
ba9be968 f77a27e7 86432d38 00000000 002f0003 aeaudio+0x3f3
ba9be9b0 f77a317e 86432d38 00000000 00000002 sysaudio!PinConnectionProperty+0x40
ba9be9dc f77a3d04 00000002 00000002 00000006 sysaudio!CPinNodeInstance::SetState+0xff
ba9be9f8 f77a3df7 00000002 00000003 00000006

sysaudio!CConnectNodeInstance::SetStateBottomUp+0x24
ba9bea18 f77a3e43 00000002 00000003 00000006

sysaudio!CStartNodeInstance::SetStateBottomUp+0x27
ba9bea38 f77a224f 00000002 00000006 8682b8d0 sysaudio!CStartNodeInstance::SetState+0x69
ba9bea4c f6cdf812 e2d6b4f0 8682b8d0 8682b8c8 sysaudio!CPinInstance::PinStateHandler+0x5b
ba9beab8 f6cdf7fd 85f7ee48 00000003 f77a0098 ks!KspPropertyHandler+0x625
ba9bead8 f77a21d8 85f7ee48 00000003 f77a0098 ks!KsPropertyHandler+0x17
ba9beb2c f6cdedfe 86a479c8 85f7ee48 804eb3c1

sysaudio!CPinInstance::PinDispatchIoControl+0x115
ba9beb38 804eb3c1 86a479c8 85f7ee48 85f7ee48 ks!DispatchDeviceIoControl+0x25
ba9beb48 f6cdf002 ba901350 ba9bebb4 00000000 nt!IopfCallDriver+0x31
ba9beb74 ba8f3fb5 86e53e70 00000000 002f0003 ks!KsSynchronousIoControlDevice+0xbb
ba9bebbc ba8f356b 86e53e70 ba901340 00000000 wdmaud!PinProperty+0x49
ba9bebd8 ba8fc56f 86e53e70 00000002 86cd0efc wdmaud!StatePin+0x19
ba9bebf8 ba8f40e9 85c3688c 86cd0ee8 00000002 wdmaud!StateWavePin+0x65
ba9bec14 ba8f4045 8608ae48 85c36000 00000000 wdmaud!Dispatch_State+0x152
ba9bec40 804eb3c1 00000000 85c36000 806bb2cc wdmaud!SoundDispatch+0x2e6
ba9bec50 805644d2 8608aedc 864b0728 8608ae48 nt!IopfCallDriver+0x31
ba9bec64 805651f6 86a55978 8608ae48 864b0728 nt!IopSynchronousServiceTail+0x5e
ba9bed00 8055e288 000001cc 000004bc 00000000 nt!IopXxxControlFile+0x5a6
ba9bed34 805306a4 000001cc 000004bc 00000000 nt!NtDeviceIoControlFile+0x28
ba9bed34 7ffe0304 000001cc 000004bc 00000000 nt!KiSystemService+0xc9
0106ff48 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0x4


FOLLOWUP_IP:
smwdm+b190
f6c32190 ff37             push    dword ptr [edi]

SYMBOL_STACK_INDEX:  0

FOLLOWUP_NAME:  waddext

SYMBOL_NAME:  smwdm+b190

MODULE_NAME:  smwdm

IMAGE_NAME:  smwdm.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  3cf3d814

STACK_COMMAND:  .trap ffffffffba9be69c ; kb

BUCKET_ID:  0x8E_smwdm+b190

Followup: waddext
---------

As is so often the case for those of us out in the real world, we don’t have symbols for this third party driver. Thus, we have to work through this dump, doing a bit of careful analysis in order to figure out what actually happened.

Generally, when I approach a dump like this I focus on the trap frame, since that is the point at which the original exception arose. In this case the trap frame points to our faulting instruction (ah, but we got that from the !analyze in the first place):

f6c32190 ff37             push    dword ptr [edi]   ds:0023:00000000=????????

Thus, the data contents of EDI do not point to a valid memory location. Of course, by itself this is not enough to tell us why this pointer was invalid in the first place. To do that, we want to see the sequence of instructions leading up to this point. Of course, without symbols, we don’t know where this function even started, so we can either back up a random amount or we can figure out where the function started.

To determine where the current function started, we can look at the address called by the previous frame. Eventually, that will lead to a call into our current function - while often it is the direct call, because of optimization we might have missed some call frames.

In this case the previous call frame indicates the call is from f6c31647:

f6c31647 e8f80b0000       call    smwdm+0xb244 (f6c32244)
f6c3164c 83be1006000001   cmp     dword ptr [esi+0x610],0x1

We found this by using the Calls window in the debugger to set the focus to the previous call frame. From this we can begin tracing from this point. Notice, however, that this address is above the address of the fault - suggesting that in fact there is some subsequent call to the appropriate function. To confirm this we need to look through the code flow in order to either find a jump or call to the lower address.

For this particular case it takes a bit of time because there are several calls, each of which must be further traced down. But we must start by disassembling from the address we discovered:

1: kd> u f6c32244
smwdm+0xb244:
f6c32244 55               push    ebp
f6c32245 8bec             mov     ebp,esp
f6c32247 83ec0c           sub     esp,0xc
f6c3224a 56               push    esi
f6c3224b 57               push    edi
f6c3224c 8bf1             mov     esi,ecx
f6c3224e 33ff             xor     edi,edi
f6c32250 6639bea4000000   cmp     [esi+0xa4],di
1: kd> u
smwdm+0xb257:
f6c32257 0f848a010000     je      smwdm+0xb3e7 (f6c323e7)
f6c3225d 8b4618           mov     eax,[esi+0x18]
f6c32260 83be1006000001   cmp     dword ptr [esi+0x610],0x1
f6c32267 8b4024           mov     eax,[eax+0x24]
f6c3226a 8945f4           mov     [ebp-0xc],eax
f6c3226d 740f             jz      smwdm+0xb27e (f6c3227e)
f6c3226f 8d8ebc000000     lea     ecx,[esi+0xbc]
f6c32275 ff150cdbc6f6     call    dword ptr [smwdm+0x46b0c (f6c6db0c)]

The call here is a call into an OS function (calls through a table generally are using the import table for the driver). This is easily seen using the debugger:

1: kd> dd f6c6db0c l1
f6c6db0c  806bb6b0
1: kd> u @$p
hal!KfAcquireSpinLock:

Thus, this is a spin lock acquisition. We resume disassembly of our original function:

1: kd> u f6c32275
smwdm+0xb275:
f6c32275 ff150cdbc6f6     call    dword ptr [smwdm+0x46b0c (f6c6db0c)]
f6c3227b 8845ff           mov     [ebp-0x1],al
f6c3227e 53               push    ebx
f6c3227f 0fb69ef8000000   movzx   ebx,byte ptr [esi+0xf8]
f6c32286 8b466c           mov     eax,[esi+0x6c]
f6c32289 83c004           add     eax,0x4
f6c3228c 50               push    eax
f6c3228d ff75f4           push    dword ptr [ebp-0xc]
1: kd> u
smwdm+0xb290:
f6c32290 e8c5deffff       call    smwdm+0x915a (f6c3015a)
f6c32295 0fb6c0           movzx   eax,al
f6c32298 3bd8             cmp     ebx,eax
f6c3229a 8945f8           mov     [ebp-0x8],eax
f6c3229d 0f8490000000     je      smwdm+0xb333 (f6c32333)
f6c322a3 6639bea4000000   cmp     [esi+0xa4],di
f6c322aa 767e             jbe     smwdm+0xb32a (f6c3232a)
f6c322ac 80beb800000000   cmp     byte ptr [esi+0xb8],0x0

Notice the call to f6c3015a. This call should be further traced (which we actually did in our analysis) but we’ll omit this for brevity. Thus, we continue our disassembly from this point:

1: kd> u f6c322ac
smwdm+0xb2ac:
f6c322ac 80beb800000000   cmp     byte ptr [esi+0xb8],0x0
f6c322b3 7445             jz      smwdm+0xb2fa (f6c322fa)
f6c322b5 6a01             push    0x1
f6c322b7 8bcb             mov     ecx,ebx
f6c322b9 58               pop     eax
f6c322ba d3e0             shl     eax,cl
f6c322bc 8586b4000000     test    [esi+0xb4],eax
f6c322c2 7436             jz      smwdm+0xb2fa (f6c322fa)
1: kd> u
smwdm+0xb2c4:
f6c322c4 66ff8ea6000000   dec     word ptr [esi+0xa6]
f6c322cb 8d8ee0000000     lea     ecx,[esi+0xe0]
f6c322d1 8b11             mov     edx,[ecx]
f6c322d3 897cda04         mov     [edx+ebx*8+0x4],edi
f6c322d7 8b09             mov     ecx,[ecx]
f6c322d9 893cd9           mov     [ecx+ebx*8],edi
f6c322dc f7d0             not     eax
f6c322de 2186b4000000     and     [esi+0xb4],eax
1: kd> u
smwdm+0xb2e4:
f6c322e4 43               inc     ebx
f6c322e5 83e31f           and     ebx,0x1f
f6c322e8 6639bea4000000   cmp     [esi+0xa4],di
f6c322ef 7530             jnz     smwdm+0xb321 (f6c32321)
f6c322f1 80a6b900000000   and     byte ptr [esi+0xb9],0x0
f6c322f8 eb27             jmp     smwdm+0xb321 (f6c32321)
f6c322fa 83be1006000001   cmp     dword ptr [esi+0x610],0x1
f6c32301 7411             jz      smwdm+0xb314 (f6c32314)
1: kd> u
smwdm+0xb303:
f6c32303 8d45ff           lea     eax,[ebp-0x1]
f6c32306 8bce             mov     ecx,esi
f6c32308 50               push    eax
f6c32309 0fb7c3           movzx   eax,bx
f6c3230c 50               push    eax
f6c3230d e83afeffff       call    smwdm+0xb14c (f6c3214c)
f6c32312 eb0b             jmp     smwdm+0xb31f (f6c3231f)
f6c32314 0fb7c3           movzx   eax,bx

And now we see the call back to the address range we were seeking - in this case f6c3214c. We could have also obtained this information by manually examining the stack:

1: kd> dd ba9be710
ba9be710  00000000 85d1b3c8 00000001 f6c32312
ba9be720  00000001 ba9be73f 00000000 00000000
ba9be730  85d1b3c8 86c16000 00000003 00d1b314
ba9be740  ba9be760 f6c3164c 00000002 8675e2f8
ba9be750  ba9be7a4 85d1b448 86c16000 f6c15264
ba9be760  ba9be780 f6c0d844 85d1b3c8 00000002
ba9be770  00000002 8675e2f8 00000000 00000000
ba9be780  ba9be7b8 f6c21760 8675e2f8 00000002

What we were looking for is that return address (f6c32312). From this we can back up in the code stream to find the jump address. While not strictly required, we generally observe that call instructions are generated as five byte opcodes (four bytes for the relative jump offset, one byte for the opcode) and thus we can find the call by looking five bytes prior to the return:

1: kd> u f6c3230d
smwdm+0xb30d:
f6c3230d e83afeffff       call    smwdm+0xb14c (f6c3214c)
f6c32312 eb0b             jmp     smwdm+0xb31f (f6c3231f)

This latter approach is generally faster. Regardless of the specific approach, we can now display the entire disassembly range:

1: kd> u f6c3214c f6c32192
smwdm+0xb14c:
f6c3214c 53               push    ebx
f6c3214d 56               push    esi
f6c3214e 57               push    edi
f6c3214f 8bf1             mov     esi,ecx
f6c32151 e8d4ffffff       call    smwdm+0xb12a (f6c3212a)
f6c32156 8bf8             mov     edi,eax
f6c32158 85ff             test    edi,edi
f6c3215a 741f             jz      smwdm+0xb17b (f6c3217b)
f6c3215c 8b4f10           mov     ecx,[edi+0x10]
f6c3215f 8d86c8000000     lea     eax,[esi+0xc8]
f6c32165 33d2             xor     edx,edx
f6c32167 8b18             mov     ebx,[eax]
f6c32169 8b4004           mov     eax,[eax+0x4]
f6c3216c 03cb             add     ecx,ebx
f6c3216e 13d0             adc     edx,eax
f6c32170 8d86c8000000     lea     eax,[esi+0xc8]
f6c32176 8908             mov     [eax],ecx
f6c32178 895004           mov     [eax+0x4],edx
f6c3217b 8b5c2414         mov     ebx,[esp+0x14]
f6c3217f 8d8ebc000000     lea     ecx,[esi+0xbc]
f6c32185 8a13             mov     dl,[ebx]
f6c32187 ff1510dbc6f6     call    dword ptr [smwdm+0x46b10 (f6c6db10)]
f6c3218d 8b461c           mov     eax,[esi+0x1c]
f6c32190 ff37             push    dword ptr [edi]

It was this final push that led to the invalid memory reference. Any time we see an invalid memory reference like this, we try to find out from whence the address came from. In this case we can see that EDI was loaded earlier (f6c32156). Since this came from EAX immediately following a call, we know that this is the return value from this function. The call is another call into the driver:

smwdm+0xb12a:
f6c3212a 8b8104010000     mov     eax,[ecx+0x104]
f6c32130 81c104010000     add     ecx,0x104
f6c32136 3bc1             cmp     eax,ecx
f6c32138 7503             jnz     smwdm+0xb13d (f6c3213d)
f6c3213a 33c0             xor     eax,eax
f6c3213c c3               ret

From this code snippet, it would appear they are checking some sort of list. ECX contains the address of a data structure. It looks at the contents of offset 0x104 and if this is equal to the memory address of offset 0x104 then the xor will be executed (setting EAX to zero) and return; thus clearly this is the case.

Finding the actual data structure took a bit of digging because of the use of the fastcall optimization here - ECX contains the address of the data structure. The function that called this has damaged ECX prior to the trap frame being recorded, so we walked up to the previous call - and this gave us a hint:

f6c0d83d 8b08             mov     ecx,[eax]
f6c0d83f 57               push    edi
f6c0d840 50               push    eax
f6c0d841 ff5110           call    dword ptr [ecx+0x10]
f6c0d844 85c0             test    eax,eax

The call is into this driver (presumably this is the entry point table of the driver). By looking at the stack we can find the value of EAX at the point of the call (since it was pushed onto the stack - assuming it has not been modified since then). The address we found on the stack (you can see it in the earlier stack dump) was 85d1b3c8 (later, we found that this value was preserved in ESI as well). Using the "!pool" command we can see the value of this pool tag:

1: kd> !pool 85d1b3c8
Pool page 85d1b3c8 region is Nonpaged pool
 85d1b000 size:  350 previous size:    0  (Free)       Io 
 85d1b350 size:   68 previous size:  350  (Allocated)  MmCa
 85d1b3b8 size:    8 previous size:   68  (Free)       Ntfr
*85d1b3c0 size:  620 previous size:    8  (Allocated) *PcNw
  Pooltag PcNw : WDM audio stuff
 85d1b9e0 size:   18 previous size:  620  (Allocated)  ReEv
 85d1b9f8 size:  608 previous size:   18  (Free)       Ntfr

Of course, this merely confirms our suspicion that this relates to the WDM audio stack (someone more familiar with this stack probably will recognize this data structure). The value in ECX was the location at that memory address, however:

1: kd> dd 85d1b3c8 l1
85d1b3c8  f6c6e7ac

Using this with the pool command again, leads us into never-never land:

1: kd> !pool f6c6e7ac
Pool page f6c6e7ac region is Nonpaged pool
f6c6e000 is not a valid small pool allocation, checking large pool...
f6c6e000 is freed (or corrupt) pool
Bad allocation size @f6c6e000, zero is invalid

***
*** An error (or corruption) in the pool was detected;
*** Attempting to diagnose the problem.
***
*** Use !poolval f6c6e000 for more details.
***

Pool page [ f6c6e000 ] is INVALID.

Analyzing linked list...
[ f6c6e000 ]: invalid previous size [ 0x195 ] should be [ 0x0 ]
[ f6c6e000 --> f6c6e288 (size = 0x288 bytes)]: Corrupt region
[ f6c6e4a0 --> f6c6e9e8 (size = 0x548 bytes)]: Corrupt region


Scanning for single bit errors...

None found

By using the lm (list modules) command we are able to find this memory block - it falls inside of the driver itself:

f6c27000 f6c9b680   smwdm        (no symbols)

Thus, our suspicion is that this is a data structure within the driver itself (perhaps a global structure). Unfortunately, without knowing more about the structure of these drivers we cannot say what this data structure should (or would) represent. Further digging might be necessary, but let’s return back to our original analysis. This function return 0x0 because this list was empty. Thus, we are going to skip to the requisite code:

f6c32156 8bf8             mov     edi,eax
f6c32158 85ff             test    edi,edi
f6c3215a 741f             jz      smwdm+0xb17b (f6c3217b)

This would correspond to an if block (hence if ( return value != 0) { … }) without a corresponding else block (otherwise we’d see a couple of immediate jmp instructions.) Once outside the if block, we see the cleanup code:

f6c3217b 8b5c2414         mov     ebx,[esp+0x14]
f6c3217f 8d8ebc000000     lea     ecx,[esi+0xbc]
f6c32185 8a13             mov     dl,[ebx]
f6c32187 ff1510dbc6f6     call    dword ptr [smwdm+0x46b10 (f6c6db10)]

In this code we load the first parameter (on the stack, so third parameter to this function since it uses fastcall) into EBX, load the value of ESI+0xbc into ECX and then call the function pointed to by f6c6db10, which is:

1: kd> dd f6c6db10 l1
f6c6db10  806bb780
1: kd> u @$p
hal!KfReleaseSpinLock:

Thus, this suggests that we’re releasing the spin lock. Recall that earlier we saw a spin lock acquisition - checking back, we note that this was at the same byte offset (and hence likely to be the same spin lock). This then leads us into the terminal push operation. Thus, even without symbols for this driver, we can see the fundamental flaw in this driver - and presumably in its testing.

The driver code in this case returns a NULL pointer (because the list is empty) and yet after releasing its own spin lock dereferences this NULL pointer. With this level of information it would seem that a problem of this type would be straight-forward to resolve and fix with the developers.

Of course, we did report it to the developers of this audio driver. While they acknowledged that this is a bug in their driver, we were told that they would not commit to fixing it. Guess it is time to find a new sound card!

Related Articles
Undesired Debugger Behavior

User Comments
Rate this article and give us feedback. Do you find anything missing? Share your opinion with the community!
Post Your Comment

Post Your Comments.
Print this article.
Email this article.
bottom nav links