Part 3: Understanding !PTE - Non-PAE and X64

June 22, 2010, 10:01 am

≫ Next: We Are Hiring - Windows Server Escalation Engineer Positions

≪ Previous: ‘Hidden’ ETW Stack Trace Feature – Get Stacks All Over the Place!

Hello, Ryan Mangipano (ryanman) again with part three of my series on understanding the output of the !PTE command. In this last installment I’ll continue our manual conversion of Virtual Addresses by converting a Non-PAE VA. Afterwards I’ll convert a VA from X64 Long Mode. Then I’ll discuss the TLB. If you haven’t read part one and two of this series I recommend taking a looking before jumping into the rest of this article.

Non-PAE system VA to Physical Address conversion

In part 1 and part 2 I discussed using windbg to manually x86 PAE virtual to physical address conversion with 4KB and 2MB pages. Now I’ll mention the same thing for Non-PAE systems. The official processor manuals explain how systems not using PAE have only two level of tables instead of the three used by PAE. This fact is because PAE added the Page Directory Pointer Table. This means the CR3 register will contain a pointer to the Page Directory Table. Also note the size of the table entries will be 32-bit, not the 64-bit size observed in the PAE tables. PAE expanded the table size to allow for more bits for the purpose of allowing the addressing of more physical memory.

The PAE bit is bit number five, which is the sixth bit due to bit numbering starting at zero. You can see PAE is not enabled on this system.

0: kd> .formats @cr4

Binary: 00000000 00000000 00000110 11010001

Another method of checking this:

1: kd> j ((@cr4 & 0y00000000000000000000000000100000) != 0) '.echo PAE flag Enabled';'.echo PAE flag Disabled'

PAE flag Enabled

Now I'll use the following random virtual address which is valid:

0: kd> !pte f72c5c00

F72C5C00 - PDE at C0300F70 PTE at C03DCB14

contains 01014963 contains 06CE7963

pfn 1014 G-DA--KWV pfn 6ce7 G-DA—KWV

0: kd> .formats f72c5c00

Binary: 11110111 00101100 01011100 00000000

Let’s break it down:

PD offset 11110111 00

Page Table Offset 101100 0101

Offset in the Physical Page 1100 00000000 (since it’s a even 12 bits, just refer to it in hex as c00)

0: kd> !dd @cr3 + (0y1111011100 * @@(sizeof(nt!HARDWARE_PTE)))L1

# a07df70 01014963

0: kd> !dd 1014000 + (0y1011000101 * @@(sizeof(nt!HARDWARE_PTE)))L1

# 1014b14 06ce7963

Now that I have the physical page base, I'll place the last 3 hex digits (c00) from the Virtual Address onto the address base.

0: kd> !dd 6ce7c00 L4

# 6ce7c00 00000001 c0000005 00000000 00000000

0: kd> dd f72c5c00 L4

f72c5c00 00000001 c0000005 00000000 00000000

X64 VA to Physical Address Conversion

Just as PAE added a third level to the non-PAE two-level system, x64 Long mode adds a fourth level to the hierarchy. This table is called the Page-Map Level-4 (PML4 table). AMD refers to the entries in this table as PML4E (Page-Map Level-4 Entry). Intel refers to each entry as PML4-Table Entry. Internally we refer to this as the eXtended Page directory Entry (PXE). Regardless of how you refer to these entries they contain indexes into the PDP table (Page Directory Pointer Table).

Here is the output of the !pte command against a 64-bit address:

7: kd> !pte fffffade`c24eb7c0

VA fffffadec24eb7c0

PXE @ FFFFF6FB7DBEDFA8 PPE at FFFFF6FB7DBF5BD8 PDE at FFFFF6FB7EB7B090 PTE at FFFFF6FD6F612758

contains 0000000111800863 contains 0000000119826863 contains 0000000119839963 contains 0000000001FF6121

pfn 111800 ---DA--KWEV pfn 119826 ---DA--KWEV pfn 119839 -G-DA--KWEV pfn 1ff6 -G--A—KREV

I'll break it down in binary and use data from the processor manuals to separate the bits

7: kd> .formats fffffade`c24eb7c0

Binary: 11111111 11111111 11111010 11011110 11000010 01001110 10110111 11000000

Sign extend 11111111 11111111

PML4 offset 11111010 1

PDP offset 1011110 11

PD offset 000010 010

Page-Table offset 01110 1011

Physical Page Offset 0111 11000000

Now that I have the numbers, I'll plug them in and find the physical address. If you are having problems following along, refer to part one of this blog and the AMD x64 System Programming manual. You should be comparing the output below to the !pte output above

7: kd> !dq @cr3 + ( 0y111110101 * @@(sizeof(ntkrnlmp!HARDWARE_PTE))) L1

# 147fa8 00000001`11800863

7: kd> !dq 0x00111800000 + ( 0y101111011 * @@(sizeof(ntkrnlmp!HARDWARE_PTE))) L1

#111800bd8 00000001`19826863

7: kd> !dq 0x119826000 + ( 0y000010010 * @@(sizeof(ntkrnlmp!HARDWARE_PTE))) L1

#119826090 00000001`19839963

7: kd> !dq 0x119839000 + ( 0y011101011 * @@(sizeof(ntkrnlmp!HARDWARE_PTE))) L1

#119839758 00000000`01ff6121

7: kd> !dc 1ff67c0 L4

# 1ff67c0 5085ff48 48000005 68244c8b 04a8f633 H..P...H.L$h3...

7: kd> dc fffffade`c24eb7c0 L4

fffffade`c24eb7c0 5085ff48 48000005 68244c8b 04a8f633 H..P...H.L$h3...

TLB- Translation Lookaside Buffer and Conclusion

The CPU’s memory management unit performs these operations to translate virtual addresses to physical. Wouldn’t it be great if we could cache the virtual address to physical page information in a location that can be accessed very quickly so that the CPU doesn’t have to look this up for future references to this page? That is just what the Translation Lookaside Buffer (TLB) does. Hopefully this will shed some light on some basic memory structures like Large Pages, Flags, and the TLB so I encourage you to read more about these topics from the following sources-

How PAE x86 works (on MSDN): http://technet.microsoft.com/en-us/library/cc736309(WS.10).aspx

Intel & AMD processor manuals: http://www.intel.com/products/processor/manuals/index.htm and http://developer.amd.com/documentation/guides/Pages/default.aspx#manuals

“Windows Internals, 5^th Edition” Mark E. Russinovich and David A. Solomon with Alex Ionescu -Chapter 9: Memory Management

↧

We Are Hiring - Windows Server Escalation Engineer Positions

June 28, 2010, 9:34 am

≫ Next: I Want A Debugger Robot

≪ Previous: Part 3: Understanding !PTE - Non-PAE and X64

Ron here again sending out another reminder about postings for our team. If you like to debug this may be for you-

Want to debug the most difficult problems against the largest, most complex piece of software ever written? Are you fluent in C and assembly? Do you carry a spare debugger cable in your bag? Can you represent Microsoft in critical, time sensitive solution delivery? Writing code is easy, finding bugs in someone else’s code you’ve never seen before is a real challenge. Want to hear more?

http://blogs.msdn.com/jobsblog/archive/2007/01/26/jobcast-life-as-an-escalation-engineer.aspx

http://blogs.msdn.com/jobsblog/archive/2007/11/21/life-as-an-escalation-engineer-part-ii.aspx

Candidate must be a strong critical thinker, and enjoy solving very difficult problems (often involving code level analysis). Escalation Engineers within Platforms Global Escalation Services are frequently involved with the highest profile issues, and therefore must be able to handle both pressure and complex situations. Candidates must also have strong customer service, accurate logical problem solving and communication skills, and the ability to work in a team environment. Programming and debugging skills are required, preferably in C or C++. Also, the ability to read and analyze network traces and solid protocol analysis is a plus. The ideal candidate may have a four year degree in C.S. or Engineering and a minimum of four years product support experience or the equivalent in work experience. Prior knowledge of Windows Server and other networking products, and/or networking operating systems is required. If you enjoy being the problem-solver in the spotlight of critical problems, then this position will excite and challenge you.

Applicants that possess the experience and desire will be considered, and are encouraged to apply using the below link.

https://careers.microsoft.com/JobDetails.aspx?ss=&pg=0&so=&rw=1&jid=18002&jlang=EN

↧

I Want A Debugger Robot

June 30, 2010, 4:54 pm

≫ Next: We Are Hiring - Windows Server Escalation Engineer Position

≪ Previous: We Are Hiring - Windows Server Escalation Engineer Positions

Hi,

My name is Sabin from the Platforms Global Escalation Services team at Microsoft, and today I want to share with you a recent experience I had debugging an issue reported by an hardware manufacturer.

The customer was doing a reboot test for their new server product line. They found that after hundreds of continuous reboots there was always a single instance that the server took more than 20 minutes to start up, when compared to an average 2 minute normal startup time. This happened only once every 300+ to 1000+ reboots. The number of reboots it took before the problem happened again varied randomly so it was difficult to predict when the problem would occur.

Although they setup a live kernel debugging environment, they didn’t want to watch the computer screen for 10+ hours waiting for the problem to happen so they could manually hit Ctrl+Break in windbg. So instead they setup a video camera to film the computer screen 24x7, and they managed to find that when the “mysterious delay” happened the computer showed a gray screen with “Microsoft (R) Windows (R) version 5.1 (Build 3790: Service Pack 2)”.

The case came to me and the customer even shipped a problematic server to our office to troubleshoot the cause of the delay. The problem was that I didn’t want to stare at the computer screen for 10+ hours either!

The first thing I thought was that it would be fantastic if there were a robot sitting in front of Windbg, watching the elapsed time for each reboot, so it could hit Ctrl+Break in windbg if the server takes more than 10 minute to start. Then I asked myself, “Why not?”

I decided to build such a “robot” myself. I went around and checked the Debuggers SDK document (which can be found in the windbg help document debugger.chm), and I realized that what I needed was a customized debugger. The functionality of the debugger is simple, it should be able to recognize the time when the server first starts and the time when the server reboots. If there is more than 10 minutes between these two times the customized debugger automatically breaks in to the server. The event callback interface IDebugEventCallbacks::SessionStatus and the client interface IDebugControl::SetInterrupt can meet my needs perfectly.

It is not that difficult to build such a customized debugger, which I called DBGRobot. I would like to share some code snippets which you may find helpful when building a customized debugger for a special debugging scenario, or as the basis for building a more complicated debugging robot.

First, we need to download and install the Windows Driver Kit Version 7.1.0. When installing the WDK be sure to select Debugging Tools for Windows.

http://www.microsoft.com/whdc/DevTools/WDK/WDKpkg.mspx

If you install the WDK to its default folder, which for version 7.1.0 is C:\WinDDK\7600.16385.1, the C:\WinDDK\7600.16385.1\Debuggers\sdk\samples folder will contain the sample code from the Debugger SDK. The dumpstk sample is the one particularly interesting to us. We can copy some common code from it, such as the out.cpp and out.hpp which is the implementation of the IDebugOutputCallbacks interface.

Now let’s do some coding. The common code is copied from the Debuggers SDK sample Dumpstk. I also listed it here for clarity.

The first step is to create the IDebugClient, IDebugControl and IDebugSymbols interfaces (although IDebugSymbols is not used in this case). You need to call the DebugCreate() function to create the IDebugClient interface, and then use IDebugClient->QueryInterface() to query the IDebugControl and IDebugSymbols interfaces.

void

CreateInterfaces(void)

{

HRESULT Status;

// Start things off by getting an initial interface from

// the engine. This can be any engine interface but is

// generally IDebugClient as the client interface is

// where sessions are started.

if ((Status = DebugCreate(__uuidof(IDebugClient),

(void**)&g_Client)) != S_OK)

{

Exit(1, "DebugCreate failed, 0x%X\n", Status);

}

// Query for some other interfaces that we'll need.

if ((Status = g_Client->QueryInterface(__uuidof(IDebugControl),

(void**)&g_Control)) != S_OK ||

(Status = g_Client->QueryInterface(__uuidof(IDebugSymbols),

(void**)&g_Symbols)) != S_OK)

{

Exit(1, "QueryInterface failed, 0x%X\n", Status);

}

If you want to see the output from the debugging engine, you also need to implement the IDebugOutputCallbacks interface. The main function to be implemented is IDebugOutputCallbacks::Output(), which is quite simple as we only need to see the output in the command prompt stdout stream:

STDMETHODIMP

StdioOutputCallbacks::Output(

THIS_

IN ULONG Mask,

IN PCSTR Text

)

{

UNREFERENCED_PARAMETER(Mask);

fputs(Text, stdout);

return S_OK;

}

Here comes our main code logic: we need to implement the IDebugEventCallbacks interface and monitor the SessionStatus events. In order for the debugger engine to deliver the SessionStatus events to us we need to set the DEBUG_EVENT_SESSION_STATUS mask in IDebugEventCallbacks::GetInterestMask():

STDMETHODIMP_(HRESULT)

RobotEventCallBacks::GetInterestMask(

THIS_

OUT PULONG Mask

)

{

// Here we monitor all the event types, although it is not necessary

*Mask = DEBUG_EVENT_SESSION_STATUS | DEBUG_EVENT_EXCEPTION | DEBUG_EVENT_CHANGE_DEBUGGEE_STATE | DEBUG_EVENT_CHANGE_ENGINE_STATE;

return S_OK;

}

In the RobotEventCallBacks::SessionStatus() function, if we receive the DEBUG_SESSION_ACTIVE session status we know that Windows has started and connected to the debug session. Once the OS has started we create a new thread that waits for hEvent, an unnamed global event. In the new thread we send a timeout value for 10 minutes to the WaitForSingleObject() call. The hEvent will be set when the session status is DEBUG_SESSION_END or DEBUG_SESSION_REBOOT, which means that the server rebooted and disconnected from the debug session. If WaitForSingleObject() returns WAIT_TIMEOUT, indicating that the server has not rebooted after 10 minutes, we send a break to the server via g_Control->SetInterrupt(DEBUG_INTERRUPT_ACTIVE).

STDMETHODIMP_(HRESULT)

RobotEventCallBacks::SessionStatus(

THIS_

IN ULONG Status

)

{

printf("Session Status changed, status = %d \n", Status);

if (Status == DEBUG_SESSION_ACTIVE)

{

//bSessionActive = TRUE;

::CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)TimerThread, NULL, 0, NULL);

}

else if ((Status == DEBUG_SESSION_END) || (Status == DEBUG_SESSION_REBOOT) )

{

//bSessionActive = FALSE;

SetEvent(hEvent);

}

return S_OK;

}

STDMETHODIMP_(HRESULT)

RobotEventCallBacks::Exception(

THIS_

IN PEXCEPTION_RECORD64 Exception,

IN ULONG FirstChance

)

{

printf("Captured exception 0x%x, FirstChance = %d", Exception->ExceptionCode, FirstChance);

if (FirstChance && Exception->ExceptionCode == 0x80000003)

{

bInterrupted = TRUE;

}

return DEBUG_STATUS_NO_CHANGE;

}

void TimerThread()

{

HRESULT Status;

DWORD result;

result = WaitForSingleObject(hEvent,10*60*1000);

if (result == WAIT_OBJECT_0) //the close event is signled within 10 minutes

{

printf("Target reboots within 10 minutes \n");

}

else if (result == WAIT_TIMEOUT) //the close event is not singled within 10 minute, it means the server hangs

{

if ((Status = g_Control->SetInterrupt(DEBUG_INTERRUPT_ACTIVE)) !=S_OK) //force the target to break

{

Exit(1, "SetInterrupt failed, 0x%x\n", Status);

}

Below is our main function. In the main function we need to create the interfaces, and set the callback interfaces g_Client->SetOutputCallbacks and g_Client->SetEventCallbacks. Then we call g_Control->WaitForEvent() in the while(TRUE) loop, which is a typical main loop for a debugger program. If WaitForEvent() returns it means that a debugging event happened, such an exception, and then you can send some debug commands via g_Control->Execute. Here we simply send a “.time” command to record the problematic time. The bInterrupted boolean is set in the RobotEventCallBacks::Exception() function when it detects a first chance 80000003 exception, which means a debugger break-in.

Note: Initially I wanted to send a “.crash” command to crash the server and generate the memory dump. When I tried this I found that although the “.crash” command did trigger a blue screen it failed to generate a memory dump because the problem happened so early in the boot process and the paging file was not initialized yet. So I changed it to just send a “.time” command and performed live debugging after the problem was captured by the tool. Initially I had forgotten to change the printf output, which is why you see the output “Succeeded in capturing the problem and crashing the server!” in the below screenshot.

void __cdecl

main(int argc, char* argv[])

{

HRESULT Status;

hEvent = CreateEvent(NULL, FALSE, FALSE, NULL); //create the global hEvent

CreateInterfaces(); //create the global interfaces

if ((Status = g_Client->SetOutputCallbacks(&g_OutputCb)) != S_OK) //set the output callback interface

{

Exit(1, "SetOutputCallbacks failed, 0x%X\n", Status);

}

if ((Status = g_Client->SetEventCallbacks(&g_EventCb)) != S_OK) //Set the event callback interface

{

Exit(1, "SetEventCallbacks failed, 0x%X\n", Status);

}

Status = g_Client->AttachKernel(DEBUG_ATTACH_KERNEL_CONNECTION, argv[1]); //Attach to the kernel debugging, the argv[1] should be “COM:Port=COM1,baud=115200”

if (Status != S_OK)

{

Exit(1, "AttachKernel failed, 0x%X\n", Status);

}

while(TRUE) //This is the main loop of the debugger program, it calls g_Control->WaitForEvent() repeatedly waiting for the debugging events

{

Status = g_Control->WaitForEvent(DEBUG_WAIT_DEFAULT,

INFINITE); //wait for 10 minutes

printf("WaitForEvent returned. Status = 0x%x\n", Status);

if (Status == S_OK && bInterrupted)

{

if ((Status = g_Control->Execute(DEBUG_OUTCTL_THIS_CLIENT, ".time", DEBUG_EXECUTE_ECHO)) !=S_OK) //do .time to record the time the problem happens

{

Exit(1, "Execute failed, 0x%x\n", Status);

}

Exit(0, "Succeeded in capturing the problem!");

}

Exit(0, NULL);

}

Now that we have the code we can build DBGRobot.exe using the WDK 7.1.0 build environment. Remember to set the following environment variables, as mentioned in the readme.txt under \WinDDK\7600.16385.1\Debuggers\samples

SET DBGLIB_LIB_PATH=C:\WinDDK\7600.16385.1\Debuggers\sdk\lib

SET DBGSDK_INC_PATH=C:\WinDDK\7600.16385.1\Debuggers\sdk\inc

SET DBGSDK_LIB_PATH=C:\WinDDK\7600.16385.1\Debuggers\sdk\lib

Now I have the “robot” watching the problem for me. I set up the live kernel debugging environment as usual and then ran the DBGRobot.exe tool as the debugger instead of windbg. I used the command: dbgrobot.exe com:port=COM1,baudrate=115200.

This is the screenshot when the server starts and reboots normally:

I let the DBGRobot.exe run over night, and in the morning when I came back to the office, the “robot” had done its job perfectly and captured the problem for me! The problematic server was left in the break-in state so I just needed to launch windbg to attach to the server, and then I could do the debugging in windbg as normal.

The cause of the delay turned out to be a hardware problem, but the actual problem is not important here. I hope this article can give you some new ideas about debugging, and using the debug tools in new ways.

↧

We Are Hiring - Windows Server Escalation Engineer Position

July 20, 2010, 8:15 am

≫ Next: Debugging services startup in Svchost from a kernel mode debug session

≪ Previous: I Want A Debugger Robot

We have one remaining Windows Server Escalation Engineer position!

http://blogs.msdn.com/jobsblog/archive/2007/01/26/jobcast-life-as-an-escalation-engineer.aspx

http://blogs.msdn.com/jobsblog/archive/2007/11/21/life-as-an-escalation-engineer-part-ii.aspx

Candidate must be a strong critical thinker, and enjoy solving very difficult problems (often involving code level analysis). Escalation Engineers within Platforms Global Escalation Services are frequently involved with the highest profile issues, and therefore must be able to handle both pressure and complex situations. Candidates must also have strong customer service, accurate logical problem solving and communication skills, and the ability to work in a team environment. Programming and debugging skills are required, preferably in C or C++. Also, the ability to read and analyze network traces and solid protocol analysis is a plus. The ideal candidate may have a four year degree in C.S. or Engineering and a minimum of four years product support experience or the equivalent in work experience. Prior knowledge of Windows Server and other networking products, and/or networking operating systems is required. If you enjoy being the problem-solver in the spotlight of critical problems, then this position will excite and challenge you.

Applicants that possess the experience and desire will be considered, and are encouraged to apply using the below link.

https://careers.microsoft.com/JobDetails.aspx?ss=&pg=0&so=&rw=1&jid=18002&jlang=EN

↧

Debugging services startup in Svchost from a kernel mode debug session

July 20, 2010, 12:36 pm

≫ Next: Debugging a Bugcheck 0x109

≪ Previous: We Are Hiring - Windows Server Escalation Engineer Position

Windows shared services allow us to run system services together in a single service by having multiple DLLs run in a single process called Svchost. This allows Windows to have many services to run with the overhead of a single process. You can find more information about shared services here and here.

I like to do a lot of my user mode debugging from a kernel mode debugger. I can just attach to a Hyper-V VM though a virtual serial port and all my tools are available to me without having to set up a user mode debugger on the target, plus if my application calls into kernel mode I’m all set.

The Svchost process won’t load the DLL until it’s time to start the service, so it can be tricky to set the correct breakpoint to catch the service starting up. This is complicated by the fact that the dll entry point will usually be <module>!ServiceMain, however this is not necessarily the case.

We can leverage the fact that svchost.exe exists in the same shared memory across all the Svchost processes, and a change to this shared memory will affect all of the svchost process. When you set a user mode breakpoint from a kernel mode debugger the debugger changes the user mode code by inserting an int 3 (the debugger changes the code back when it is executed). We can take advantage of this functionality to set one breakpoint that will fire from any svchost process.

If we can get past the Svchost process loading the service dll, we can find the host process main function or any other function we want to set a breakpoint on.

First, find the process address of any Svchost. I’m just showing the first of a long list of processes:

kd> !process 0 0 svchost.exe

PROCESS fffffa80098c5970

SessionId: 0 Cid: 0270 Peb: 7fffffd8000 ParentCid: 0204

DirBase: 35986000 ObjectTable: fffff8a001115b20 HandleCount: 343.

Image: svchost.exe

Now let’s get into the context of this Svchost (it doesn’t matter which one) and set a breakpoint. .process –i sets an invasive breakpoint so we can get into the active context of the process.

kd> .process -i fffffa80098c5970

You need to continue execution (press 'g' <enter>) for the context

to be switched. When the debugger breaks in again, you will be in

the new process context.

Then go the debugger to enter the active process.

kd> g

Break instruction exception - code 80000003 (first chance)

nt!DbgBreakPointWithStatus:

GetServiceMainFunctions() is where Svchost both loads the service DLL and gets the service main function so let’s set a breakpoint for it:

kd> .reload /user

Loading User Symbols

.......................................................

kd> bp svchost!GetServiceMainFunctions

Now I just go the debugger and start my service. I’ll use the workstation service as an example (I had previously stopped the workstation service for this test).

kd> g

At a command prompt on the target I enter:

C:\> net start workstation

In the debugger I get:

Breakpoint 0 hit

svchost!GetServiceMainFunctions:

0033:00000000`ffc813cc fff3 push rbx

kd> k

Child-SP RetAddr Call Site

00000000`0339f7f8 00000000`ffc8128d svchost!GetServiceMainFunctions

00000000`0339f800 000007fe`fe38a82d svchost!ServiceStarter+0xba

00000000`0339f890 00000000`77a5f56d sechost!ScSvcctrlThreadW+0x25

00000000`0339f8c0 00000000`77b93281 kernel32!BaseThreadInitThunk+0xd

00000000`0339f8f0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

svchost!GetServiceMainFunctions is called from Svchost!ServiceStarter. I just leave GetServiceMainFunctions with gu (Go Up). Now the service dll will have been loaded and I can load symbols for wkssvc.dll and get the service main() function.

kd> gu

svchost!ServiceStarter+0xba:

0033:00000000`ffc8128d 488b7c2430 mov rdi,qword ptr [rsp+30h] ss:002b:00000000`0339f830=000007fefb443a7c

kd> k

Child-SP RetAddr Call Site

00000000`0339f800 000007fe`fe38a82d svchost!ServiceStarter+0xba

00000000`0339f890 00000000`77a5f56d sechost!ScSvcctrlThreadW+0x25

00000000`0339f8c0 00000000`77b93281 kernel32!BaseThreadInitThunk+0xd

00000000`0339f8f0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Now loading my user symbols I can get all the symbols for wkssvc.dll

kd> .reload /user

Loading User Symbols

................................................................

kd> dps 00000000`0339f800 l7

00000000`0339f800 00000000`00000000

00000000`0339f808 00000000`00000000

00000000`0339f810 00000000`00000000

00000000`0339f818 00000000`00000000

00000000`0339f820 00000000`00000000

00000000`0339f828 00000000`00000000

00000000`0339f830 000007fe`fb443a7c wkssvc!ServiceMain <- Here is the Service Main function.

I can now set a breakpoint on wkssvc!ServiceMain, and as a bonus I now have access to all available breakpoints in wkssvc.dll

kd> bp wkssvc!ServiceMain

kd> g

Breakpoint 1 hit

wkssvc!ServiceMain:

0033:000007fe`fb443a7c 48895c2408 mov qword ptr [rsp+8],rbx ss:002b:00000000`0339f800={wkssvc!SvchostPushServiceGlobals (000007fe`fb442ff4)}

kd> k

Child-SP RetAddr Call Site

00000000`0339f7f8 00000000`ffc81344 wkssvc!ServiceMain

00000000`0339f800 000007fe`fe38a82d svchost!ServiceStarter+0x1e8

00000000`0339f890 00000000`77a5f56d sechost!ScSvcctrlThreadW+0x25

00000000`0339f8c0 00000000`77b93281 kernel32!BaseThreadInitThunk+0xd

00000000`0339f8f0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Now I can debug the service on startup, not caring what instance of svchost it’s in with full user and kernel mode access.

↧

Debugging a Bugcheck 0x109

August 10, 2010, 3:30 pm

≫ Next: We Are Hiring - Windows Server Escalation Engineer

≪ Previous: Debugging services startup in Svchost from a kernel mode debug session

My name is Nader Khonsari. I am an escalation engineer in Platforms Global Escalation Services. I want to share with you a recent experience where 64-bit Windows Server 2008 servers at a customer location were encountering bugcheck 0x109 blue screen crashes.

In 64-bit versions of the Windows kernel PatchGuard is present. If any driver or application attempts to modify the kernel the PatchGuard will generate the bugcheck (CRITICAL_STRUCTURE_CORRUPTION) mentioned below. PatchGuard protects the kernel from modification by malicious or badly written drivers or software.

To further investigate this bugcheck you need to compare the impacted kernel function with a known reliable one. For instance, if the machine encountering this was running Windows Server 2008 service pack 2 with a post SP2 hotfix kernel you need to compare the impacted kernel function with that of service pack 2 kernel function. Usually you do not need to download and extract the post SP2 hotfix, because the vast majority of the kernel code has not been modified since the service pack.

If you already have service pack 2 for Windows Server 2008 handy, expand the package using instructions included in KB928636:

Windows6.0-KB948465-X64.exe /x

expand.exe -f:* C:\WS08\SP2\windows6.0-kb948465-X64.cab C:\WS08\SP2\Expanded

Locate the kernel binary from the expanded binaries and then open it up with your debugger just like you open a crash memory dump.

windbg -z C:\WS08\SP2\Expanded\amd64_microsoft-windows-os-kernel_31bf3856ad364e35_6.0.6002.18005_none_ca3a763069a24eea\ntoskrnl.exe

This is the bugcheck data from the dump:

CRITICAL_STRUCTURE_CORRUPTION (109)

This bugcheck is generated when the kernel detects that critical kernel code or

data have been corrupted. There are generally three causes for a corruption:

1) A driver has inadvertently or deliberately modified critical kernel code

or data. See http://www.microsoft.com/whdc/driver/kernel/64bitPatching.mspx

2) A developer attempted to set a normal kernel breakpoint using a kernel

debugger that was not attached when the system was booted. Normal breakpoints,

"bp", can only be set if the debugger is attached at boot time. Hardware

breakpoints, "ba", can be set at any time.

3) A hardware corruption occurred, e.g. failing RAM holding kernel code or data.

Arguments:

Arg1: a3a039d89b456543, Reserved

Arg2: b3b7465eedc23277, Reserved

Arg3: fffff80001778470, Failure type dependent information

Arg4: 0000000000000001, Type of corrupted region, can be

0 : A generic data region

1 : Modification of a function or .pdata

2 : A processor IDT

3 : A processor GDT

4 : Type 1 process list corruption

5 : Type 2 process list corruption

6 : Debug routine modification

7 : Critical MSR modification

Next, check the address at Arg3. This will give you the function that was modified, but not the offset of the modified instruction.

3: kd> ln fffff80001778470

(fffff800`01778470) nt!KeSetSystemTime | (fffff800`01778790) nt!BiLoadSystemStore

Exact matches:

nt!KeSetSystemTime = <no type information>

Unassemble the same function in the SP2 kernel binary you expanded from the SP2 package. Do the same with the function of the crashed kernel and compare the two. You will find the modified opcode compared to that of the unmodified kernel.

Below is the comparison of the nt!KeSetSystemTime code of the crashed kernel and that of the service pack 2 kernel respectively. They match fine except for the highlighted byte in the prefetch instruction which has been overwritten with a 0x1f. This changed the instruction to a nop, which is done to prevent the prefetch operation from occurring on processors that don't support prefetch.

nt!KeSetSystemTime+0x156:

fffff800`017785c6 0f1f0f nop dword ptr [rdi]

fffff800`017785c9 488b07 mov rax,qword ptr [rdi]

fffff800`017785cc 493bc7 cmp rax,r15

fffff800`017785cf 7516 jne nt!KeSetSystemTime+0x177

ntoskrnl!KeSetSystemTime+0x156:

00000001`4012e5b6 0f0d0f prefetchw [rdi]

00000001`4012e5b9 488b07 mov rax,qword ptr [rdi]

00000001`4012e5bc 493bc7 cmp rax,r15

00000001`4012e5bf 7516 jne

After further investigation this turned out to be a known issue in the VMware environment when the VM is moved from a non-prefetch to a prefetch architecture and even then, only in a live-migration case. The issue is documented on VMWare's site at http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1008749&sliceId=1&docTypeID=DT_KB_1_1&dialogID=74787167&stateId=0 .

↧

We Are Hiring - Windows Server Escalation Engineer

August 13, 2010, 3:41 pm

≫ Next: Expanding Netmon to aid in debugging

≪ Previous: Debugging a Bugcheck 0x109

Would you like to join the world’s best and most elite debuggers to enable the success of Microsoft solutions?

As a trusted advisor to our top customers you will be working with to the most experienced IT professionals and developers in the industry. You will influence our product teams in sustained engineering efforts to drive improvements in our products.

This role involves deep analysis of product source code and debugging to solve problems in multi-million dollar configurations and will give you an opportunity to stretch your critical thinking skills. During the course of debugging, you may uncover opportunities to improve the customer experience while influencing the current and future design of our products.

In addition to providing support to customers while being the primary interface to our sustained engineering teams, you will also have the opportunity to work with new technologies and unreleased software. Through our continuous investment in depth training and hands-on experience with tough customer challenges you will become the world’s best in this area. Expect to partner with many various roles at Microsoft launching a very successful career!

Learn more at:

http://blogs.msdn.com/jobsblog/archive/2007/01/26/jobcast-life-as-an-escalation-engineer.aspx

http://blogs.msdn.com/jobsblog/archive/2007/11/21/life-as-an-escalation-engineer-part-ii.aspx

Apply here:

https://careers.microsoft.com/JobDetails.aspx?ss=&pg=0&so=&rw=1&jid=18002&jlang=EN

↧

Expanding Netmon to aid in debugging

August 26, 2010, 3:03 pm

≫ Next: Debug Sleuth at Work: Hung Server…..Mystery of the unprocessed SMB work item.

≪ Previous: We Are Hiring - Windows Server Escalation Engineer

A walk-through of creating a Netmon parser in the context of a real case

As is obvious to frequent readers of our blog, our team logs a lot of time in our debugger of choice (for some windbg, for others kd). However, a debugger is not always the best tool for the job, and sometimes even the best tool has limitations. I found this to especially true when working a few Internet Printing Protocol (IPP) cases recently.

Probably the biggest challenge of many IPP cases is the mixed environments you usually find IPP running in. The benefit customers see in IPP over other print providers is it works natively or with minimal configuration on Windows, Mac, and Linux clients. This makes it popular in places like college computer labs, where there isn’t one standard client system. Unfortunately, this also means that we can’t really debug both sides of the communication when something goes wrong.

In a recent case, a customer was having problems printing from their Windows Vista clients to a Linux Common Unix Printing System (CUPS) IPP server. If the CUPS server was set to allow anonymous connections, everything worked perfectly. When the administrator enabled authentication, he found that most print jobs failed to print. After a bit more testing, he found that small jobs (perhaps a page of printed text) worked fine, but larger, multi-page documents failed.

For situations like this, I prefer to use a network trace to get a feeling for where the problem is occurring. The problem was – IPP wasn’t one of the protocols built in to Netmon (and I find Wireshark’s IPP parser to not always work well – especially with Windows clients/servers). I decided that the amount of time it would take to decode the traffic by hand could be better spent creating a Netmon IPP parser that I could use every time I ran into one of these issues.

One of the great things about Netmon is you can view the source of every included parser. This was a big help, as I hadn’t written a parser before. [Note: all steps noted are written using Netmon 3.4.There might be slight differences in Netmon 3.3.] To do this, open Netmon and click the Parsers tab. Under Object View, expand parser files and double click any of the .npl files. The source will appear on the right.

The language for Netmon parsers is similar to C++, with a limited set of statements. These are all documented in the Netmon help file, but the ones I found useful are described below. To begin, I started by defining a number of tables. The basic idea of a table is to provide a way to convert a value to a string. For example, one field in an IPP packet is the status of a printer, which is represented by an integer. In order to allow Netmon to show printer states in a readable form, I created a table to convert the values as seen in Figure 1 below.

Table IPPPrinterState //2911-4.4.11

{

switch(value)

{

case 3 : "idle";

case 4 : "processing";

case 5 : "stopped";

default : "Unknown Code";

}

Figure 1: Netmon Table

Each table is defined with the Table keyword, followed by a name for the table. It may optionally be followed by a list of parameters, which I’ll use later. In this case, I added a comment that specified which RFC and section this information comes from. A table consists of a switch statement with a case for each value, and a default for all other cases, much like other programming languages. I created tables like IPPPrinterState for each field that could be represented in an IPP packet from information I found in each of IPP’s RFCs.

Once the tables were complete, I moved on to creating the Protocol portion of the parser. This section of the code provides the logic that iterates through a packet and calls the tables for the appropriate data. This section starts with either the RegisterBefore or RegisterAfter keyword. These are used to determine how your parser is called. Essentially, Netmon takes all of the parsers it has, and compiles them into one continuous binary (.npb) and registration tells Netmon where your parser fits. For my case, I used the following registration code.

[ RegisterAfter (HTTPPayloadData.OCSP, Ipp, Property.HTTPContentType.Contains("application/ipp")) ]

This tells Netmon that, when compiling the parser, it should insert my code right after the code for the OCSP protocol in its HTTPPayloadData parser, my protocol should be called IPP, and it should enter my code path if the HTTP Payload is of content type “application/ipp”. This allows my parser to work a bit differently than the Wireshark IPP parser – Wireshark uses a port number (631) to identify IPP traffic, whereas my code looks at HTTP content types. The advantage of this, for me, is that Windows servers use port 80 for IPP by default, not 631, so in cases with a Windows IPP server, this parser should correctly identify the packets. You may be wondering how or why I chose to register after OCSP. Basically, I knew I needed my code to be registered in the section of code where HTTP does its payload content type processing. So I opened up HTTP’s parser, and searched for the content type analysis. OCSP was the first protocol I found in HTTP’s content type logic, so I used that as the place to insert my protocol.

After the registration comes the Protocol statement. I chose the following.

Protocol IPP = FormatString("Status/OpCode = %s", IPPOperationsSupported(OpID))

This names my protocol IPP and specifies that I want the description of the protocol to display the IPP status code. This way, a user doesn’t need to drill down to find out if this is a print job or a printer status request. You’ll notice FormatString is a function in Netmon that is similar to printf. In this case, I am passing a variable (OpID, which is defined lower in my code) to my IPPOperationsSupported table to determine what this OpCode means. Before I had a parser, I would need to look up the operations supported values in the IPP RFC for each packet I examined.

Next is the body of the protocol. Basically, this consists of a series of fields (like variables) that define how a packet is laid out. Creating a field is similar to declaring a variable in C++. You start by choosing a data type that matches the size of the data in the packet and provide a name for that field. For example, Figure 2 shown below contains the first seven lines of my Protocol.

struct Version = FormatString("%i.%i", Major, Minor)

{

INT8 Major;

INT8 Minor;

}

INT16 OpId = FormatString("%s (0x%X)", IPPOperationsSupported(this), this);

INT32 RequestId;

Figure 2: Code in the protocol block

The IPP specification states that all packets begin with two 8-bit values, the first value specifies the major protocol version in use, and the second value specifies the minor. In this case, I wrapped both in a struct so Netmon will display them as “Version: 1.0”, instead of separately as “Major: 1” “Minor: 0” on two lines. After the version is a 16-bit field that specifies the operation requested (for example, print-job or get-printer-state). I choose to display this value by looking it up in the IPPOperationsSupported table, then printing it as the string, followed by the hex value (e.g. “Get-Printer-Attributes (0xB)”). The ‘this’ keyword simply uses the value of the current field, which in this case is the OpId. Even though Netmon parses through the packet sequentially, this kind of use of a Field before its value is retrieved is allowed. Finally, I set the RequestId field, which is a 32-bit int value. Since this field is just a transaction ID for this conversation, I don’t need to do any formatting to it.

After that, things got a little more complicated. IPP’s structure allows for a variable number of attribute groups, each of which can contain a variable number of attributes. For example, in response to the request “Get-Printer-Attributes” from the client, the server responds with the Printer Attributes group, which contains a number of attributes like printer-state, queued-job-count, and so on. First, I needed to deal with the attribute groups in a loop until I’d read each one. IPP specifies that the end of the attribute groups is specified with the value of 0x03, so I wrote a while loop to create attribute groups until FrameData[FrameOffset] is equal to 3 (See Figure 3 below). FrameData and FrameOffset are special values provided by Netmon. FrameData is an array of the entire contents of the frame, and FrameOffset is Netmon’s current location in the packet. I use this instead of declaring a field here because referencing FrameData[FrameOffset] does not advance the parser frame location. This is important because I want to consume that value further down.

Inside that loop, I declared another struct that contains an attribute group. Much like the Protocol IPP line above, we reference a field here that will be declared lower down. This line does not advance the FrameOffset, since we don’t declare a field here. The first line of this struct is the field declaration line that finally consumes the attribute group tag. Below that is another While loop to process all attributes in the attribute group. IPP differentiates between attributes and attribute groups by making all attribute group identifiers smaller than 0x10, and all attribute identifiers 0x10 or higher. I use this as the condition for my loop. Finally, I declare an Attribute struct inside this loop. This struct is displayed after looking up how to properly print based on the Attribute Name and Value in the AttribDisplayTable.

IPP declares attributes as an 8-bit type identifier (int, bool, string, etc.), a 16-bit int specifying the attribute name’s length, the name (a string), a 16-bit in value length, and a value. Since I want to look up the value in various tables, depending on the Attribute Name, I store the Attribute Name as “AttName” in a property. This way, I can continue to reference it while processing continues. Properties are declared in brackets just above the field they will store. In my case, I prepend the ‘Post.’ evaluation keyword to the property name. This instructs Netmon to use the end result of the next line as its value, but before advancing the FrameOffset. I do this again for the actual value, which I call Val. If I did not use the Post evaluation keyword, Val would contain the unsigned int32 value of printer state, instead of the formatted string result I get by looking up printer state in its table.

While [FrameData[FrameOffset] != 0x03]

{

struct AttributeGroup = FormatString("%s", IPPTags(TagGroup))

{

INT8 TagGroup = FormatString("%s (0x%X)", IPPTags(this), this);

While [FrameData[FrameOffset] >= 0x10]

{

struct Attribute = AttribDisplayTable(AttName, Val)

{

INT8 Type = FormatString("%s (0x%X)", IPPTags(this), this);

INT16 NameLength;

[Post.AttName]

AsciiString(NameLength) AttributeName;

INT16 ValueLength;

switch(AttName)

{

case "printer-state" :

[Post.Val]

UINT32 PrinterState = FormatString("%s (0x%X)", IPPPrinterState(this),this);

...

Figure 3: Loops in protocol block

My case statements continue like printer-state for all possible attributes of IPP. At the very end of the protocol block, after I’ve closed my switch, structs, and whiles, I have one more line, which consumes any data remaining in the packet. This would contain document data if the packet was a print job, and is required so all the packet data is consumed before Netmon moves on to the next packet. That line is:

BLOB (FrameLength - FrameOffset) Data;

As you can see, it is a binary blob data type, set to the size of the frame, less our current location.

Finally, after my Protocol block, I needed to define my own data type. IPP defines its own data format to specify printer uptime, so I created a data type for it as shown below in Figure 4.

//Uptime format spec

Number UPTIME

{

Size = 4;

DisplayFormat = (this != 0) ? FormatString("%i days %02i:%02i:%02i (%i seconds)",

(this/86400), //Days

(this-((this/86400)*86400))/3600, //Hours

(this-(((this/86400)*86400)+(((this-((this/86400)*86400))/3600)*3600)))/60, //Minutes

(this%60), //Seconds

this) : "0"

}

Figure 4: Custom data type

The first line of Figure 4 specifies this will be a data type composed of numeric data named UPTIME. Size specifies how many bytes the type uses. DisplayFormat is what Netmon displays for this type. In this case, I use the x ? y : z syntax. Netmon doesn’t have if/then/else keywords, but instead uses this ternary operator. I use a special case for 0 since it seems to be a common return value in the traces I’ve looked at, and having ‘Uptime: 0 days 00:00:00 (0 seconds)’ seemed excessive.

Figures 5 and 6 below show what the result looks like in Netmon.

Figure 5: Frame Summary

Figure 6: Frame Detail

So what did the trace show? Windows attempts to send IPP requests with no authentication first, then if it receives an access denied, retries with authentication. This is by design, as the IPP server replies with the authentication types it supports in the access denied message. In the case of print jobs that are too large to fit in a single packet, IPP’s spec allows servers to either issue the access denied message as soon as it receives the first packet, or after it has received the entire job. It turns out that the IPP Print Provider on Windows was designed to send the entire job before listening for a response, so it missed the access denied message that CUPS sent after it received the first packet. See http://support.microsoft.com/kb/976988/ for related information. Want a copy of the IPP parser? It will be included in a future release of the Netmon Parser Update.

I hope this post have given you a better idea of how Netmon works, how IPP works, and helps if you ever need to write a parser for your protocol.

-Matt Burrough

↧

Debug Sleuth at Work: Hung Server…..Mystery of the unprocessed SMB work item.

September 30, 2010, 1:40 pm

≫ Next: New Joint Citrix/Microsoft Facebook Forum

≪ Previous: Expanding Netmon to aid in debugging

Hello folks, Pushkar here. I recently worked on a case where the server was hung at “Applying User Settings” during the logon phase. You might ask what’s going to be new in this post, NTDebugging has bunch of posts covering such debug scenarios J. In my defense, this case was particularly interesting because it defied the general logic that I had about processing of network connection request. I am still a newbie in this area getting my feet wet in the arena of debugging. I had to seek help from one of the GES guru Aman Jain. He is an expert in the Windows Networking Stack. He helped me in confirming my suspicion.

As we were stuck at “Applying User Settings”, the obvious path was to look at the activity of Winlogon to understand why is it stuck? I could see that there were seven logon requests which haven’t been processed yet.

1: kd> !session

Sessions on machine: 7

Valid Sessions: 0 1 2 3 4 5 6

1: kd> !process 0 0 winlogon.exe

PROCESS fffffa806e132c10

SessionId: 1 Cid: 0384 Peb: 7fffffdf000 ParentCid: 0298

DirBase: db59d000 ObjectTable: fffff88007fd0430 HandleCount: 139.

Image: winlogon.exe

<snip>

The above command returns 7 instances of Winlogon.exe, one instance per session. I picked one of the sessions as all the 7 sessions were in the same state.

As this was a Windows 2008 Server, winlogon would open an LPC Channel with Logonui.exe to process the user’s logon and then it will call into Local Session Manager (LSM.EXE) to spawn the users’ session. I could see that winlogon is waiting on message reply from Logonui.exe.

1: kd> !thread fffffa80`73d3d770

THREAD fffffa8073d3d770 Cid 0384.1e2c Teb: 000007fffffd7000 Win32Thread: 0000000000000000 WAIT: (WrLpcReply) UserMode Non-Alertable

fffffa8073d3db00 Semaphore Limit 0x1

Waiting for reply to ALPC Message fffff8801ffd9cf0 : queued at port fffffa804e9d9370 : owned by process fffffa806e134c10

Not impersonating

DeviceMap fffff880000073a0

Owning Process fffffa806e132c10 Image: winlogon.exe

Attached Process N/A Image: N/A

Wait Start TickCount 4192487 Ticks: 1882 (0:00:00:29.359)

Context Switch Count 2

UserTime 00:00:00.000

KernelTime 00:00:00.000

Win32 Start Address 0x0000000076f67cb0

Stack Init fffffa60110a9db0 Current fffffa60110a9670

Base fffffa60110aa000 Limit fffffa60110a4000 Call 0

Priority 13 BasePriority 13 PriorityDecrement 0 IoPriority 2 PagePriority 5

Child-SP RetAddr Call Site

fffffa60`110a96b0 fffff800`01ebb6fa nt!KiSwapContext+0x7f

fffffa60`110a97f0 fffff800`01eb035b nt!KiSwapThread+0x13a

fffffa60`110a9860 fffff800`01ee54e2 nt!KeWaitForSingleObject+0x2cb

fffffa60`110a98f0 fffff800`0212ef34 nt!AlpcpSignalAndWait+0x92

fffffa60`110a9980 fffff800`0212b9c6 nt!AlpcpReceiveSynchronousReply+0x44

fffffa60`110a99e0 fffff800`0211b52f nt!AlpcpProcessSynchronousRequest+0x24f

fffffa60`110a9b00 fffff800`01eb8ef3 nt!NtAlpcSendWaitReceivePort+0x19f

fffffa60`110a9bb0 00000000`76f9756a nt!KiSystemServiceCopyEnd+0x13

00000000`01f8f098 00000000`00000000 0x76f9756a

1: kd> !alpc /m fffff8801ffd9cf0

Message @ fffff8801ffd9cf0

MessageID : 0x0464 (1124)

CallbackID : 0x2844F0 (2639088)

SequenceNumber : 0x0000000B (11)

Type : LPC_REQUEST

DataLength : 0x0064 (100)

TotalLength : 0x008C (140)

Canceled : No

Release : No

ReplyWaitReply : No

Continuation : Yes

OwnerPort : fffffa80481ba9b0 [ALPC_CLIENT_COMMUNICATION_PORT]

WaitingThread : fffffa8073d3d770

QueueType : ALPC_MSGQUEUE_PENDING

QueuePort : fffffa804e9d9370 [ALPC_CONNECTION_PORT]

QueuePortOwnerProcess : fffffa806e134c10 (LogonUI.exe)

ServerThread : fffffa804f96d5a0 <----------- LPC Message Server Thread

QuotaCharged : No

CancelQueuePort : 0000000000000000

CancelSequencePort : 0000000000000000

CancelSequenceNumber : 0x00000000 (0)

ClientContext : 0000000000243d00

ServerContext : 0000000000000000

PortContext : 0000000001c27600

CancelPortContext : 0000000000000000

SecurityData : 0000000000000000

1: kd> !thread fffffa804f96d5a0

THREAD fffffa804f96d5a0 Cid 01ac.09fc Teb: 000007fffffd7000 Win32Thread: fffff900c067dd50 WAIT: (UserRequest) UserMode Non-Alertable

fffffa80516be920 NotificationEvent

Not impersonating

DeviceMap fffff880000073a0

Owning Process fffffa806e134c10 Image: LogonUI.exe

Attached Process N/A Image: N/A

Wait Start TickCount 4192487 Ticks: 1882 (0:00:00:29.359)

Context Switch Count 13 LargeStack

UserTime 00:00:00.000

KernelTime 00:00:00.000

Win32 Start Address 0x000007fefed47780

Stack Init fffffa60114e4db0 Current fffffa60114e4940

Base fffffa60114e5000 Limit fffffa60114df000 Call 0

Priority 14 BasePriority 13 PriorityDecrement 0 IoPriority 2 PagePriority 5

Child-SP RetAddr Call Site

fffffa60`114e4980 fffff800`01ebb6fa nt!KiSwapContext+0x7f

fffffa60`114e4ac0 fffff800`01eb035b nt!KiSwapThread+0x13a

fffffa60`114e4b30 fffff800`02119608 nt!KeWaitForSingleObject+0x2cb

fffffa60`114e4bc0 fffff800`01eb8ef3 nt!NtWaitForSingleObject+0x98

fffffa60`114e4c20 00000000`76f96d5a nt!KiSystemServiceCopyEnd+0x13

00000000`0444ef18 00000000`00000000 0x76f96d5a

This server thread is processing a message, but we can’t tell what it is waiting for. If this thread was finished processing a message we would have seen “WAIT: (WrLpcReceive)” for the thread wait reason. Often this type of wait is for a critical section or other synchronization object that is waiting for another thread in the process. This begged an investigation of Logonui.exe, to see what the other threads were doing.

I noticed that another Logonui.exe thread is waiting on LPC Message processing by a thread in LSM.EXE.

1: kd> !thread fffffa80`4e9e5060

THREAD fffffa804e9e5060 Cid 01ac.0438 Teb: 000007fffffd9000 Win32Thread: fffff900c06db2e0 WAIT: (WrLpcReply) UserMode Non-Alertable

fffffa804e9e53f0 Semaphore Limit 0x1

Waiting for reply to ALPC Message fffff8801ffaacf0 : queued at port fffffa804e9b6060 : owned by process fffffa806e130c10

Not impersonating

DeviceMap fffff880000073a0

Owning Process fffffa806e134c10 Image: LogonUI.exe

Attached Process N/A Image: N/A

Wait Start TickCount 4192488 Ticks: 1881 (0:00:00:29.343)

Context Switch Count 165 LargeStack

UserTime 00:00:00.000

KernelTime 00:00:00.000

Win32 Start Address 0x000007fefbaa151c

Stack Init fffffa6008d26db0 Current fffffa6008d26670

Base fffffa6008d27000 Limit fffffa6008d20000 Call 0

Priority 15 BasePriority 13 PriorityDecrement 0 IoPriority 2 PagePriority 5

Child-SP RetAddr Call Site

fffffa60`08d266b0 fffff800`01ebb6fa nt!KiSwapContext+0x7f

fffffa60`08d267f0 fffff800`01eb035b nt!KiSwapThread+0x13a

fffffa60`08d26860 fffff800`01ee54e2 nt!KeWaitForSingleObject+0x2cb

fffffa60`08d268f0 fffff800`0212ef34 nt!AlpcpSignalAndWait+0x92

fffffa60`08d26980 fffff800`0212b9c6 nt!AlpcpReceiveSynchronousReply+0x44

fffffa60`08d269e0 fffff800`0211b52f nt!AlpcpProcessSynchronousRequest+0x24f

fffffa60`08d26b00 fffff800`01eb8ef3 nt!NtAlpcSendWaitReceivePort+0x19f

fffffa60`08d26bb0 00000000`76f9756a nt!KiSystemServiceCopyEnd+0x13

00000000`0272ea78 00000000`00000000 0x76f9756a

At this point I decided to follow the LPC message chain trail to see where it ends and who is ultimately blocking it.

1: kd> !alpc /m fffff8801ffaacf0

Message @ fffff8801ffaacf0

MessageID : 0x0474 (1140)

CallbackID : 0x2844F6 (2639094)

SequenceNumber : 0x0000000D (13)

Type : LPC_REQUEST

DataLength : 0x0058 (88)

TotalLength : 0x0080 (128)

Canceled : No

Release : No

ReplyWaitReply : No

Continuation : Yes

OwnerPort : fffffa804ea03c80 [ALPC_CLIENT_COMMUNICATION_PORT]

WaitingThread : fffffa804e9e5060

QueueType : ALPC_MSGQUEUE_PENDING

QueuePort : fffffa804e9b6060 [ALPC_CONNECTION_PORT]

QueuePortOwnerProcess : fffffa806e130c10 (lsm.exe)

ServerThread : fffffa805180c700 <---- LPC Server thread for this message in LSM.EXE

QuotaCharged : No

CancelQueuePort : 0000000000000000

CancelSequencePort : 0000000000000000

CancelSequenceNumber : 0x00000000 (0)

ClientContext : 0000000001cf0e90

ServerContext : 0000000000000000

PortContext : 0000000001dc9df0

CancelPortContext : 0000000000000000

SecurityData : 0000000000000000

I noticed that the server thread which is supposed to process the LPC message for above thread is itself waiting on a message reply from a thread in Svchost.exe. LSM had bunch of other threads which were waiting on LPC message processing from server threads in Svchost.exe. I dumped out all the threads and their LPC message structure to locate the server threads in Svchost.exe which seems to be doing some activity and found that only for the thread below the server thread had some interesting information as other LPC Server threads in svchost.exe were Notification Timers.

1: kd> !thread fffffa80`50910060

THREAD fffffa8050910060 Cid 02e0.1a70 Teb: 000007fffffda000 Win32Thread: 0000000000000000 WAIT: (WrLpcReply) UserMode Non-Alertable

fffffa80509103f0 Semaphore Limit 0x1

Waiting for reply to ALPC Message fffff8801fe00cf0 : queued at port fffffa804e876b60 : owned by process fffffa804ea4fc10

Not impersonating

DeviceMap fffff880000073a0

Owning Process fffffa806e130c10 Image: lsm.exe

Attached Process N/A Image: N/A

Wait Start TickCount 3187894 Ticks: 1006475 (0:04:21:41.110)

Context Switch Count 97

UserTime 00:00:00.000

KernelTime 00:00:00.000

Win32 Start Address 0x000007fefed47780

Stack Init fffffa60125b1db0 Current fffffa60125b1670

Base fffffa60125b2000 Limit fffffa60125ac000 Call 0

Priority 9 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5

Kernel stack not resident.

Child-SP RetAddr Call Site

fffffa60`125b16b0 fffff800`01ebb6fa nt!KiSwapContext+0x7f

fffffa60`125b17f0 fffff800`01eb035b nt!KiSwapThread+0x13a

fffffa60`125b1860 fffff800`01ee54e2 nt!KeWaitForSingleObject+0x2cb

fffffa60`125b18f0 fffff800`0212ef34 nt!AlpcpSignalAndWait+0x92

fffffa60`125b1980 fffff800`0212b9c6 nt!AlpcpReceiveSynchronousReply+0x44

fffffa60`125b19e0 fffff800`0211b52f nt!AlpcpProcessSynchronousRequest+0x24f

fffffa60`125b1b00 fffff800`01eb8ef3 nt!NtAlpcSendWaitReceivePort+0x19f

fffffa60`125b1bb0 00000000`76f9756a nt!KiSystemServiceCopyEnd+0x13

00000000`0205e538 00000000`00000000 0x76f9756a

I looked at the server thread for this LPC message and that was also waiting on a notification event doing nothing. It again begged an investigation of the state of Svchost.exe to find out why the server thread hasn’t replied yet?

1: kd> !alpc /m fffff8801fe00cf0

Message @ fffff8801fe00cf0

MessageID : 0x03C4 (964)

CallbackID : 0x2140C2 (2179266)

SequenceNumber : 0x00000021 (33)

Type : LPC_REQUEST

DataLength : 0x00C0 (192)

TotalLength : 0x00E8 (232)

Canceled : No

Release : No

ReplyWaitReply : No

Continuation : Yes

OwnerPort : fffffa804ef15d30 [ALPC_CLIENT_COMMUNICATION_PORT]

WaitingThread : fffffa8050910060

QueueType : ALPC_MSGQUEUE_PENDING

QueuePort : fffffa804e876b60 [ALPC_CONNECTION_PORT]

QueuePortOwnerProcess : fffffa804ea4fc10 (svchost.exe)

ServerThread : fffffa8051435940

QuotaCharged : No

CancelQueuePort : 0000000000000000

CancelSequencePort : 0000000000000000

CancelSequenceNumber : 0x00000000 (0)

ClientContext : 00000000021a0f60

ServerContext : 0000000000000000

PortContext : 00000000033834d0

CancelPortContext : 0000000000000000

SecurityData : 0000000000000000

1: kd> !thread fffffa8051435940

THREAD fffffa8051435940 Cid 0574.1ea0 Teb: 000007fffffd3000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Alertable

fffffa601086d3e8 SynchronizationEvent

IRP List:

fffffa80514558b0: (0006,03a0) Flags: 00000884 Mdl: 00000000

Impersonation token: fffff8801fc4b050 (Level Impersonation)

DeviceMap fffff8800e7c50e0

Owning Process fffffa804ea4fc10 Image: svchost.exe

Attached Process N/A Image: N/A

Wait Start TickCount 3187932 Ticks: 1006437 (0:04:21:40.517)

Context Switch Count 111

UserTime 00:00:00.015

KernelTime 00:00:00.000

Win32 Start Address 0x000007fefed47780

Stack Init fffffa601086ddb0 Current fffffa601086d040

Base fffffa601086e000 Limit fffffa6010868000 Call 0

Priority 10 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5

Child-SP RetAddr Call Site

fffffa60`1086d080 fffff800`01ebb6fa nt!KiSwapContext+0x7f

fffffa60`1086d1c0 fffff800`01eb035b nt!KiSwapThread+0x13a

fffffa60`1086d230 fffff800`020d2a2e nt!KeWaitForSingleObject+0x2cb

fffffa60`1086d2c0 fffff800`020d2a87 nt!FsRtlCancellableWaitForMultipleObjects+0x62

fffffa60`1086d320 fffffa60`015da75a nt!FsRtlCancellableWaitForSingleObject+0x27

fffffa60`1086d360 fffffa60`015d8588 mup!MupLocateUncProviderForPath+0x40a

fffffa60`1086d4a0 fffffa60`015d8130 mup!MupiCallUncProvider+0xe8

fffffa60`1086d510 fffffa60`015d90ef mup!MupStateMachine+0x120

fffffa60`1086d560 fffffa60`00f31e91 mup!MupCreate+0x2c3

fffffa60`1086d5e0 fffffa60`00f4e26c fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x211

fffffa60`1086d650 fffff800`021267d9 fltmgr!FltpCreate+0x25d

fffffa60`1086d700 fffff800`0211ecb2 nt!IopParseDevice+0x5f9

fffffa60`1086d8a0 fffff800`0211f842 nt!ObpLookupObjectName+0x593

fffffa60`1086d9b0 fffff800`02124e47 nt!ObOpenObjectByName+0x2f2

fffffa60`1086da80 fffff800`0212ecc8 nt!IopCreateFile+0x287

fffffa60`1086db20 fffff800`01eb8ef3 nt!NtCreateFile+0x78

fffffa60`1086dbb0 00000000`76f9726a nt!KiSystemServiceCopyEnd+0x13

00000000`2674d738 00000000`00000000 0x76f9726a

Bingo! I found two threads in Svchost.exe which showed some activity and were blocked for over 4 hours on a UNC Path access. From the IRP which was issued by the thread above I found out that it was attempting to contact one of the DCs.

1: kd> !irp 0xfffffa80`514558b0

Irp is active with 4 stacks 3 is current (= 0xfffffa8051455a10)

No Mdl: No System Buffer: Thread fffffa8051435940: Irp stack trace.

cmd flg cl Device File Completion-Context

[ 0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000

[ 0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000

>[ 0, 0] 0 e0 fffffa8031d1a840 fffffa805145aec0 fffffa6000f332b0-fffffa8047b6c930 Success Error Cancel

\FileSystem\Mup fltmgr!FltpSynchronizedOperationCompletion

Args: fffffa601086d848 01000040 00030000 00000000

[ 0, 0] 0 0 fffffa8031d1b880 fffffa805145aec0 00000000-00000000

\FileSystem\FltMgr

Args: fffffa601086d848 01000040 00030000 00000000

1: kd> dt nt!_FILE_OBJECT fffffa805145aec0 FileName

+0x058 FileName : _UNICODE_STRING "\DC005.contoso.com\PIPE\lsarpc"

For the details on MUP you can refer to Support for UNC Naming and MUP.

At first my thought was that something is wrong with the network stack because we are waiting on domain controller to respond. The thought was that either we had a networking issue or the Domain Controller wasn’t responding, but redirector would have timed out as it won’t wait for over 4 hours for an unresponsive connection. I started digging into the source code and found that we build a query context in MupLocateUncProviderForPath () function and issue another IRP which is sent down the I/O Stack. I searched for other IRPs which was issued by the above thread and found two IRPs.

1: kd> !irpfind 0 0 thread fffffa8051435940

Looking for IRPs with thread == fffffa8051435940

Scanning large pool allocation table for Tag: Irp? (fffffa804ea83000 : fffffa804ed83000)

Searching NonPaged pool (fffffa8030605000 : fffffa8c27e00000) for Tag: Irp?

Irp [ Thread ] irpStack: (Mj,Mn) DevObj [Driver] MDL Process

fffffa80514558b0 [fffffa8051435940] irpStack: ( 0, 0) fffffa8031d1a840 [ \FileSystem\Mup]

fffffa8073bc65e0 [fffffa8051435940] irpStack: ( e, 0) fffffa804e842610 [ \FileSystem\mrxsmb]

1: kd> !irp fffffa8073bc65e0

Irp is active with 2 stacks 1 is current (= 0xfffffa8073bc66b0)

No Mdl: No System Buffer: Thread fffffa8051435940: Irp stack trace.

cmd flg cl Device File Completion-Context

>[ e, 0] 0 e1 fffffa804e842610 fffffa804e843f20 fffffa60015d4024-fffffa601086d3b0 Success Error Cancel pending

\FileSystem\mrxsmb mup!MupiQueryPathCompletionRoutine

Args: 00000004 00000050 00140193 fffffa601086d408

[ 0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000

This is the IRP that MUP has issued an IOCTL IOCTL_REDIR_QUERY_PATH to the redirector. If the status returned by the driver is anything other than STATUS_PENDING then it will wait by calling FsRtlCancellableWaitForSingleObject.

The 0xfffffa8051435940 thread doesn’t have any redirector function in its stack, which made me suspicious that there could be some other thread which is already working on similar operation and redirector is currently busy processing that request. I started looking for a worker thread which would be calling the redirector.

1: kd> !stacks 0 mrxsmb

Proc.Thread .Thread Ticks ThreadState Blocker

[fffffa804e9c7c10 svchost.exe]

248.000e84 fffffa804ee72bb0 ffe0d0ad Blocked nt!FsRtlCancellableWaitForMultipleObjects+0x62

[fffffa804e9e12e0 svchost.exe]

404.000f18 fffffa804edad4f0 ffe03e7e Blocked nt!FsRtlCancellableWaitForMultipleObjects+0x62

The above command will parse through all the kernel stacks for all the process and dump out one line summary info for the threads where it finds reference to that module in its stack. I have snipped the output to keep things in context.

One of the sample threads where redirector is in action is shown below.

1: kd> !thread fffffa804ee72bb0

THREAD fffffa804ee72bb0 Cid 0248.0e84 Teb: 000007fffffa8000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Alertable

fffffa8048f6c6b8 SynchronizationEvent

IRP List:

fffffa8073ac3980: (0006,03a0) Flags: 00000884 Mdl: 00000000

fffffa807257fc60: (0006,03a0) Flags: 00000884 Mdl: 00000000

fffffa80511627b0: (0006,03a0) Flags: 00000884 Mdl: 00000000

Impersonation token: fffff8800d515c40 (Level Impersonation)

Owning Process fffffa804e9c7c10 Image: svchost.exe

Attached Process N/A Image: N/A

Wait Start TickCount 2043731 Ticks: 2150638 (0:09:19:10.167)

Context Switch Count 893

UserTime 00:00:00.031

KernelTime 00:00:00.109

Win32 Start Address 0x000007fefb96c534

Stack Init fffffa60125a1db0 Current fffffa60125a0ab0

Base fffffa60125a2000 Limit fffffa601259c000 Call 0

Priority 10 BasePriority 1 PriorityDecrement 0 IoPriority 2 PagePriority 5

Child-SP RetAddr Call Site

fffffa60`125a0af0 fffff800`01ebb6fa nt!KiSwapContext+0x7f

fffffa60`125a0c30 fffff800`01eb035b nt!KiSwapThread+0x13a

fffffa60`125a0ca0 fffff800`020d2a2e nt!KeWaitForSingleObject+0x2cb

fffffa60`125a0d30 fffff800`020d2a87 nt!FsRtlCancellableWaitForMultipleObjects+0x62

fffffa60`125a0d90 fffffa60`0f625c14 nt!FsRtlCancellableWaitForSingleObject+0x27

fffffa60`125a0dd0 fffffa60`0f634e9e rdbss!RxWaitForStableCondition+0x11c

fffffa60`125a0e10 fffffa60`0f63599f rdbss!RxFindOrCreateConnections+0x572

fffffa60`125a0ef0 fffffa60`0f627318 rdbss!RxConstructVirtualNetRoot+0xb7

fffffa60`125a0f90 fffffa60`0f63815a rdbss!RxFindOrConstructVirtualNetRoot+0x594

fffffa60`125a1100 fffffa60`0f627eee rdbss!RxCreateTreeConnect+0x13e

fffffa60`125a1190 fffffa60`0f6041f6 rdbss!RxCommonCreate+0x20a

fffffa60`125a1250 fffffa60`0f62568a rdbss!RxFsdCommonDispatch+0x786

fffffa60`125a1340 fffffa60`1031f2cf rdbss!RxFsdDispatch+0x21a

fffffa60`125a13b0 fffffa60`015d85f9 mrxsmb!MRxSmbFsdDispatch+0xbf

fffffa60`125a13f0 fffffa60`015d8130 mup!MupiCallUncProvider+0x159

fffffa60`125a1460 fffffa60`015d90ef mup!MupStateMachine+0x120

fffffa60`125a14b0 fffffa60`00f4e342 mup!MupCreate+0x2c3

fffffa60`125a1530 fffff800`021267d9 fltmgr!FltpCreate+0x333

fffffa60`125a15e0 fffff800`0211ecb2 nt!IopParseDevice+0x5f9

fffffa60`125a1780 fffff800`0211f842 nt!ObpLookupObjectName+0x593

fffffa60`125a1890 fffff800`02124e47 nt!ObOpenObjectByName+0x2f2

fffffa60`125a1960 fffff800`0212ecc8 nt!IopCreateFile+0x287

fffffa60`125a1a00 fffff800`01eb8ef3 nt!NtCreateFile+0x78

fffffa60`125a1a90 fffff800`01eb9400 nt!KiSystemServiceCopyEnd+0x13

fffffa60`125a1c98 fffffa60`0f664a5e nt!KiServiceLinkage

<snip>

The above thread is waiting for the condition of the remote connection to become stable and that’s the reason it called the function RxWaitForStableCondition(), which accepts RX_CONTEXT as its first argument and the reason for why we entered into the wait as second argument. As we can see that the thread above was waiting for over 9 hours and 19 minutes.

For additional reading you can go through RX_CONTEXT and IRP Management.

The unassembled code for the function below is to find out where we store the pointer to the RX_CONTEXT structure. You may remember about the challenges of x64 debugging from this post.

1: kd> .frame /r 05

05 fffffa60`11e42dd0 fffffa60`0f634e9e rdbss!RxWaitForStableCondition+0x11c

rax=0000000000000000 rbx=fffffa804f4de010 rcx=0000000000000000

rdx=0000000000000000 rsi=0000000000000001 rdi=0000000000000000

rip=fffffa600f625c14 rsp=fffffa6011e42dd0 rbp=fffffa805113a490

r8=0000000000000000 r9=0000000000000000 r10=0000000000000000

r11=0000000000000000 r12=fffffa805113a4a0 r13=fffffa6011e42fd8

r14=fffffa804f4de010 r15=0000000000000000

iopl=0 nv up di pl nz na pe nc

cs=0000 ss=0000 ds=0000 es=0000 fs=0000 gs=0000 efl=00000000

rdbss!RxWaitForStableCondition+0x11c:

fffffa60`0f625c14 85c0 test eax,eax

1: kd> u rdbss!RxWaitForStableCondition

rdbss!RxWaitForStableCondition:

fffffa60`0f625af8 488bc4 mov rax,rsp

fffffa60`0f625afb 48895808 mov qword ptr [rax+8],rbx

fffffa60`0f625aff 48896810 mov qword ptr [rax+10h],rbp

fffffa60`0f625b03 48897018 mov qword ptr [rax+18h],rsi

fffffa60`0f625b07 48897820 mov qword ptr [rax+20h],rdi

fffffa60`0f625b0b 4154 push r12

fffffa60`0f625b0d 4883ec30 sub rsp,30h

fffffa60`0f625b11 33ff xor edi,edi

1: kd> u

rdbss!RxWaitForStableCondition+0x1b:

fffffa60`0f625b13 833a03 cmp dword ptr [rdx],3

fffffa60`0f625b16 418af1 mov sil,r9b

fffffa60`0f625b19 4d8be0 mov r12,r8

fffffa60`0f625b1c 488bea mov rbp,rdx

fffffa60`0f625b1f 488bd9 mov rbx,rcx <--------- RX_CONTEXT is in RBX register

fffffa60`0f625b22 7c07 jl rdbss!RxWaitForStableCondition+0x33 (fffffa60`0f625b2b)

fffffa60`0f625b24 33c0 xor eax,eax

fffffa60`0f625b26 e968010000 jmp rdbss!RxWaitForStableCondition+0x19b (fffffa60`0f625c93)

1: kd> !pool fffffa804f4de010 2 <---- [Argument 1] Here is pointer to RX_CONTEXT Structure

Pool page fffffa804f4de010 region is Nonpaged pool

*fffffa804f4de000 size: 670 previous size: 0 (Allocated) *RxIr

Pooltag RxIr : RDBSS RxContext, Binary : rdbss.sys

1: kd> !pool fffffa805113a490 2 <---- [Argument 2] Here is the Pointer to the Reason for Wait

Pool page fffffa805113a490 region is Nonpaged pool

*fffffa805113a3b0 size: 280 previous size: 10 (Allocated) *RxSc

Pooltag RxSc : RDBSS SrvCall, Binary : rdbss.sys

1: kd> dc fffffa805113a490 L1

fffffa80`5113a490 00000001 <---------------- Transport is in Transition state

We have bunch of RX_CONTEXT describing the IRPs pending on redirector for the transport of the remote connection to stabilize.

Next obvious thought that came to my mind was to investigate the list where we store the IRPs which have been issued by the redirector for the transport driver to process and understand the state of those IRPs and understand why we are not progressing further? RxCeSend is the API which submits the IRPs to the Transport driver which is maintained in the list below. For more information about Redirector related APIs you can refer Connection Engine Management.

1: kd> x rdbss!RxIrpsList

fffffa60`0f620110 rdbss!RxIrpsList = <no type information>

The rdbss!RxIrpsList is a pointer to the structure of type LIST_ENTRY.

1: kd> dq fffffa60`0f620110 L2

fffffa60`0f620110 fffffa80`510fd0d0 fffffa80`4edb5590

1: kd> dq fffffa80`510fd0d0 L2 <---------------- LIST_ENTRY.FLINK

fffffa80`510fd0d0 fffffa80`50d2e300 fffffa60`0f620110

1: kd> dq fffffa80`4edb5590 L2 <---------------- LIST_ENTRY.BLINK

fffffa80`4edb5590 fffffa60`0f620110 fffffa80`4ff450d0

This being a x64 windows each entry in this list is 0x30 bytes, this can change with OS version or driver updates, so don’t rely on this list or any of the offset that I referenced here in your driver code. Entry in the first column of this list is a pointer to the list entry. At offset 10 we store the pointer to the IRP. Add 0x10 bytes to get the IRP.

1: kd> dl 0xfffffa80`510fd0d0

fffffa80`510fd0d0 fffffa80`50d2e300 fffffa60`0f620110

fffffa80`510fd0e0 fffffa80`7274e5a0 00000000`00000000

fffffa80`50d2e300 fffffa80`50ee4490 fffffa80`510fd0d0

fffffa80`50d2e310 fffffa80`50f16160 00000000`00000000

fffffa80`50ee4490 fffffa80`727f1140 fffffa80`50d2e300

fffffa80`50ee44a0 fffffa80`51491760 00000000`00000000

fffffa80`727f1140 fffffa80`73a95860 fffffa80`50ee4490

fffffa80`727f1150 fffffa80`511b9c60 00000000`00000000

fffffa80`73a95860 fffffa80`73a8e3e0 fffffa80`727f1140

fffffa80`73a95870 fffffa80`73a77c60 00000000`00000000

fffffa80`73a8e3e0 fffffa80`73a9c290 fffffa80`73a95860

fffffa80`73a8e3f0 fffffa80`73a46a30 00000000`00000000

fffffa80`73a9c290 fffffa80`725ca490 fffffa80`73a8e3e0

<Snip>

1: kd> !irp poi(fffffa80`510fd0d0+10)

Irp is active with 2 stacks 2 is current (= 0xfffffa807274e6b8)

No Mdl: No System Buffer: Thread 00000000: Irp stack trace.

cmd flg cl Device File Completion-Context

[ 0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000

>[ f, 3] 0 e1 fffffa804b1fd8d0 fffffa8050f012f0 fffffa601030df70-fffffa805119fda0 Success Error Cancel pending

\Driver\Smb mrxsmb!RxTdiAsynchronousConnectCompletion

Args: 00000000 fffffa8050d127e0 00000000 fffffa601031b6f0

1: kd> !irp poi(fffffa80`50d2e300+10)

Irp is active with 2 stacks 2 is current (= 0xfffffa8050f16278)

No Mdl: System buffer=fffffa8050d59c88: Thread 00000000: Irp stack trace.

cmd flg cl Device File Completion-Context

[ 0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000

>[ e, 0] 0 e1 fffffa804b1fd8d0 fffffa8050f012f0 fffffa60103063a4-fffffa8050d59c60 Success Error Cancel pending

\Driver\Smb mrxsmb!RxTdiAsynchronousRequestCompletion

Args: 00000000 00000030 0021c0e8 00000000

1: kd> !irp poi(fffffa80`50ee4490+10)

Irp is active with 2 stacks 2 is current (= 0xfffffa8051491878)

No Mdl: System buffer=fffffa805115dcb8: Thread 00000000: Irp stack trace.

cmd flg cl Device File Completion-Context

[ 0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000

>[ e, 0] 0 e1 fffffa804b1fd8d0 fffffa8050f012f0 fffffa60103063a4-fffffa805115dc90 Success Error Cancel pending

\Driver\Smb mrxsmb!RxTdiAsynchronousRequestCompletion

Args: 00000000 00000030 0021c0e8 00000000

All these IRPs are waiting on SMB Transport Driver (responsible for handling TCP/IP port 445 related traffic) for further processing. The obvious feeling was to find out if there are threads which are making call to SMB function to see what they are they doing?

Note: In Windows 2008 R2 and Windows 7 onwards redirector doesn’t use SMB as the transport driver directly.

I searched through the dump but I couldn’t find a single thread which was calling any of the SMB function. This is the point where it gets more interesting. Why we don’t have any thread which is calling SMB function when we have bunch of IRPs pending on that driver?

This was the time for me to collaborate with the Network Stack Guru Aman Jain to get some insight to figure out the answer to my question.

It turned out that SMB driver will create a work item and it will use the System Delayed Worker Queue to queue the work item that it created and will populate the routine in the work Item structure which is supposed to process the work item once the worker thread picks up the work item.

1: kd> x nt!ExWorkerQueue

fffff800`01ff08a0 nt!ExWorkerQueue = <no type information>

Above pointer is the list head for the 3 queues that system has namely the “Critical WorkQueue” , “Delayed WorkQueue” and the “HyperCritical WorkQueue” as see in the output of !exqueue command in the debugger. Each entry in this list is 0x58 bytes long and is of type _KQUEUE.

Structure referenced below decides how a various work queues are enumerated. This structure is documented in this MSDN article.

typedef enum _WORK_QUEUE_TYPE {

CriticalWorkQueue = 0,

DelayedWorkQueue = 1,

HyperCriticalWorkQueue = 2,

MaximumWorkQueue = 3

} WORK_QUEUE_TYPE;

Critical WorkQueue

1: kd> dt nt!_KQUEUE fffff800`01ff08a0

+0x000 Header : _DISPATCHER_HEADER

+0x018 EntryListHead : _LIST_ENTRY [ 0xfffff800`01ff08b8 - 0xfffff800`01ff08b8 ]

+0x028 CurrentCount : 0

+0x02c MaximumCount : 0x10

+0x030 ThreadListHead : _LIST_ENTRY [ 0xfffffa80`30a4da08 - 0xfffffa80`30a39d68 ]

Delayed WorkQueue

1: kd> dt nt!_KQUEUE fffff800`01ff08a0+58

+0x000 Header : _DISPATCHER_HEADER

+0x018 EntryListHead : _LIST_ENTRY [ 0xfffffa80`50226f00 - 0xfffffa80`5183f9f0 ]

+0x028 CurrentCount : 0

+0x02c MaximumCount : 0x10

+0x030 ThreadListHead : _LIST_ENTRY [ 0xfffffa80`30a398d8 - 0xfffffa80`51157d68 ]

HyperCritical WorkQueue

1: kd> dt nt!_KQUEUE fffff800`01ff08a0+58+58

+0x000 Header : _DISPATCHER_HEADER

+0x018 EntryListHead : _LIST_ENTRY [ 0xfffff800`01ff0968 - 0xfffff800`01ff0968 ]

+0x028 CurrentCount : 0

+0x02c MaximumCount : 0x10

+0x030 ThreadListHead : _LIST_ENTRY [ 0xfffffa80`30a4e1f8 - 0xfffffa80`30a4e1f8 ]

1: kd> dd fffff80001ff08a0+58+40 L1

fffff800`01ff0938 00000010 <----------- We created 0x10 (16) dynamic threads

We create additional dynamic threads as in the event when the default number of threads in a particular work queue is unable to handle the Work Items being queued. In this case we created 16 additional threads, which is the maximum number of dynamic thread which can be created.

1: kd> dd fffff80001ff08a0+58+4c L1

fffff800`01ff0944 00004667

1: kd> ? 00004667

Evaluate expression: 18023 = 00000000`00004667 <-------- 18,023 work Items pending for processing

I could see that there are work items pending in the Delayed WorkQueue because of the following relation which is not present in any other WorkQueue that we dumped.

nt!_KQUEUE->EntryListHead.Flink != nt!_QUEUE->EntryListHead.Blink

For details about Work Item Processing and Work Items you can refer following docs:

System Worker Threads

IO_WORKITEM

IoSizeofWorkItem

The next obvious steps involved dumping all work items pending in the list to see if I have work items related to SMB driver? And if the Work Items are present for SMB then why they haven’t been picked up by the worker thread?

1: kd> !list "-t nt!_LIST_ENTRY.Flink -e -x\"dq poi(@$extret) L1; dps poi(@$extret)+0x20 L1\" fffffa80`50226f00"

<Snip>