Why do some native threads have a stack trace with

Posted 2019-06-09 21:05

Question:

I have a C# .NET 4.5 application heavily using the Task Parallel Library that eventually ends up starved for threads after days of operation.

When I grab a HANG dump from AdPlus and look at the threads via Visual Studio, I see 43 threads with no apparent origin in my code:

ntdll.dll!_NtWaitForSingleObject@12()  + 0x15 bytes 
ntdll.dll!_NtWaitForSingleObject@12()  + 0x15 bytes 
kernel32.dll!@BaseThreadInitThunk@12()  + 0x12 bytes    
ntdll.dll!___RtlUserThreadStart@8()  + 0x27 bytes   
ntdll.dll!__RtlUserThreadStart@8()  + 0x1b bytes    

Why do these threads show no managed origin in their stack trace?

Answer 1:

All threads in a given process, even TPL threads, have this startup procedure. When you start a thread, the CLR eventually calls the OS to create it. What you're looking at are the functions that the thread executes at startup. If you suspend any managed process, you'll see unmanaged calls at the bottom of every stack. The reason you don't see a managed start procedure is that each thread gets its own stack, created by the OS when it creates the thread.
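To see from inside the process that threads exist at the OS level regardless of what your own code started, here is a minimal sketch using System.Diagnostics.Process. The class name NativeThreadCount and the helper CountOsThreads are made up for this illustration and are not part of the original post. It only counts OS threads before and after starting a few managed ones; it does not inspect their stacks.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class NativeThreadCount
{
    // Hypothetical helper: returns the number of OS threads currently in this process.
    // A fresh Process object is created on each call so the thread list is not stale.
    public static int CountOsThreads()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            return self.Threads.Count;
        }
    }

    static void Main()
    {
        int before = CountOsThreads();

        // Start a few managed threads that simply block until released.
        var stop = new ManualResetEventSlim(false);
        for (int i = 0; i < 3; i++)
        {
            var t = new Thread(() => stop.Wait()) { IsBackground = true };
            t.Start();
        }

        int after = CountOsThreads();
        Console.WriteLine("OS threads before: {0}, after: {1}", before, after);

        // Release the blocked threads so the process can exit cleanly.
        stop.Set();
    }
}
```

The "before" count is usually greater than one even though the program has not started any threads yet, because the runtime creates threads of its own (the finalizer thread, for instance). Those are the kind of threads that can show up in a dump with nothing but native frames.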

For example, running the following:

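// Start ten threads that each sleep, so there is something to inspect in the debugger.
for (int i = 0; i < 10; i++)
{
    Thread t = new Thread(new ThreadStart(() => Thread.Sleep(100000)));
    t.Start();
}
// Keep the process alive until a key is pressed.
Console.ReadKey();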

then breaking into the process using WinDbg and looking at one of the sleeping threads gives a call stack like this (all of the threads have the same two functions at the bottom; I'm just dumping one for this exercise):

0:012> !dumpstack
OS Thread Id: 0x3694 (12)
Current frame: ntdll!ZwDelayExecution+0xa
Child-SP         RetAddr          Caller, Callee
000000001dc8ea70 000007fefd1c1203 KERNELBASE!SleepEx+0xab, calling ntdll!NtDelayExecution
000000001dc8eae0 000007fefd1c38fb KERNELBASE!SleepEx+0x12d, calling ntdll!RtlActivateActivationContextUnsafeFast
000000001dc8eb10 000007fed860a888 clr!CExecutionEngine::ClrSleepEx+0x29, calling KERNEL32!SleepExStub
000000001dc8eb40 000007fed874d483 clr!Thread::UserSleep+0x7c, calling clr!ClrSleepEx
000000001dc8eba0 000007fed874d597 clr!ThreadNative::Sleep+0xb7, calling clr!Thread::UserSleep
[... removed some frames for clarity ...]
000000001dc8f6f0 000007fed874fcb6 clr!Thread::intermediateThreadProc+0x7d
000000001dc8faf0 000007fed874fc9f clr!Thread::intermediateThreadProc+0x66, calling clr!alloca_probe
000000001dc8fb30 0000000077195a4d KERNEL32!BaseThreadInitThunk+0xd
000000001dc8fb60 00000000773cb831 ntdll!RtlUserThreadStart+0x1d
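The dump above shows the native frames that sit underneath the managed ones. As a complement, here is a rough sketch of what the same kind of thread looks like from the managed side, using System.Diagnostics.StackTrace, which reports managed frames only. The class ManagedFramesOnly and the helper CaptureManagedFrames are invented for this illustration; the exact frames printed depend on the runtime version, so no output is shown.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading;

static class ManagedFramesOnly
{
    // Hypothetical helper: captures the managed frames of whichever thread calls it.
    // NoInlining keeps this method visible as its own frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string[] CaptureManagedFrames()
    {
        StackFrame[] frames = new StackTrace().GetFrames() ?? new StackFrame[0];
        string[] names = new string[frames.Length];
        for (int i = 0; i < frames.Length; i++)
        {
            MethodBase method = frames[i].GetMethod();
            names[i] = method == null
                ? "<unknown>"
                : method.DeclaringType + "." + method.Name;
        }
        return names;
    }

    static void Main()
    {
        string[] captured = null;

        // Capture the stack on a freshly started thread, as in the example above.
        var t = new Thread(() => captured = CaptureManagedFrames());
        t.Start();
        t.Join();

        foreach (string name in captured)
        {
            Console.WriteLine(name);
        }
    }
}
```

None of the names printed this way will be BaseThreadInitThunk or RtlUserThreadStart: those frames exist only in the native view that a debugger such as WinDbg gives you.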

For reference, this is the Thread object wrapping the thread that we dumped the stack of:

0:012> !do 2a23e08
Name:        System.Threading.Thread
MethodTable: 000007fed76522f8
EEClass:     000007fed7038200
Size:        96(0x60) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
000007fed763eca8  4000765        8 ....Contexts.Context  0 instance 0000000000000000 m_Context
000007fed765a958  4000766       10 ....ExecutionContext  0 instance 0000000000000000 m_ExecutionContext
000007fed7650e08  4000767       18        System.String  0 instance 0000000000000000 m_Name
000007fed76534a8  4000768       20      System.Delegate  0 instance 0000000000000000 m_Delegate
000007fed7655390  4000769       28 ...ation.CultureInfo  0 instance 0000000000000000 m_CurrentCulture
000007fed7655390  400076a       30 ...ation.CultureInfo  0 instance 0000000000000000 m_CurrentUICulture
000007fed76513e8  400076b       38        System.Object  0 instance 0000000000000000 m_ThreadStartArg
000007fed7654a00  400076c       40        System.IntPtr  1 instance          24a5ed0 DONT_USE_InternalThread
000007fed7653980  400076d       48         System.Int32  1 instance                2 m_Priority
000007fed7653980  400076e       4c         System.Int32  1 instance               12 m_ManagedThreadId
000007fed7658c48  400076f       50       System.Boolean  1 instance                1 m_ExecutionContextBelongsToOuterScope
000007fed7672e70  4000770      378 ...LocalDataStoreMgr  0   shared           static s_LocalDataStoreMgr
                                 >> Domain:Value  00000000005f40b0:NotInit  <<
000007fed7672df0  4000771        8 ...alDataStoreHolder  0   shared         TLstatic s_LocalDataStore
                                  >> Thread:Value <<

The System.IntPtr brillantly named DONT_USE_InternalThread holds the pointer to the OS thread. (My guess is that it's probably the handle from CreateThread, but I didn't investigate it too much.)


(Note to editors: brillant is spelled that way intentionally. Please don't 'fix' it)