How to find out deadlocking awaited Tasks and thei

2020-05-28 08:31发布

问题:

Here is one simplified example I found that really hard to debug deadlock in awaited tasks in some cases:

class Program
{
    static void Main(string[] args)
    {
        var task = Hang();

        task.Wait();
    }

    static async Task Hang()
    {
        var tcs = new TaskCompletionSource<object>();

        // do some more stuff. e.g. another await Task.FromResult(0);

        await tcs.Task;

        tcs.SetResult(0);
    }
}

This example is easy to understand why it deadlocks, it is awaiting a task which is finished later on. It looks stupid but similar scenario could happen in more complicated production code and deadlocks could be mistakenly introduced due to lack of multithreading experience.

Interesting thing for this example is inside Hang method there is no thread blocking code like Task.Wait() or Task.Result. Then when I attach VS debugger, it just shows the main thread is waiting for the task to finish. However, there is no thread showing where the code has stopped inside Hang method using Parallel Stacks view.

Here are the call stacks on each thread (3 in all) I have in the Parallel Stacks:

Thead 1:

[Managed to Native Transition]
Microsoft.VisualStudio.HostingProcess.HostProc.WaitForThreadExit
Microsoft.VisualStudio.HostingProcess.HostProc.RunParkingWindowThread
System.Threading.ThreadHelper.ThreadStart_Context
System.Threading.ExecutionContext.RunInternal
System.Threading.ExecutionContext.Run
System.Threading.ExecutionContext.Run
System.Threading.ThreadHelper.ThreadStart

Thread 2:

[Managed to Native Transition]
Microsoft.Win32.SystemEvents.WindowThreadProc
System.Threading.ThreadHelper.ThreadStart_Context
System.Threading.ExecutionContext.RunInternal
System.Threading.ExecutionContext.Run
System.Threading.ExecutionContext.Run
System.Threading.ThreadHelper.ThreadStart

Main Thread:

System.Threading.Monitor.Wait
System.Threading.Monitor.Wait
System.Threading.ManualResetEventSlim.Wait
System.Threading.Tasks.Task.SpinThenBlockingWait
System.Threading.Tasks.Task.InternalWait
System.Threading.Tasks.Task.Wait
System.Threading.Tasks.Task.Wait
ConsoleApplication.Program.Main Line 12 //this is our Main function
[Native to Managed Transition]
[Managed to Native Transition]
System.AppDomain.ExecuteAssembly
Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly
System.Threading.ThreadHelper.ThreadStart_Context
System.Threading.ExecutionContext.RunInternal
System.Threading.ExecutionContext.Run
System.Threading.ExecutionContext.Run
System.Threading.ThreadHelper.ThreadStart

Is there anyway to find out where the task has stopped inside Hang method? And the call stack if possible? I believe internally there must be some states about each task and their continuation points so that the scheduler could work. But I don't know how to check that out.

回答1:

Inside of Visual Studio, I am not aware of a way to debug this sort of situation simply. However, there are two other ways to visualize this for full framework applications, plus a bonus preview of a way to do this in .NET Core 3.

tldr version: Yep, its hard, and Yep, the information you want is there, just difficult to find. Once you find the heap objects as per the below methods, you can use the address of them in the VS watch window to use the visualizers to take a deeper dive.

WinDbg

WinDbg has a primitive but helpful extension that provides a !dumpasync command.

If you download the extension from the vs-threading release branch and copy the x64 and x86 AsyncDebugTools.dll to C:\Program Files (x86)\Windows Kits\10\Debuggers\[x86|x64]\winext folders, you can do the following:

.load AsyncDebugTools
!dumpasync

The output (taken from the link above) looks like:

07494c7c <0> Microsoft.Cascade.Rpc.RpcSession+<SendRequestAsync>d__49
.07491d10 <1> Microsoft.Cascade.Agent.WorkspaceService+<JoinRemoteWorkspaceAsync>d__28
..073c8be4 <5> Microsoft.Cascade.Agent.WorkspaceService+<JoinWorkspaceAsync>d__22
...073b7e94 <0> Microsoft.Cascade.Rpc.RpcDispatcher`1+<>c__DisplayClass23_2+<<BuildMethodMap>b__2>d[[Microsoft.Cascade.Contracts.IWorkspaceService, Microsoft.Cascade.Common]]
....073b60e0 <0> Microsoft.Cascade.Rpc.RpcServiceUtil+<RequestAsync>d__3
.....073b366c <0> Microsoft.Cascade.Rpc.RpcSession+<ReceiveRequestAsync>d__42
......073b815c <0> Microsoft.Cascade.Rpc.RpcSession+<>c__DisplayClass40_1+<<Receive>b__0>d

On your sample above the output is less interesting:

033a23c8 <0> StackOverflow41476418.Program+<Hang>d__1

The description of the output is:

The output above is a set of stacks – not exactly callstacks, but actually "continuation stacks". A continuation stack is synthesized based on what code has 'awaited' the call to an async method. It's possible that the Task returned by an async method was awaited from multiple places (e.g. the Task was stored in a field, then awaited by multiple interested parties). When there are multiple awaiters, the stack can branch and show multiple descendents of a given frame. The stacks above are therefore actually "trees", and the leading dots at each frame helps recognize when trees have multiple branches.

If an async method is invoked but not awaited on, the caller won't appear in the continuation stack.

Once you see the nested hierarchy for more complex situations, you can at least deep-dive into the state objects and find their continuations and roots.

LinqPad and ClrMd

Another useful too is LinqPad coupled with ClrMd and ClrMD.Extensions. The latter package is used to bridge ClrMd into LINQPad - there is a getting started guide. Once you have the packages/namespaces set, this query is what you want:

var session = ClrMD.Extensions.ClrMDSession.LoadCrashDump(@"dmpfile.dmp");
var stateMachineTypes = (
    from type in session.Heap.EnumerateTypes()
    where type.Interfaces.Any(item => item.Name == "System.Runtime.CompilerServices.IAsyncStateMachine")
    select type);
session.Heap.EnumerateDynamicObjects(stateMachineTypes).Dump(2);

Below is a sample of the output running on your sample code:

DotNet Core 3

For .NET Core 3.x they added !dumpasync into the WinDbg sos extension. It is MUCH better than the extension described above, as it gives much more context. You can see it is part of a much larger user story to improve debugging of async code. Here is the output from that under .NET Core 3.0 preview 6 with a preview 7 version of SOS with extended options. Note that line numbers are present, something you don't get with the above options.:

0:000> !dumpasync -stacks -roots
Statistics:
              MT    Count    TotalSize Class Name
00007ffb564e9be0        1           96 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]
Total 1 objects
In 1 chains.
         Address               MT     Size      State Description
00000209915d21a8 00007ffb564e9be0       96          0 StackOverflow41476418_Core.Program+<Hang>d__1
Async "stack":
.00000209915d2738 System.Threading.Tasks.Task+SetOnInvokeMres
GC roots:
    Thread bc20:
        000000e08057e8c0 00007ffbb580a292 System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2939]
            rbp+10: 000000e08057e930
                ->  00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]

        000000e08057e930 00007ffbb580a093 System.Threading.Tasks.Task.InternalWaitCore(Int32, System.Threading.CancellationToken) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2878]
            rsi: 
                ->  00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]

        000000e08057e9b0 00007ffbb5809f0a System.Threading.Tasks.Task.Wait(Int32, System.Threading.CancellationToken) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2789]
            rsi: 
                ->  00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]

Windows symbol path parsing FAILED
        000000e08057ea10 00007ffb56421f17 StackOverflow41476418_Core.Program.Main(System.String[]) [C:\StackOverflow41476418_Core\Program.cs @ 12]
            rbp+28: 000000e08057ea38
                ->  00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]

        000000e08057ea10 00007ffb56421f17 StackOverflow41476418_Core.Program.Main(System.String[]) [C:\StackOverflow41476418_Core\Program.cs @ 12]
            rbp+30: 000000e08057ea40
                ->  00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]