Freeze when programmatically launching a batch fil

Posted 2019-01-28 10:50

Question:

EDIT: Found some duplicates with no answers:

  1. Issue with output redirection in batch
  2. How do I use the "start" command without inheriting handles in the child process?

I have some C# code that tries to be a generic process launcher, within a larger long-running program. This code needs to capture the output from the processes it launches, and also wait for them to finish. We typically launch batch files with this code, and everything works fine, except when we want to start another child process from inside the batch file, such that it will outlive the batch file process. As an example, let's say I simply want to execute "start notepad.exe" from inside the batch file.

I encountered the same problems as in this question: Process.WaitForExit doesn't return even though Process.HasExited is true

Basically, even though the batch file process appears to exit very quickly (as expected), my program freezes until the child process (e.g. notepad) also exits. In my scenario, however, notepad is not meant to exit.

I tried injecting "cmd.exe /C" at all points in the chain, with no luck. I tried explicitly terminating the batch file with "exit" or "exit /B". I tried reading the output both synchronously and asynchronously - with or without worker threads. I tried the patterns here: https://github.com/alabax/CsharpRedirectStandardOutput/tree/master/RedirectStandardOutputLibrary (see FixedSimplePattern.cs and AdvancedPattern.cs), again with no luck.

EDIT: I also tried with some C# P/Invoke code that does the process launching via the Windows API (CreatePipe/CreateProcess, etc), so I don't think this problem is specific to the C# Process API.

The only workaround I found was to replace the start command with a tool that calls CreateProcess with the DETACHED_PROCESS flag (CREATE_NO_WINDOW also works).
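For reference, here is a minimal sketch of what such a replacement tool might look like in C#. It is not the actual tool I used: the name DetachedLauncher, the BuildCommandLine helper, the simplified quoting, and the choice to pass bInheritHandles = false are all assumptions made for illustration. It only runs on Windows.

```csharp
using System;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical stand-in for the "start" replacement tool (Windows only).
static class DetachedLauncher
{
    // Process creation flags from the Windows API.
    public const uint DETACHED_PROCESS = 0x00000008;
    public const uint CREATE_NO_WINDOW = 0x08000000; // reported to work as well

    [StructLayout(LayoutKind.Sequential)]
    struct STARTUPINFO
    {
        public int cb;
        public IntPtr lpReserved, lpDesktop, lpTitle;
        public int dwX, dwY, dwXSize, dwYSize;
        public int dwXCountChars, dwYCountChars;
        public int dwFillAttribute, dwFlags;
        public short wShowWindow, cbReserved2;
        public IntPtr lpReserved2;
        public IntPtr hStdInput, hStdOutput, hStdError;
    }

    [StructLayout(LayoutKind.Sequential)]
    struct PROCESS_INFORMATION
    {
        public IntPtr hProcess, hThread;
        public int dwProcessId, dwThreadId;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    static extern bool CreateProcess(
        string lpApplicationName, StringBuilder lpCommandLine,
        IntPtr lpProcessAttributes, IntPtr lpThreadAttributes,
        bool bInheritHandles, uint dwCreationFlags,
        IntPtr lpEnvironment, string lpCurrentDirectory,
        ref STARTUPINFO lpStartupInfo, out PROCESS_INFORMATION lpProcessInformation);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool CloseHandle(IntPtr hObject);

    // Simplified quoting (assumption): only arguments containing spaces are quoted.
    public static string BuildCommandLine(string[] args)
    {
        return string.Join(" ", args.Select(a => a.Contains(" ") ? "\"" + a + "\"" : a));
    }

    static int Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.Error.WriteLine("usage: DetachedLauncher <program> [arguments...]");
            return 1;
        }

        var si = new STARTUPINFO();
        si.cb = Marshal.SizeOf(typeof(STARTUPINFO));
        // CreateProcessW may modify the command-line buffer, so pass a mutable one.
        var commandLine = new StringBuilder(BuildCommandLine(args));

        PROCESS_INFORMATION pi;
        // Assumption: no handle inheritance, and no console attached to the child.
        bool ok = CreateProcess(null, commandLine, IntPtr.Zero, IntPtr.Zero,
            false, DETACHED_PROCESS, IntPtr.Zero, null, ref si, out pi);
        if (!ok)
        {
            Console.Error.WriteLine("CreateProcess failed, error " + Marshal.GetLastWin32Error());
            return 1;
        }

        // The launcher does not wait for the child; it only releases its handles.
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        return 0;
    }
}
```

Inside the batch file, a line such as "DetachedLauncher.exe notepad.exe" would then take the place of "start notepad.exe".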

The accepted answer in the aforementioned SO question (https://stackoverflow.com/a/26722542/5932003) is the closest thing on the entire Internet to something that appears to work, but it turns out that it leaks threads with every batch file you launch. I would have left a comment there, but I don't have the reputation to do that yet :).

Modified code that demonstrates thread leakage:

using System;
using System.Diagnostics;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

namespace TestSO26713374WaitForExit
{
    class Program
    {
        static void Main(string[] args)
        {
            while (true)
            {
                // Batch file that starts a long-lived child process and then exits.
                string foobat =
@"@echo off
START ping -t localhost
REM START ping -t google.com
REM ECHO Batch file is done!
EXIT /B 123
";

                File.WriteAllText("foo.bat", foobat);

                Process p = new Process
                {
                    StartInfo =
                    new ProcessStartInfo("foo.bat")
                    {
                        UseShellExecute = false,
                        RedirectStandardOutput = true,
                        RedirectStandardError = true
                    }
                };

                p.Start();

                // Start draining both redirected streams; the tasks are not awaited.
                var _ = ConsumeReader(p.StandardOutput);
                _ = ConsumeReader(p.StandardError);

                //Console.WriteLine("Calling WaitForExit()...");
                p.WaitForExit();
                //Console.WriteLine("Process has exited. Exit code: {0}", p.ExitCode);
                //Console.WriteLine("WaitForExit returned.");

                // Report how many threads are busy after each launch.
                ThreadPool.GetMaxThreads(out int max, out int max2);
                ThreadPool.GetAvailableThreads(out int available, out int available2);

                Console.WriteLine(
                    $"Active thread count: {max - available} (thread pool), {Process.GetCurrentProcess().Threads.Count} (all).");

                Thread.Sleep(8000);
            }
        }

        // Echoes every line from the reader until the stream reports end-of-file.
        static async Task ConsumeReader(TextReader reader)
        {
            string text;

            while ((text = await reader.ReadLineAsync()) != null)
            {
                Console.WriteLine(text);
            }
        }
    }
}

The output from the above:

Active thread count: 2 (thread pool), 15 (all).
Active thread count: 4 (thread pool), 18 (all).
Active thread count: 6 (thread pool), 19 (all).
Active thread count: 8 (thread pool), 20 (all).
Active thread count: 9 (thread pool), 21 (all).
Active thread count: 11 (thread pool), 23 (all).
Active thread count: 13 (thread pool), 25 (all).
Active thread count: 15 (thread pool), 27 (all).
Active thread count: 17 (thread pool), 29 (all).
Active thread count: 19 (thread pool), 31 (all).
Active thread count: 21 (thread pool), 33 (all).
...

My questions:

  1. Why doesn't the start command completely break the chain of output redirection?
  2. Am I stuck with the aforementioned tool that calls CreateProcess(...DETACHED_PROCESS...)?

Thanks!

Answer 1:

Here's a drop-in replacement for the start command:

start /b powershell.exe Start-Process -FilePath "notepad.exe"

"start /b" is only there to start PowerShell without showing its window (which would otherwise flash annoyingly for a second). PowerShell takes over after that and launches our process, without any of the side effects.

If PowerShell isn't an option, it turns out that this very simple C# program can serve the same purpose:

using System.Diagnostics;

namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {
            // You may want to expand this to set CreateNoWindow = true
            Process.Start(args[0]);
        }
    }
}

No need for CreateProcess(...DETACHED_PROCESS...).



Answer 2:

I don't have an answer for the first question. That's more of a fundamental design or implementation detail of Windows, and I can't speak to why they decided to do it that way.

As for the second question…

Unfortunately, this appears to be a limitation of the Process class, due to the way it was implemented.

Reading asynchronously from the stdout and stderr streams resolves the WaitForExit() issue, but doesn't do anything to address the underlying way Windows ties the parent process to the child process. The parent's output streams (which are redirected) won't be closed until the child exits, and so there are still outstanding reads in the C# program on the parent's output streams, waiting for those streams to be closed.

In the Process class, for redirected output, it wraps the I/O stream handle in a FileStream object, and when it creates this object, it does not create it with the async flag set to true, which would be required for the object to use IOCP for asynchronous operations. It doesn't even create the underlying native pipe object with IOCP support.

So, when the code issues an asynchronous read on the stream, this is implemented in .NET by "faking" asynchrony. That is, it just queues a synchronous read on the regular thread pool.

So you get a new active thread pool thread for every outstanding read, two per process. And the last reads from the output streams won't return until the child process exits, tying up those thread pool threads.
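To make the mechanism concrete, here is a rough illustration of what "faked" asynchrony amounts to. This is not the actual .NET implementation: FakeReadAsync and FakeAsyncDemo are hypothetical names, and the helper only mimics the effect of queuing a blocking read on the thread pool.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

static class FakeAsyncDemo
{
    // Hypothetical helper: it looks asynchronous to the caller, but a thread
    // pool thread sits inside the blocking Read() call until data or
    // end-of-stream arrives.
    public static Task<int> FakeReadAsync(Stream stream, byte[] buffer)
    {
        return Task.Run(() => stream.Read(buffer, 0, buffer.Length));
    }

    static void Main()
    {
        // A MemoryStream returns immediately; a pipe whose write end is still
        // open in some other process would keep the pool thread blocked instead.
        var stream = new MemoryStream(Encoding.ASCII.GetBytes("hello"));
        var buffer = new byte[16];
        int bytesRead = FakeReadAsync(stream, buffer).GetAwaiter().GetResult();
        Console.WriteLine(bytesRead); // prints 5
    }
}
```

With a redirected pipe in place of the MemoryStream, the final read would only complete once every process holding the write end of the pipe has exited, which is why the thread count keeps climbing in the question's output.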

I don't see any great way to avoid this.

For a process that you know will write very little to the stdout and stderr streams (or nothing at all; in the example at hand, nothing is written to stderr), you could simply not read the streams. But the buffer for each of those streams is not very large, so failing to read them will generally cause the process to block eventually.

In some special cases, you could arrange things so that you don't read from the streams when you know you're at the end. For example, in the code above, you could uncomment the "Batch file is done!" and terminate the ConsumeReader() loop when you see that text. But that's not a solution that will work in many cases. It would rely on knowing exactly what is going to be written, at least for the very last line, of both the stdout and stderr streams.
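As a minimal sketch of that special case, the reader loop could stop as soon as it sees a known final line. ConsumeUntilSentinel and SentinelReaderDemo are hypothetical names, and the sketch assumes the batch file really does print "Batch file is done!" as its last line on stdout. Nothing comparable exists for stderr in the example, so that stream would need its own arrangement.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class SentinelReaderDemo
{
    // Reads lines until end-of-stream or until the sentinel line is seen,
    // whichever comes first. Returns the number of lines consumed.
    public static async Task<int> ConsumeUntilSentinel(TextReader reader, string sentinel)
    {
        int count = 0;
        string text;

        while ((text = await reader.ReadLineAsync()) != null)
        {
            Console.WriteLine(text);
            count++;

            // Stop here so that no further read is left outstanding.
            if (text.Trim() == sentinel)
            {
                break;
            }
        }

        return count;
    }

    static void Main()
    {
        // A StringReader stands in for p.StandardOutput in this sketch.
        var reader = new StringReader("line 1\nBatch file is done!\nnever read\n");
        int lines = ConsumeUntilSentinel(reader, "Batch file is done!").GetAwaiter().GetResult();
        Console.WriteLine(lines); // prints 2
    }
}
```

Because the loop exits before issuing another read, no thread pool thread is left waiting for the stream to close.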

For what it's worth, at least I'd say that you're not literally leaking threads. Each thread is in fact doing something, and each thread will in fact finally be freed up, once the associated process does exit. And frankly, on Windows, a thread costs a lot less than a process, so if you have enough child processes running concurrently that you're worried about the number of threads in your one process, you've probably got bigger fish to fry (i.e. all those child processes eating up resources).

After investigating what I could, I think that your current solution, a tool used in lieu of the start command to create your process as a detached child, is probably the most elegant. The only reliable alternative would be for you to effectively re-implement the Process class itself (or at least, the parts you are using here), except with IOCP support so that reads on the standard streams are in fact implemented asynchronously.

I suppose you could try to report the issue to Microsoft, but I'd guess at this point, they've no interest in making any changes to the Process class.