Do I need to worry about blocking tasks?

Posted 2019-04-12 02:30

Question:

How much do I need to worry about blocking tasks in .NET? i.e. how does the .NET task scheduler handle blocking of threads in the thread pool and oversubscription?

E.g. if I have some I/O in a task, should I always create it with the LongRunning hint? Or do the task scheduler's heuristics handle it better? In C++ there is an Oversubscribe hint which works perfectly, but I have not found any equivalent in .NET.

Answer 1:

The ThreadPool does detect when its threads are blocked and treats that as a hint to add another thread to the pool. So even if you block a lot, performance most likely won't be terrible, because the ThreadPool will try to keep your CPU cores busy.

But having many blocked threads can be a performance problem, because it increases memory consumption and can lead to more context switches.

Also, this behavior may lead to decreased performance of IO. With spinning disks (HDDs), accessing many files at the same time causes lots of seeking, which can affect the performance drastically.
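
The cost of blocking pool threads, compared with waiting asynchronously, can be measured with a rough sketch like the one below. The names BlockingDemo, RunBlocking and RunAsync are made up for illustration, Thread.Sleep and Task.Delay stand in for real I/O, and the timings depend on the machine and runtime, so treat the numbers as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingDemo
{
    // Starts `count` work items that each block a pool thread for `ms` milliseconds.
    public static TimeSpan RunBlocking(int count, int ms)
    {
        var sw = Stopwatch.StartNew();
        Task[] tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => Thread.Sleep(ms)))
            .ToArray();
        Task.WaitAll(tasks);
        return sw.Elapsed;
    }

    // Starts `count` timer-based waits that do not hold a thread while pending.
    public static TimeSpan RunAsync(int count, int ms)
    {
        var sw = Stopwatch.StartNew();
        Task[] tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Delay(ms))
            .ToArray();
        Task.WaitAll(tasks);
        return sw.Elapsed;
    }

    public static void Main()
    {
        int workers, io;
        ThreadPool.GetMinThreads(out workers, out io);
        Console.WriteLine("Minimum worker threads: " + workers);

        // Ask for more concurrent waits than the pool's minimum thread count,
        // so the blocking variant depends on the pool adding threads.
        Console.WriteLine("Blocking: " + RunBlocking(workers * 4, 500));
        Console.WriteLine("Async:    " + RunAsync(workers * 4, 500));
    }
}
```

If the blocking run takes noticeably longer than the async one, the difference is the time the pool spent waiting before it added threads.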



Answer 2:

You do need to worry about it if you want the most performant code.

The best way to handle it is to use the .NET 4.5 "await" style of I/O.

If you don't have .NET 4.5, you will have to use the older style of asynchronous I/O (which works just as well, but is harder to use).

The non-blocking I/O described in those articles is by far the best way to do your I/O with multiple threads.

If you are not using I/O then you might still learn much from those articles.
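
As a minimal sketch of the "await" style mentioned in this answer, the following reads a file without blocking the caller while the read is pending. The class and method names (AsyncFileDemo, ReadAllTextAsync) are illustrative, and UTF-8 is an assumption about the file's encoding.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncFileDemo
{
    public static async Task<string> ReadAllTextAsync(string path)
    {
        // useAsync: true requests asynchronous (overlapped) file I/O where the OS supports it.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            // The calling thread is released at the await and resumes when the data arrives.
            return await reader.ReadToEndAsync();
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "hello");
            // Blocking on .Result is tolerable only because this is a console entry point.
            Console.WriteLine(ReadAllTextAsync(path).Result);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```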



Answer 3:

LongRunning signals to the TPL not to use a thread pool thread; it creates a dedicated non-thread-pool thread to fulfill the request (e.g. new Thread(...)). This is not what you should be doing for I/O. You should be using asynchronous I/O. For example:

using(var response = (HttpWebResponse)await WebRequest.Create(url).GetResponseAsync())
    return response.StatusCode == HttpStatusCode.OK;

This ensures that, wherever possible, overlapped I/O is used, which uses the I/O thread pool.
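
The LongRunning behavior described at the start of this answer can be checked with a small sketch. The helper name RunsOnPoolThread is invented here, and the result reflects how the default scheduler behaves in current runtimes rather than a documented guarantee.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Reports whether a task created with the given options ran on a thread pool thread.
    public static bool RunsOnPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
        return task.Result;
    }

    public static void Main()
    {
        Console.WriteLine("None:        " + RunsOnPoolThread(TaskCreationOptions.None));
        Console.WriteLine("LongRunning: " + RunsOnPoolThread(TaskCreationOptions.LongRunning));
    }
}
```

A dedicated thread per blocked operation still costs a stack and a context switch, which is why this answer recommends asynchronous I/O instead of the hint.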

If you want to use a Task with a legacy APM API, you can use FromAsync:

Task<int> bytesRead = Task<int>.Factory.FromAsync( 
    stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null);
await bytesRead;
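
For a version of the FromAsync call that can be run on its own, here is a sketch that wraps it in a helper and reads from a MemoryStream. The FromAsyncDemo class and ReadAsync helper are illustrative names, and the MemoryStream is only a stand-in for a real file or network stream.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FromAsyncDemo
{
    // Wraps the APM pair BeginRead/EndRead in a Task<int> that yields the byte count.
    public static Task<int> ReadAsync(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead,
            buffer, 0, buffer.Length, null);
    }

    public static void Main()
    {
        var stream = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 });
        var buffer = new byte[16];

        // Blocking on .Result is tolerable only because this is a console entry point.
        int bytesRead = ReadAsync(stream, buffer).Result;
        Console.WriteLine(bytesRead); // prints 5
    }
}
```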

If you need to deal with a legacy event async API, you can use TaskCompletionSource:

TaskCompletionSource<string[]> tcs = new TaskCompletionSource<string[]>();
WebClient[] webClients = new WebClient[urls.Length];
object m_lock = new object();
int count = 0;
List<string> results = new List<string>();

for (int i = 0; i < urls.Length; i++)
{
    webClients[i] = new WebClient();

    // Specify the callback for the DownloadStringCompleted 
    // event that will be raised by this WebClient instance.
    webClients[i].DownloadStringCompleted += (obj, args) =>
    {
        // Argument validation and exception handling omitted for brevity. 

        // Split the string into an array of words, 
        // then count the number of elements that match 
        // the search term. 
        string[] words = args.Result.Split(' ');
        string NAME = name.ToUpper();
        int nameCount = (from word in words.AsParallel()
                         where word.ToUpper().Contains(NAME)
                         select word)
                        .Count();

        // Associate the results with the url. The completion callbacks may run
        // concurrently on different threads, so the shared list and the counter
        // are both updated under the lock.
        lock (m_lock)
        {
            results.Add(String.Format("{0} has {1} instances of {2}", args.UserState, nameCount, name));

            // If this is the last async operation to complete, 
            // then set the Result property on the underlying Task. 
            count++;
            if (count == urls.Length)
            {
                tcs.TrySetResult(results.ToArray());
            }
        }
    };

    // Call DownloadStringAsync for each URL, passing the address as the
    // user state so the callback can read it from args.UserState.
    Uri address = new Uri(urls[i]);
    webClients[i].DownloadStringAsync(address, address);

} // end for 

await tcs.Task;
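
The same TaskCompletionSource pattern, reduced to a single event, might look like the sketch below. LegacyWorker is a hypothetical stand-in for an event-based API such as WebClient, and its Completed event and StartAsync method are assumptions made for the example, not a real library type.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical legacy component that raises Completed from a pool thread.
public class LegacyWorker
{
    public event Action<string> Completed;

    public void StartAsync(string input)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            Action<string> handler = Completed;
            if (handler != null) handler(input.ToUpperInvariant());
        });
    }
}

public static class TcsDemo
{
    // Exposes the event-based operation as a Task<string>.
    public static Task<string> RunAsync(LegacyWorker worker, string input)
    {
        var tcs = new TaskCompletionSource<string>();
        Action<string> onCompleted = null;
        onCompleted = result =>
        {
            // Unsubscribe so the handler does not outlive this one operation.
            worker.Completed -= onCompleted;
            tcs.TrySetResult(result);
        };

        // Subscribe before starting so the completion cannot be missed.
        worker.Completed += onCompleted;
        worker.StartAsync(input);
        return tcs.Task;
    }

    public static void Main()
    {
        // Blocking on .Result is tolerable only because this is a console entry point.
        Console.WriteLine(RunAsync(new LegacyWorker(), "done").Result); // prints DONE
    }
}
```

A real wrapper would also route failures and cancellation to TrySetException and TrySetCanceled, which this sketch leaves out.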