Background
I have some code that performs batch HTML page processing using content from one specific host. It tries to make a large number (~400) of simultaneous HTTP requests using HttpClient. I believe that the maximum number of simultaneous connections is restricted by ServicePointManager.DefaultConnectionLimit, so I'm not applying my own concurrency restrictions.
After sending all of the requests asynchronously to HttpClient using Task.WhenAll, the entire batch operation can be cancelled using CancellationTokenSource and CancellationToken. The progress of the operation is viewable via a user interface, and a button can be clicked to perform the cancellation.
Problem
The call to CancellationTokenSource.Cancel() blocks for roughly 5 to 30 seconds. This causes the user interface to freeze. I suspect that this occurs because the method is synchronously calling the code that registered for cancellation notification.
What I've Considered
- Limiting the number of simultaneous HTTP request tasks. I consider this a work-around because HttpClient already seems to queue excess requests itself.
- Performing the CancellationTokenSource.Cancel() method call in a non-UI thread. This didn't work too well; the task didn't actually run until most of the others had finished. I think an async version of the method would work well, but I couldn't find one. Also, I have the impression that it's suitable to use the method in a UI thread.
Demonstration
Code
using System;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    private const int desiredNumberOfConnections = 418;

    static void Main(string[] args)
    {
        ManyHttpRequestsTest().Wait();
        Console.WriteLine("Finished.");
        Console.ReadKey();
    }

    private static async Task ManyHttpRequestsTest()
    {
        using (var client = new HttpClient())
        using (var cancellationTokenSource = new CancellationTokenSource())
        {
            var requestsCompleted = 0;
            using (var allRequestsStarted = new CountdownEvent(desiredNumberOfConnections))
            {
                Action reportRequestStarted = () => allRequestsStarted.Signal();
                Action reportRequestCompleted = () => Interlocked.Increment(ref requestsCompleted);
                Func<int, Task> getHttpResponse = index => GetHttpResponse(client, cancellationTokenSource.Token, reportRequestStarted, reportRequestCompleted);
                var httpRequestTasks = Enumerable.Range(0, desiredNumberOfConnections).Select(getHttpResponse);
                Console.WriteLine("HTTP requests batch being initiated");
                // Task.WhenAll enumerates the sequence, which starts every request.
                var httpRequestsTask = Task.WhenAll(httpRequestTasks);
                Console.WriteLine("Starting {0} requests (simultaneous connection limit of {1})", desiredNumberOfConnections, ServicePointManager.DefaultConnectionLimit);
                allRequestsStarted.Wait();
                Cancel(cancellationTokenSource);
                await WaitForRequestsToFinish(httpRequestsTask);
            }
            Console.WriteLine("{0} HTTP requests were completed", requestsCompleted);
        }
    }

    // Times the synchronous Cancel() call, which is the call that blocks.
    private static void Cancel(CancellationTokenSource cancellationTokenSource)
    {
        Console.Write("Cancelling...");
        var stopwatch = Stopwatch.StartNew();
        cancellationTokenSource.Cancel();
        stopwatch.Stop();
        Console.WriteLine("took {0} seconds", stopwatch.Elapsed.TotalSeconds);
    }

    private static async Task WaitForRequestsToFinish(Task httpRequestsTask)
    {
        Console.WriteLine("Waiting for HTTP requests to finish");
        try
        {
            await httpRequestsTask;
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("HTTP requests were cancelled");
        }
    }

    private static async Task GetHttpResponse(HttpClient client, CancellationToken cancellationToken, Action reportStarted, Action reportFinished)
    {
        var getResponse = client.GetAsync("http://www.google.com", cancellationToken);
        reportStarted();
        using (var response = await getResponse)
            response.EnsureSuccessStatusCode();
        reportFinished();
    }
}
Output
Why does cancellation block for so long? Also, is there anything that I'm doing wrong or could be doing better?
What this tells me is that you're probably suffering from 'threadpool exhaustion', which is where your threadpool queue has so many items in it (from HTTP requests completing) that it takes a while to get through them all. Cancellation is probably blocking on some threadpool work item executing, and that work item can't skip to the head of the queue.
This suggests that you do need to go with option 1 from your consideration list. Throttle your own work so that the threadpool queue remains relatively short. This is good for app responsiveness overall anyway.
My favorite way to throttle async work is to use Dataflow. Something like this:
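A minimal sketch of that idea, assuming the System.Threading.Tasks.Dataflow NuGet package is referenced. The helper names Throttle.ForEachAsync and Throttle.FetchAllAsync and the limit of 10 concurrent requests are hypothetical, chosen only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

public static class Throttle
{
    // Hypothetical helper: runs `work` once per item, with at most
    // `maxConcurrency` invocations in flight at any time.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items,
        int maxConcurrency,
        Func<T, Task> work,
        CancellationToken cancellationToken)
    {
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxConcurrency,
            CancellationToken = cancellationToken
        };
        var block = new ActionBlock<T>(work, options);

        // The block buffers the items and hands them to `work` as slots free up.
        foreach (var item in items)
            block.Post(item);

        block.Complete();       // no more items will be posted
        await block.Completion; // finishes when all items are processed or the token is cancelled
    }

    // Hypothetical usage with HttpClient; 10 is an arbitrary example limit.
    public static Task FetchAllAsync(HttpClient client, IEnumerable<string> urls, CancellationToken cancellationToken)
    {
        return ForEachAsync(urls, 10, async url =>
        {
            using (var response = await client.GetAsync(url, cancellationToken))
                response.EnsureSuccessStatusCode();
        }, cancellationToken);
    }
}
```

With a limit like this, only a handful of requests (and their cancellation registrations and completion callbacks) exist at once, so the threadpool queue should stay short and Cancel() should have far less work to wait on. The right limit depends on the host and the workload.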
As an alternative, you could use Task.Factory.StartNew, passing in TaskCreationOptions.LongRunning so your task gets a new thread (not affiliated with the threadpool), which would allow it to start immediately and call Cancel from there. But you should probably solve the threadpool exhaustion problem instead.
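That alternative might look roughly like the sketch below. The class and method names (BackgroundCancel, CancelOnDedicatedThread) are hypothetical. Note that LongRunning is only a hint to the scheduler; the default scheduler is generally understood to respond by creating a dedicated thread, which is the behavior this relies on.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundCancel
{
    // Hypothetical helper: calls Cancel() from a task created with the
    // LongRunning hint, so the call does not have to wait in the
    // threadpool queue and does not block the calling (UI) thread.
    public static Task CancelOnDedicatedThread(CancellationTokenSource cancellationTokenSource)
    {
        return Task.Factory.StartNew(
            () => cancellationTokenSource.Cancel(),
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}
```

A button handler could call BackgroundCancel.CancelOnDedicatedThread(cancellationTokenSource) and return immediately, optionally awaiting the returned task to know when the cancellation callbacks have finished running.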