Task.WhenAny with cancellation of the non complete

Posted 2020-02-13 12:53

Question:

In my app I am creating some concurrent web requests and I am satisfied when any one of them completes, so I am using the method Task.WhenAny:

var urls = new string[] {
    "https://stackoverflow.com",
    "https://superuser.com",
    "https://www.reddit.com/r/chess",
};
var tasks = urls.Select(async url =>
{
    using (var webClient = new WebClient())
    {
        return (Url: url, Data: await webClient.DownloadStringTaskAsync(url));
    }
}).ToArray();
var firstTask = await Task.WhenAny(tasks);
Console.WriteLine($"First Completed Url: {firstTask.Result.Url}");
Console.WriteLine($"Data: {firstTask.Result.Data.Length:#,0} chars");

First Completed Url: https://superuser.com
Data: 121.954 chars

What I don't like about this implementation is that the non-completed tasks continue downloading data I no longer need, wasting bandwidth I would prefer to preserve for my next batch of requests. So I am thinking about cancelling the other tasks, but I am not sure how to do it. I found how to use a CancellationToken to cancel a specific web request:

public static async Task<(string Url, string Data)> DownloadUrl(
    string url, CancellationToken cancellationToken)
{
    try
    {
        using (var webClient = new WebClient())
        {
            cancellationToken.Register(webClient.CancelAsync);
            return (url, await webClient.DownloadStringTaskAsync(url));
        }
    }
    catch (WebException ex) when (ex.Status == WebExceptionStatus.RequestCanceled)
    {
        cancellationToken.ThrowIfCancellationRequested();
        throw;
    }
}

Now I need an implementation of Task.WhenAny that will take an array of urls, and will use my DownloadUrl function to fetch the data of the fastest responding site, and will handle the cancellation logic of the slower tasks. It would be nice if it had a timeout argument, to offer protection against never-ending tasks. So I need something like this:

public static Task<Task<TResult>> WhenAnyEx<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, CancellationToken, Task<TResult>> taskFactory,
    int timeout)
{
    // What to do here?
}

Any ideas?

Answer 1:

Simply pass the same cancellation token to all of your tasks, something like this:

CancellationTokenSource cts = new CancellationTokenSource();
CancellationToken ct = cts.Token;
// here you specify how long you want to wait for task to finish before cancelling
int timeout = 5000;
cts.CancelAfter(timeout);
// pass ct to all your tasks and start them
await Task.WhenAny(/* your tasks here */);
// cancel all tasks
cts.Cancel();

Also, read the thread "When I use CancelAfter(), the Task is still running" to see how to use a CancellationToken correctly.
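To connect this with the WhenAnyEx signature from the question, here is a minimal sketch of the shared-token idea. It assumes the timeout argument is in milliseconds (the question does not say) and that every task produced by taskFactory actually observes the token, as DownloadUrl does. The class name WhenAnyExtensions is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAnyExtensions
{
    // Hypothetical helper following the signature sketched in the question.
    // Assumption: timeout is expressed in milliseconds.
    public static async Task<Task<TResult>> WhenAnyEx<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, CancellationToken, Task<TResult>> taskFactory,
        int timeout)
    {
        using (var cts = new CancellationTokenSource())
        {
            // The timeout cancels every task through the shared token.
            cts.CancelAfter(timeout);

            // ToArray starts all tasks eagerly, each with the same token.
            var tasks = source.Select(item => taskFactory(item, cts.Token)).ToArray();

            // Wait for the first task to finish, whatever its final state.
            var first = await Task.WhenAny(tasks);

            // Ask the remaining tasks to stop; they must honor the token.
            cts.Cancel();
            return first;
        }
    }
}
```

It could be called as urls.WhenAnyEx(DownloadUrl, 5000). Note that Task.WhenAny returns the first task to finish in any state, so if the timeout fires first, or the fastest task fails, the returned task is canceled or faulted and awaiting it throws.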



Answer 2:

Update: better solution based on Stephen Cleary's answer and MSDN and svick's answer:

CancellationTokenSource source = new CancellationTokenSource();
CancellationToken token = source.Token; // the token used by the tasks below
source.CancelAfter(TimeSpan.FromSeconds(1));

var tasks = urls.Select(url => Task.Run( async () => 
{
    using (var webClient = new WebClient())
    {
        token.Register(webClient.CancelAsync);
        var result = (Url: url, Data: await webClient.DownloadStringTaskAsync(url));
        token.ThrowIfCancellationRequested();
        return result; // return the whole tuple so that .Url is available to the caller
    }
}, token)).ToArray();

string url;
try
{
    // (A canceled task will raise an exception when awaited).
    var firstTask = await Task.WhenAny(tasks);
    url = (await firstTask).Url;
}
catch (OperationCanceledException e)
{
    // await rethrows the task's exception directly instead of wrapping it
    // in an AggregateException; TaskCanceledException derives from this type.
    Console.WriteLine("Timeout: {0}", e.Message);
}
catch (Exception e)
{
    Console.WriteLine("Exception: " + e.GetType().Name);
}
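The comment in the snippet above says that a canceled task raises an exception when awaited. Which exception type surfaces depends on how the task is observed. The following is a small sketch, with made-up class and method names, that contrasts await with a blocking Wait() on an already canceled Task.Delay:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationExceptionDemo
{
    // Awaiting a canceled task rethrows the cancellation exception itself.
    public static async Task<string> ObserveWithAwait()
    {
        var cts = new CancellationTokenSource();
        cts.Cancel();
        Task delay = Task.Delay(1000, cts.Token); // canceled from the start
        try
        {
            await delay;
            return "completed";
        }
        catch (OperationCanceledException e)
        {
            return e.GetType().Name;
        }
    }

    // Blocking with Wait() wraps the same exception in an AggregateException.
    public static string ObserveWithWait()
    {
        var cts = new CancellationTokenSource();
        cts.Cancel();
        Task delay = Task.Delay(1000, cts.Token);
        try
        {
            delay.Wait();
            return "completed";
        }
        catch (AggregateException ae)
        {
            return "AggregateException wrapping " + ae.InnerException.GetType().Name;
        }
    }
}
```

So code that uses await should catch OperationCanceledException (or TaskCanceledException), while a catch for AggregateException only matters when blocking with Wait() or Result.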

Non-optimal solution:

A timeout can also be implemented by adding a task that simply waits and completes after a given time. Then check which task completed first: if it is the waiting task, the timeout has effectively occurred.

Task timeout = Task.Delay(10000);
var firstTask = await Task.WhenAny(tasks.Concat(new Task[] {timeout}));
if(firstTask == timeout) { ... } //timed out
source.Cancel();
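For completeness, the Task.Delay approach can be wrapped in a small helper. This is only a sketch: the names DelayTimeoutSketch and FirstOrTimeout are hypothetical, and returning null to signal a timeout is an arbitrary choice made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class DelayTimeoutSketch
{
    // Hypothetical helper: returns the first task to complete,
    // or null when the delay task finishes first (a timeout).
    public static async Task<Task<T>> FirstOrTimeout<T>(
        IEnumerable<Task<T>> tasks, CancellationTokenSource source, TimeSpan timeout)
    {
        // Tie the delay to the same source so its timer is released by Cancel().
        Task delay = Task.Delay(timeout, source.Token);

        // Race the real tasks against the delay task.
        Task winner = await Task.WhenAny(tasks.Concat<Task>(new[] { delay }));

        // Either way, signal the remaining tasks to stop.
        source.Cancel();
        return winner == delay ? null : (Task<T>)winner;
    }
}
```

The tasks passed in are assumed to have been started with source.Token, otherwise source.Cancel() has no effect on them.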