How to use C# 8 IAsyncEnumerable to async-enumerate tasks run in parallel

Published 2020-02-29 12:42

Question:

If possible I want to create an async-enumerator for tasks launched in parallel. So first to complete is first element of the enumeration, second to finish is second element of the enumeration, etc.

public static async IAsyncEnumerable<T> ParallelEnumerateAsync(this IEnumerable<Task<T>> coldAsyncTasks)
{
    // ... 
}

I bet there is a way using ContinueWith and a Queue<T>, but I don't completely trust myself to implement it.

Answer 1:

If I understand your question right, your focus is to launch all tasks, let them all run in parallel, but make sure the return values are processed in the same order as the tasks were launched.

Going by the spec, with C# 8.0 asynchronous streams, queuing tasks for parallel execution but returning the results sequentially can look like this.

/// Demonstrates Parallel Execution - Sequential Results with test tasks
async Task RunAsyncStreams()
{
    await foreach (var n in RunAndPreserveOrderAsync(GenerateTasks(6)))
    {
        Console.WriteLine($"#{n} is returned");
    }
}

/// Returns a lazy enumerable that will produce a number of test tasks,
/// each running for a random time.
IEnumerable<Task<int>> GenerateTasks(int count)
{
    // A shared Random avoids identical seeds when tasks are created in
    // quick succession (an issue on older .NET runtimes).
    var random = new Random();
    return Enumerable.Range(1, count).Select(async n =>
    {
        await Task.Delay(random.Next(100, 1000));
        Console.WriteLine($"#{n} is complete");
        return n;
    });
}

/// Launches all tasks in order of enumeration, then waits for the results
/// in the same order: Parallel Execution - Sequential Results.
async IAsyncEnumerable<T> RunAndPreserveOrderAsync<T>(IEnumerable<Task<T>> tasks)
{
    var queue = new Queue<Task<T>>(tasks);
    while (queue.Count > 0) yield return await queue.Dequeue();
}

Possible output:

#5 is complete
#1 is complete
#1 is returned
#3 is complete
#6 is complete
#2 is complete
#2 is returned
#3 is returned
#4 is complete
#4 is returned
#5 is returned
#6 is returned

On a practical note, there doesn't seem to be any new language-level support for this pattern. Moreover, since asynchronous streams deal with IAsyncEnumerable<T>, a plain Task (with no result) would not work here: all the worker async methods must share the same Task<T> return type, which somewhat limits asynchronous-streams-based design.

Because of this, and depending on your situation (Do you want to be able to cancel long-running tasks? Is per-task exception handling required? Should there be a limit on the number of concurrent tasks?), it might make sense to check out @TheGeneral's suggestions above.

Update:

Note that RunAndPreserveOrderAsync<T> does not necessarily have to use a Queue of tasks - the queue was chosen only to make the coding intent clearer.

var queue = new Queue<Task<T>>(tasks);
while (queue.Count > 0) yield return await queue.Dequeue();

Converting the enumerable to a List would produce the same result; the body of RunAndPreserveOrderAsync<T> can be replaced with a single line:

foreach(var task in tasks.ToList()) yield return await task;

In this implementation it is important that all the tasks are generated and launched first, which happens during Queue initialization or during the conversion of the tasks enumerable to a List. However, it might be tempting to simplify the foreach line above like this

foreach(var task in tasks) yield return await task;

which would cause the tasks to execute sequentially rather than in parallel.
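To make the difference concrete, here is a minimal timing sketch (the class and helper names are made up for illustration): the eager variant should finish in roughly the time of one task, while the lazy variant takes roughly the sum of all of them.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

class LazyVsEagerDemo
{
    // Each call yields a fresh, lazily built sequence of four 500 ms tasks.
    static IEnumerable<Task<int>> MakeTasks() =>
        Enumerable.Range(1, 4).Select(async n => { await Task.Delay(500); return n; });

    static async Task Main()
    {
        var sw = Stopwatch.StartNew();
        // Eager: ToList() creates (and thereby starts) all four tasks up front.
        foreach (var task in MakeTasks().ToList()) await task;
        Console.WriteLine($"Eager: ~{sw.ElapsedMilliseconds} ms"); // roughly 500 ms

        sw.Restart();
        // Lazy: the next task is not created until the previous await completes.
        foreach (var task in MakeTasks()) await task;
        Console.WriteLine($"Lazy: ~{sw.ElapsedMilliseconds} ms"); // roughly 2000 ms
    }
}
```

The tasks here start the moment the Select delegate creates them, which is why materializing the sequence up front is what makes them overlap.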



Answer 2:

Is this what you're looking for?

public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(
    this IEnumerable<Task<T>> tasks)
{
    var remaining = new List<Task<T>>(tasks);

    while (remaining.Count != 0)
    {
        var task = await Task.WhenAny(remaining);
        remaining.Remove(task);
        yield return (await task);
    }
}
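A point worth noting: because Task.WhenAny returns whichever task finishes first, this version yields results in completion order, which is what the question asked for. A self-contained usage sketch (the demo tasks and delays are made up; the extension method is reproduced so the sketch compiles on its own):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class WhenAnyDemo
{
    // Answer 2's extension method, copied here for a self-contained example.
    public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(
        this IEnumerable<Task<T>> tasks)
    {
        var remaining = new List<Task<T>>(tasks); // materializing starts all tasks
        while (remaining.Count != 0)
        {
            var task = await Task.WhenAny(remaining);
            remaining.Remove(task);
            yield return await task;
        }
    }

    static async Task Main()
    {
        // Higher n finishes sooner, so results should arrive as 5, 4, 3, 2, 1.
        var tasks = Enumerable.Range(1, 5)
            .Select(async n => { await Task.Delay(600 - n * 100); return n; });

        await foreach (var n in tasks.ParallelEnumerateAsync())
            Console.Write($"{n} ");
        Console.WriteLine();
    }
}
```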


Answer 3:

Here is a version that also allows specifying the maximum degree of parallelism. The idea is that the tasks are enumerated with a lag. For example, with degreeOfParallelism: 4 the first 4 tasks are enumerated immediately, causing them to be created and started, and then the first of them is awaited. Next the 5th task is enumerated and the 2nd is awaited, and so on.

To keep things tidy, the Lag method is embedded inside the ParallelEnumerateAsync method as a static local function (a C# 8 feature).

Since the enumeration of the coldTasks enumerable will most probably be driven from multiple threads, it is enumerated through a thread-safe wrapper.

public static async IAsyncEnumerable<TResult> ParallelEnumerateAsync<TResult>(
    this IEnumerable<Task<TResult>> coldTasks, int degreeOfParallelism)
{
    if (degreeOfParallelism < 1)
        throw new ArgumentOutOfRangeException(nameof(degreeOfParallelism));

    if (coldTasks is ICollection<Task<TResult>>) throw new ArgumentException(
        "The enumerable should not be materialized.", nameof(coldTasks));

    foreach (var task in Safe(Lag(coldTasks, degreeOfParallelism - 1)))
    {
        yield return await task.ConfigureAwait(false);
    }

    static IEnumerable<T> Lag<T>(IEnumerable<T> source, int count)
    {
        var queue = new Queue<T>();
        using (var enumerator = source.GetEnumerator())
        {
            int index = 0;
            while (enumerator.MoveNext())
            {
                queue.Enqueue(enumerator.Current);
                index++;
                if (index > count) yield return queue.Dequeue();
            }
        }
        while (queue.Count > 0) yield return queue.Dequeue();
    }

    static IEnumerable<T> Safe<T>(IEnumerable<T> source)
    {
        var locker = new object();
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                T item;
                lock (locker)
                {
                    if (!enumerator.MoveNext()) yield break;
                    item = enumerator.Current;
                }
                yield return item;
            }
        }
    }
}
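A hypothetical usage sketch (the task generator is made up, and it assumes the extension method above is in scope): with degreeOfParallelism: 2, no more than two tasks should be in flight at any moment.

```csharp
// Deferred (cold) enumerable: each task is created only when it is enumerated.
// A materialized List or array would be rejected by the ICollection guard above.
var coldTasks = Enumerable.Range(1, 6).Select(async n =>
{
    Console.WriteLine($"#{n} started");
    await Task.Delay(300);
    return n;
});

await foreach (var n in coldTasks.ParallelEnumerateAsync(degreeOfParallelism: 2))
{
    Console.WriteLine($"#{n} returned");
}
```

The "started" messages should appear at most two ahead of the "returned" messages, showing the lag window in action.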