If possible, I want to create an async enumerator for tasks launched in parallel, so that the first task to complete is the first element of the enumeration, the second to finish is the second element, and so on.
public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(this IEnumerable<Task<T>> coldAsyncTasks)
{
    // ...
}
I bet there is a way using ContinueWith and a Queue<T>, but I don't completely trust myself to implement it.
If I understand your question correctly, you want to launch all the tasks, let them run in parallel, but make sure the return values are processed in the same order the tasks were launched.
Judging by the specs, with C# 8.0 asynchronous streams, queuing tasks for parallel execution while returning the results sequentially can look like this.
/// Demonstrates Parallel Execution - Sequential Results with test tasks
async Task RunAsyncStreams()
{
    await foreach (var n in RunAndPreserveOrderAsync(GenerateTasks(6)))
    {
        Console.WriteLine($"#{n} is returned");
    }
}
/// Returns an enumerable that will produce a number of test tasks running
/// for a random time.
IEnumerable<Task<int>> GenerateTasks(int count)
{
    // A single shared Random: creating a new instance per task in quick
    // succession can produce identical seeds (and identical delays),
    // notably on .NET Framework.
    var random = new Random();
    return Enumerable.Range(1, count).Select(async n =>
    {
        await Task.Delay(random.Next(100, 1000));
        Console.WriteLine($"#{n} is complete");
        return n;
    });
}
/// Launches all tasks in order of enumeration, then waits for the results
/// in the same order: Parallel Execution - Sequential Results.
async IAsyncEnumerable<T> RunAndPreserveOrderAsync<T>(IEnumerable<Task<T>> tasks)
{
    var queue = new Queue<Task<T>>(tasks);
    while (queue.Count > 0) yield return await queue.Dequeue();
}
Possible output:
#5 is complete
#1 is complete
#1 is returned
#3 is complete
#6 is complete
#2 is complete
#2 is returned
#3 is returned
#4 is complete
#4 is returned
#5 is returned
#6 is returned
On a practical note, there doesn't seem to be any new language-level support for this pattern. Besides, since asynchronous streams deal with IAsyncEnumerable<T>, a non-generic Task would not work here: all the worker async methods would need the same Task<T> return type, which somewhat limits asynchronous-streams-based design.
Because of this, and depending on your situation (Do you want to be able to cancel long-running tasks? Is per-task exception handling required? Should there be a limit to the number of concurrent tasks?), it might make sense to check out @TheGeneral's suggestions above.
Update:
Note that RunAndPreserveOrderAsync<T> does not necessarily have to use a Queue of tasks; it was only chosen to better show the coding intention.
var queue = new Queue<Task<T>>(tasks);
while (queue.Count > 0) yield return await queue.Dequeue();
Converting the enumerable to a List would produce the same result; the body of RunAndPreserveOrderAsync<T> can be replaced with this one line:
foreach (var task in tasks.ToList()) yield return await task;
In this implementation it is important that all the tasks are generated and launched first, which happens during the Queue initialization or the conversion of the tasks enumerable to a List. However, it might be hard to resist simplifying the above foreach line like this:
foreach (var task in tasks) yield return await task;
which would cause the tasks to be executed sequentially instead of running in parallel.
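The cost of that simplification is easy to see with a stopwatch. Below is a hedged sketch (the MakeTasks helper, the task count, and the delays are made up for illustration): consuming the lazy enumerable directly takes roughly the sum of the delays, while materializing it first with ToList() lets the tasks overlap.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class LazyVsEagerDemo
{
    // Hypothetical helper: 'count' cold tasks of 'delayMs' each; a task is
    // created (and started) only when the Select projection is enumerated.
    public static IEnumerable<Task<int>> MakeTasks(int count, int delayMs) =>
        Enumerable.Range(1, count).Select(async n =>
        {
            await Task.Delay(delayMs);
            return n;
        });

    // Awaits the tasks one by one, in enumeration order, and times the total.
    public static async Task<TimeSpan> ConsumeAsync(IEnumerable<Task<int>> tasks)
    {
        var sw = Stopwatch.StartNew();
        foreach (var task in tasks) await task;
        return sw.Elapsed;
    }

    public static async Task Main()
    {
        // Lazy: each task is created only after the previous await completes,
        // so the delays add up (roughly 5 x 200 ms).
        var lazy = await ConsumeAsync(MakeTasks(5, 200));

        // Eager: ToList() creates all five tasks up front, so they overlap
        // and the total time is close to a single delay (roughly 200 ms).
        var eager = await ConsumeAsync(MakeTasks(5, 200).ToList());

        Console.WriteLine($"lazy: {lazy.TotalMilliseconds:F0} ms, eager: {eager.TotalMilliseconds:F0} ms");
    }
}
```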
Is this what you're looking for?
public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(
    this IEnumerable<Task<T>> tasks)
{
    var remaining = new List<Task<T>>(tasks);
    while (remaining.Count != 0)
    {
        var task = await Task.WhenAny(remaining);
        remaining.Remove(task);
        yield return await task;
    }
}
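A quick way to check that this yields results in completion order rather than launch order (the delays and the Program wrapper below are made up for the demo; the extension method itself is the one above, repeated so the snippet compiles stand-alone):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TaskEnumerableExtensions
{
    // The Task.WhenAny-based implementation, unchanged.
    public static async IAsyncEnumerable<T> ParallelEnumerateAsync<T>(
        this IEnumerable<Task<T>> tasks)
    {
        var remaining = new List<Task<T>>(tasks);
        while (remaining.Count != 0)
        {
            var task = await Task.WhenAny(remaining);
            remaining.Remove(task);
            yield return await task;
        }
    }
}

class Program
{
    static async Task Main()
    {
        // Three tasks whose delays are the reverse of their ids:
        // id 1 sleeps longest, id 3 shortest.
        var tasks = new[] { 1, 2, 3 }.Select(async id =>
        {
            await Task.Delay((4 - id) * 200);
            return id;
        }).ToList(); // ToList() launches all of them now

        var order = new List<int>();
        await foreach (var id in tasks.ParallelEnumerateAsync())
            order.Add(id);

        Console.WriteLine(string.Join(", ", order)); // prints "3, 2, 1"
    }
}
```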
Here is a version that also lets you specify the maximum degree of parallelism. The idea is that the tasks are enumerated with a lag. For example, with degreeOfParallelism: 4, the first 4 tasks are enumerated immediately, causing them to be created, and then the first of these is awaited. Next the 5th task is enumerated and the 2nd is awaited, and so on.
To keep things tidy, the Lag method is embedded inside the ParallelEnumerateAsync method as a static local function (a new feature of C# 8).
Since the enumeration of the coldTasks enumerable will most probably be driven from multiple threads, it is enumerated through a thread-safe wrapper.
public static async IAsyncEnumerable<TResult> ParallelEnumerateAsync<TResult>(
    this IEnumerable<Task<TResult>> coldTasks, int degreeOfParallelism)
{
    if (degreeOfParallelism < 1)
        throw new ArgumentOutOfRangeException(nameof(degreeOfParallelism));
    if (coldTasks is ICollection<Task<TResult>>) throw new ArgumentException(
        "The enumerable should not be materialized.", nameof(coldTasks));

    foreach (var task in Safe(Lag(coldTasks, degreeOfParallelism - 1)))
    {
        yield return await task.ConfigureAwait(false);
    }

    // Enumerates the source with a lag of 'count' items, so that 'count' + 1
    // tasks have been created before the first one is yielded.
    static IEnumerable<T> Lag<T>(IEnumerable<T> source, int count)
    {
        var queue = new Queue<T>();
        using (var enumerator = source.GetEnumerator())
        {
            int index = 0;
            while (enumerator.MoveNext())
            {
                queue.Enqueue(enumerator.Current);
                index++;
                if (index > count) yield return queue.Dequeue();
            }
        }
        while (queue.Count > 0) yield return queue.Dequeue();
    }

    // Serializes MoveNext/Current so the source can be safely enumerated
    // from multiple threads.
    static IEnumerable<T> Safe<T>(IEnumerable<T> source)
    {
        var locker = new object();
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                T item;
                lock (locker)
                {
                    if (!enumerator.MoveNext()) yield break;
                    item = enumerator.Current;
                }
                yield return item;
            }
        }
    }
}
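For completeness, a hedged usage sketch of the throttled version. It assumes the extension is compiled into a static class (here called TaskExtensions, repeating the implementation verbatim so the snippet compiles stand-alone); the concurrency counter in ThrottleDemo is made up purely to confirm that no more than degreeOfParallelism tasks run at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TaskExtensions
{
    // The throttled implementation from above, unchanged.
    public static async IAsyncEnumerable<TResult> ParallelEnumerateAsync<TResult>(
        this IEnumerable<Task<TResult>> coldTasks, int degreeOfParallelism)
    {
        if (degreeOfParallelism < 1)
            throw new ArgumentOutOfRangeException(nameof(degreeOfParallelism));
        if (coldTasks is ICollection<Task<TResult>>) throw new ArgumentException(
            "The enumerable should not be materialized.", nameof(coldTasks));
        foreach (var task in Safe(Lag(coldTasks, degreeOfParallelism - 1)))
        {
            yield return await task.ConfigureAwait(false);
        }

        static IEnumerable<T> Lag<T>(IEnumerable<T> source, int count)
        {
            var queue = new Queue<T>();
            using (var enumerator = source.GetEnumerator())
            {
                int index = 0;
                while (enumerator.MoveNext())
                {
                    queue.Enqueue(enumerator.Current);
                    index++;
                    if (index > count) yield return queue.Dequeue();
                }
            }
            while (queue.Count > 0) yield return queue.Dequeue();
        }

        static IEnumerable<T> Safe<T>(IEnumerable<T> source)
        {
            var locker = new object();
            using (var enumerator = source.GetEnumerator())
            {
                while (true)
                {
                    T item;
                    lock (locker)
                    {
                        if (!enumerator.MoveNext()) yield break;
                        item = enumerator.Current;
                    }
                    yield return item;
                }
            }
        }
    }
}

public static class ThrottleDemo
{
    static readonly object Gate = new object();
    static int _running, _maxObserved;

    public static int MaxObserved => _maxObserved;

    // Cold tasks that track how many of them run at the same time.
    public static IEnumerable<Task<int>> ColdTasks(int count) =>
        Enumerable.Range(1, count).Select(async n =>
        {
            lock (Gate) { _running++; if (_running > _maxObserved) _maxObserved = _running; }
            await Task.Delay(100);
            lock (Gate) _running--;
            return n;
        });

    public static async Task Main()
    {
        // At most 2 tasks in flight; results come back in launch order.
        await foreach (var n in ColdTasks(6).ParallelEnumerateAsync(degreeOfParallelism: 2))
            Console.WriteLine($"#{n} is returned");
        Console.WriteLine($"max concurrency observed: {MaxObserved}");
    }
}
```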