Consider the following code sample, which creates an enumerable collection of integers and processes it in parallel:
using System.Collections.Generic;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        Parallel.ForEach(CreateItems(100), item => ProcessItem(item));
    }

    private static IEnumerable<int> CreateItems(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return i;
        }
    }

    private static void ProcessItem(int item)
    {
        // Do something
    }
}
Is it guaranteed that the worker threads spawned by Parallel.ForEach() each get a different item, or is some locking mechanism around the incrementing and returning of i required?
Parallel.ForEach<TSource>, when TSource is an IEnumerable<T>, creates a partitioner for the IEnumerable<T> that includes its own internal locking mechanism, so you don't need to implement any thread-safety in your iterator.
Whenever a worker thread requests a chunk of items, the partitioner will create an internal enumerator, which:
- acquires a shared lock
- iterates through the source (from where it left off) to retrieve the chunk of items, saving them in a private array
- releases the lock so that other chunk requests can be fulfilled
- serves the worker thread from its private array.
As you can see, the run through the IEnumerable<T> for the purposes of partitioning is sequential (accessed under a shared lock), while the partitions themselves are processed in parallel.
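To make this concrete, below is a minimal sketch of such a chunk-on-demand partitioner. The ChunkingPartitioner name, the fixed chunk size, and the overall structure are assumptions made for illustration; the built-in partitioner is more elaborate (for example, it grows its chunk sizes as the enumeration progresses), but the lock, pull, release, serve cycle is the same idea:

using System.Collections;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical partitioner for illustration: every partition pulls fixed-size
// chunks from the shared source enumerator under a lock, then serves its
// worker thread from a private buffer.
public class ChunkingPartitioner<T> : Partitioner<T>
{
    private readonly IEnumerable<T> _source;
    private readonly int _chunkSize;

    public ChunkingPartitioner(IEnumerable<T> source, int chunkSize)
    {
        _source = source;
        _chunkSize = chunkSize;
    }

    // Parallel.ForEach asks for partitions on demand, so dynamic
    // partitioning support is needed.
    public override bool SupportsDynamicPartitions => true;

    public override IEnumerable<T> GetDynamicPartitions()
    {
        return new DynamicPartitions(_source.GetEnumerator(), _chunkSize);
    }

    public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
    {
        var dynamicPartitions = GetDynamicPartitions();
        var partitions = new List<IEnumerator<T>>(partitionCount);
        for (int i = 0; i < partitionCount; i++)
        {
            partitions.Add(dynamicPartitions.GetEnumerator());
        }
        return partitions;
    }

    private sealed class DynamicPartitions : IEnumerable<T>
    {
        private readonly IEnumerator<T> _shared;
        private readonly int _chunkSize;
        private readonly object _gate = new object();

        public DynamicPartitions(IEnumerator<T> shared, int chunkSize)
        {
            _shared = shared;
            _chunkSize = chunkSize;
        }

        // Each call to GetEnumerator() creates one partition;
        // Parallel.ForEach runs one of these per worker thread.
        public IEnumerator<T> GetEnumerator()
        {
            while (true)
            {
                var buffer = new List<T>(_chunkSize);

                // Acquire the shared lock and pull a chunk from the source.
                lock (_gate)
                {
                    while (buffer.Count < _chunkSize && _shared.MoveNext())
                    {
                        buffer.Add(_shared.Current);
                    }
                }
                // The lock is released here, so other partitions can request
                // their own chunks.

                if (buffer.Count == 0)
                {
                    yield break; // source exhausted
                }

                // Serve the worker thread from the private buffer.
                foreach (T item in buffer)
                {
                    yield return item;
                }
            }
        }

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }
}

Plugged into the question's code, it would be used as Parallel.ForEach(new ChunkingPartitioner<int>(CreateItems(100), 8), item => ProcessItem(item)); and each item would still be handed to exactly one worker.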
TPL and PLINQ use the concept of partitioners.
A partitioner is a type that inherits from Partitioner<TSource>
and is responsible for splitting the source sequence into a number of parts (or partitions). The built-in partitioners are designed to split the source sequence into non-overlapping partitions.
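The built-in partitioners can also be created explicitly via Partitioner.Create and passed to Parallel.ForEach, which lets you influence how items are handed out. As a small example (assuming .NET 4.5 or later, and reusing CreateItems and ProcessItem from the question), EnumerablePartitionerOptions.NoBuffering tells the partitioner to take items from the source one at a time instead of buffering them in chunks; access to the source is still synchronized internally:

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        // NoBuffering: items are taken from the source one at a time instead
        // of being buffered in chunks, which lowers latency and improves load
        // balancing when individual items are expensive to process.
        var partitioner = Partitioner.Create(
            CreateItems(100),
            EnumerablePartitionerOptions.NoBuffering);

        Parallel.ForEach(partitioner, item => ProcessItem(item));
    }

    private static IEnumerable<int> CreateItems(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return i;
        }
    }

    private static void ProcessItem(int item)
    {
        // Do something
    }
}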