Thread safety of yield return with Parallel.ForEac

2020-03-11 02:25发布

问题:

Consider the following code sample, which creates an enumerable collection of integers and processes it in parallel:

using System.Collections.Generic;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        Parallel.ForEach(CreateItems(100), item => ProcessItem(item));
    }

    private static IEnumerable<int> CreateItems(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return i;
        }
    }

    private static void ProcessItem(int item)
    {
        // Do something
    }
}

Is it guaranteed that the worker threads generated by Parallel.ForEach() each get a different item or is some locking mechanism around incrementation and returning of i required?

回答1:

Parallel.ForEach<TSource>, when TSource is an IEnumerable<T>, creates a partitioner for the IEnumerable<T> that includes its own internal locking mechanism, so you don't need to implement any thread-safety in your iterator.

Whenever a worker thread requests a chunk of items, the partitioner will create an internal enumerator, which:

  1. acquires a shared lock
  2. iterates through the source (from where it was left of) to retrieve the chunk of items, saving the items in an private array
  3. releases the lock so that other chunk requests can be fulfilled.
  4. serves the worker thread from its private array.

As you see, the run through the IEnumerable<T> for the purposes of partitioning is sequential (accessed via a shared lock), and the partitions are processed in parallel.



回答2:

TPL and PLINQ use the concept of partitioners.

Partitioner is a type, that inherits Partitioner<TSource> and serves for the splitting the source sequence into a number parts (or partitions). Built-in partitioners were designed to split the source sequence into nonoverlapping partitions.