Nested Parallel.ForEach Loops on the same list?

2019-01-26 05:30发布

I need to parallelize a method that does an exhaustive pairwise comparison on elements in a list. The serial implementation is straight-forward:

foreach (var element1 in list)
    foreach (var element2 in list)
        foo(element1, element2);

In this case, foo won't alter the state of element1 or element2. I know it's not safe to simply do nested Parallel.ForEach statements:

Parallel.ForEach(list, delegate(A element1)
{
    Parallel.ForEach(list, delegate(A element2)
    {
        foo(element1, element2);
    });
});

What would be the ideal way to implement this using the parallel tasks library?

3条回答
甜甜的少女心
2楼-- · 2019-01-26 06:23

The two nested loops essentially mean that you want to foo the cartessian product of the list with itself. You can parallelize the entire operation by first creating all pairs in a temporary list, then iterating over that list with Parallel.ForEach.

EDIT: Instead of creating a list of all combinations, you can use an iterator to return a 2-element tuple with the combination. Parallel.ForEach will still parallelize the processing of the tuples.

The following sample prints out the current iteration step to show that results come back out-of-order, as would be expected during parallel processing:

 const int SIZE = 10;
    static void Main(string[] args)
    {
        List<int> list = new List<int>(SIZE);
        for(int i=0;i<SIZE;i++)
        {
            list.Add(i);
        }


        Parallel.ForEach(GetCombinations(list),(t,state,l)=>
            Console.WriteLine("{0},{1},{2}",l,t.Item1,t.Item2));

    }

    static IEnumerable<Tuple<int,int>> GetCombinations(List<int> list)
    {
        for(int i=0;i<list.Count;i++)
            for(int j=0;j<list.Count;j++)
                yield return Tuple.Create(list[i],list[j]);
    }
查看更多
爷的心禁止访问
3楼-- · 2019-01-26 06:30

Couldn't you just have one Parallel and one normal loop? So either

Parallel.ForEach(list, delegate(A element1)
{
  foreach(A element2 in list)
    foo(element1, element2)
});

or

foreach(A element1 in list)
{
  Parallel.ForEach(list, delegate(A element2)
  {
    foo(element1, element2);
  });
}

Should speed it up as well. There was never going to be a thread per cycle anyway, so this would probably be just as fast or slightly slower than nested parallel loops.

查看更多
Explosion°爆炸
4楼-- · 2019-01-26 06:37

At least if you are executing the code on a machine where the number of cores is at least twice the number of items in the list, I'm not sure it is a good idea to do embedded Parallel.ForEachs.

In other words, if you target a quad-core, and the list has one thousand items, just parallelize the parent loop. Parallelizing both loops would not make the code faster, but rather much, much slower, since parallel tasks have performance cost.

alt text http://www.freeimagehosting.net/uploads/ca97f403f8.png

At each iteration, a few milliseconds will be lost by Parallel.ForEach to determine which thread must execute the next iteration. Let's say you have a set of 7 items. If you parallelize the parent loop, those milliseconds will be lost 7 times. If you parallelize both loops, they will be lost 7×7=49 times instead. Larger is the set, bigger is the overheat.

查看更多
登录 后发表回答