I need to parallelize a method that does an exhaustive pairwise comparison on elements in a list. The serial implementation is straight-forward:
foreach (var element1 in list)
foreach (var element2 in list)
foo(element1, element2);
In this case, foo won't alter the state of element1 or element2. I know it's not safe to simply do nested Parallel.ForEach statements:
Parallel.ForEach(list, delegate(A element1)
{
Parallel.ForEach(list, delegate(A element2)
{
foo(element1, element2);
});
});
What would be the ideal way to implement this using the parallel tasks library?
Couldn't you just have one Parallel and one normal loop? So either
Parallel.ForEach(list, delegate(A element1)
{
foreach(A element2 in list)
foo(element1, element2)
});
or
foreach(A element1 in list)
{
Parallel.ForEach(list, delegate(A element2)
{
foo(element1, element2);
});
}
Should speed it up as well. There was never going to be a thread per cycle anyway, so this would probably be just as fast or slightly slower than nested parallel loops.
At least if you are executing the code on a machine where the number of cores is at least twice the number of items in the list, I'm not sure it is a good idea to do embedded Parallel.ForEach
s.
In other words, if you target a quad-core, and the list has one thousand items, just parallelize the parent loop. Parallelizing both loops would not make the code faster, but rather much, much slower, since parallel tasks have performance cost.
alt text http://www.freeimagehosting.net/uploads/ca97f403f8.png
At each iteration, a few milliseconds will be lost by Parallel.ForEach
to determine which thread must execute the next iteration. Let's say you have a set of 7 items. If you parallelize the parent loop, those milliseconds will be lost 7 times. If you parallelize both loops, they will be lost 7×7=49 times instead. Larger is the set, bigger is the overheat.
The two nested loops essentially mean that you want to foo the cartessian product of the list with itself. You can parallelize the entire operation by first creating all pairs in a temporary list, then iterating over that list with Parallel.ForEach.
EDIT: Instead of creating a list of all combinations, you can use an iterator to return a 2-element tuple with the combination. Parallel.ForEach will still parallelize the processing of the tuples.
The following sample prints out the current iteration step to show that results come back out-of-order, as would be expected during parallel processing:
const int SIZE = 10;
static void Main(string[] args)
{
List<int> list = new List<int>(SIZE);
for(int i=0;i<SIZE;i++)
{
list.Add(i);
}
Parallel.ForEach(GetCombinations(list),(t,state,l)=>
Console.WriteLine("{0},{1},{2}",l,t.Item1,t.Item2));
}
static IEnumerable<Tuple<int,int>> GetCombinations(List<int> list)
{
for(int i=0;i<list.Count;i++)
for(int j=0;j<list.Count;j++)
yield return Tuple.Create(list[i],list[j]);
}