I am trying to execute parallel functions on an list of objects using the new C# 4.0 Parallel.ForEach
function. This is a very long maintenance process. I would like to make it execute in the order of the list so that I can stop and continue execution at the previous point. How do I do this?
Here is an example. I have a list of objects: a1 to a100
. This is the current order:
a1, a51, a2, a52, a3, a53...
I want this order:
a1, a2, a3, a4...
I am OK with some objects being run out of order, but as long as I can find a point in the list where I can say that all objects before this point were run. I read the parallel programming csharp whitepaper and didn't see anything about it. There isn't a setting for this in the ParallelOptions
class.
Do something like this:
You can see how when you break out of the parallel for loop you will know the last list item to be executed, assuming you let all threads finish prior to breaking. I'm not a big fan of PLINQ or LINQ. I honestly don't see how writing LINQ/PLINQ leads to maintainable source code or readability.... Parallel.For is a much better solution.
If you use
Parallel.Break
to terminate the loop then you are guarenteed that all indices below the returned value will have been executed. This is about as close as you can get. The example here uses For but ForEach has similar overloads.In a ForEach loop, an iteration index is generated internally for each element in each partition. Execution takes place out of order but after break you know that all the iterations lower than
LowestBreakIteration
will have been completed.Taken from "Parallel Programming with Microsoft .NET" http://parallelpatterns.codeplex.com/
Available on MSDN. See http://msdn.microsoft.com/en-us/library/ff963552.aspx. The section "Breaking out of loops early" covers this scenario.
See also: http://msdn.microsoft.com/en-us/library/dd460721.aspx
Not sure if question was altered as my comment seems wrong.
Here improved, basically remind that parallel jobs run in out of your control order.
ea printing 10 numbers might result in 1,4,6,7,2,3,9,0.
If you like to stop your program and continue later.
Problems alike this usually endup in batching workloads.
And have some logging of what was done.
Say if you had to check 10.000 numbers for prime or so.
You could loop in batches of size 100, and have a prime log1, log2, log3
log1= 0..99
log2=100..199
Be sure to set some marker to know if a batch job was finished.
Its a general aprouch since the question isnt that exact either.
For anybody looking for a simple solution, I have posted 2 extension methods (one using PLINQ and one using
Parallel.ForEach
) as part of an answer to the following question:Ordered PLINQ ForAll
As an alternate suggestion, you could record which object have been run and then filter the list when you resume exection to exclude the objects which have already run.
If this needs to be persistent across application restarts, you can store the ID's of the already executed objects (I assume here the objects have some unique identifier).
For anyone else who comes across this question - if you're looping over an array or list (rather than an IEnumberable ), you can use the overload of Parallel.Foreach that gives the element index to maintain original order too.