I'm building a large ParallelTable and would like to maintain some sense of how the computation is going. For a non-parallel table, the following code does a great job:
counter = 1;
Timing[
Monitor[
Table[
counter++
, {n, 10^6}];
, ProgressIndicator[counter, {0, 10^6}]
]
]
with the result {0.943512, Null}. For the parallel case, however, it's necessary to make the counter shared between the kernels:
counter = 1;
SetSharedVariable[counter];
Timing[
Monitor[
ParallelTable[
counter++
, {n, 10^4}];
, ProgressIndicator[counter, {0, 10^4}]
]
]
with the result {6.33388, Null}. Since the value of counter needs to be passed back and forth between the kernels at every update, the performance hit is beyond severe. Any ideas for how to get some sense of how the computation is going? Perhaps letting each kernel have its own value for counter and summing them at intervals? Perhaps some way of determining what elements of the table have already been farmed out to the kernels?
You nearly gave the answer yourself, when you said "Perhaps letting each kernel have its own value for counter and summing them at intervals?".
Try something like this:
counter = 0;
SetSharedVariable[counter];
(* each subkernel keeps its own timestamp and its own local count *)
ParallelEvaluate[last = AbsoluteTime[]; localcounter = 0;];
Timing[Monitor[
  ParallelTable[
    localcounter++;
    (* push the local count to the shared counter at most once per second *)
    If[AbsoluteTime[] - last > 1,
      last = AbsoluteTime[];
      counter += localcounter; localcounter = 0;],
    {n, 10^6}];,
  ProgressIndicator[counter, {0, 10^6}]]]
Note that it takes longer than your first single-CPU case only because it actually does something in the loop.
You can change the test AbsoluteTime[] - last > 1 to something more frequent like AbsoluteTime[] - last > 0.1.
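One thing the time-based version leaves open is that whatever a subkernel has counted since its last push never reaches the shared counter, so the bar can stop a little short of full. Below is a minimal sketch of one way to handle that. It assumes a count-based flush instead of a time-based one and adds a final collection step with ParallelEvaluate. The threshold of 1000, the n^2 body and the name result are placeholders, not part of the answer above.

```mathematica
counter = 0;
SetSharedVariable[counter];
ParallelEvaluate[localcounter = 0;];  (* per-kernel tally, lives only on the subkernels *)

Monitor[
  result = ParallelTable[
    localcounter++;
    (* push to the shared counter every 1000 local iterations (arbitrary choice) *)
    If[localcounter >= 1000, counter += localcounter; localcounter = 0];
    n^2,  (* placeholder for the real work *)
    {n, 10^5}];,
  ProgressIndicator[counter, {0, 10^5}]];

(* collect what each subkernel still holds once the table is done *)
counter += Total[ParallelEvaluate[With[{c = localcounter}, localcounter = 0; c]]];
counter
```

After the final collection step, counter should account for every iteration, provided no shared-variable update was lost along the way.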
This seems hard to solve. From the manual:
Unless you use shared variables, the parallel evaluations performed
are completely independent and cannot influence each other.
Furthermore, any side effects, such as assignments to variables, that
happen as part of evaluations will be lost. The only effect of a
parallel evaluation is that its result is returned at the end.
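The behaviour described in the quote can be seen with a tiny experiment. This is only a sketch, and the symbol name tally is made up for the illustration:

```mathematica
tally = 0;                     (* ordinary, non-shared variable on the master *)
ParallelTable[tally++, {8}];   (* each subkernel increments its own copy *)
tally                          (* the master's value is unchanged: 0 *)
```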
However, a rough progress indicator can still be obtained using a plain old Print statement:
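A sketch of what such a Print-based indicator could look like (this is not the original answerer's code; the interval of 1000, the n^2 body and the name squares are assumptions). Output printed on a subkernel is forwarded to the master's notebook, so the occasional line gives a rough idea of how far each kernel has got:

```mathematica
squares = ParallelTable[
  If[Mod[n, 1000] == 0, Print["reached n = ", n]];  (* report every 1000th element *)
  n^2,  (* placeholder for the real work *)
  {n, 10^4}];
```

Printing less often keeps the extra traffic between the kernels small.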
Another approach is to put a trace on LinkWrite and LinkRead and modify their tracing messages to do some useful accounting.
First, launch some parallel kernels:
LaunchKernels[]
This will have set up the link objects for the parallel kernels.
Then define an init function for link read and write counters:
init[] := Map[(LinkWriteCounter[#] = 0; LinkReadCounter[#] = 0) &, Links[]]
Next, you want to increment these counters when their links are being read from or written to:
Unprotect[Message];
Message[LinkWrite::trace, x_, y_] := LinkWriteCounter[x[[1, 1]]] += 1;
Message[LinkRead::trace, x_, y_] := LinkReadCounter[x[[1, 1]]] += 1;
Protect[Message];
Here, x[[1,1]] is the LinkObject in question.
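To see why x[[1, 1]] picks out the link, it helps to look at the shape of the first message argument. The sketch below assumes that the trace message receives the traced call wrapped in HoldForm, and it uses an invented LinkObject and packet in place of real ones:

```mathematica
(* stand-in for the first argument of LinkWrite::trace; all values are invented *)
heldCall = HoldForm[LinkWrite[LinkObject["subkernel-example", 5, 3], somePacket]];

(* part {1, 1} goes through HoldForm and LinkWrite to the first argument *)
heldCall[[1, 1]]   (* LinkObject["subkernel-example", 5, 3] *)
```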
Now, turn on tracing on LinkWrite and LinkRead:
On[LinkWrite];
On[LinkRead];
To format the progress display, first shorten the way a LinkObject is displayed, since the default form is rather verbose:
Format[LinkObject[k_, a_, b_]] := Kernel[a, b]
And this is a way to display the reads and writes dynamically for the subkernel links:
init[];
Dynamic[Grid[Join[
{{"Kernel", "Writes", "Reads"}},
Map[{#, LinkWriteCounter[#]/2, LinkReadCounter[#]/2} &,
Select[Links[], StringMatchQ[First[#], "*subkernel*"] &
]]], Frame -> All]]
(I'm dividing the counts by two, because every link read and write is traced twice).
And finally test it out with a 10,000 element table:
init[];
ParallelTable[i, {i, 10^4}, Method -> "FinestGrained"];
If everything worked, you should see a final progress display with about 5,000 reads and writes for each kernel.
There is a moderate performance penalty for this: 10.73 s without the monitor and 13.69 s with it. And of course, "FinestGrained" is not the optimal method for this particular parallel computation.
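To compare scheduling methods without the accounting overhead, tracing can be switched off again first. The following is a rough sketch using two of the documented Method settings for ParallelTable. Timings depend on the machine and the number of kernels, so none are shown here:

```mathematica
Off[LinkWrite]; Off[LinkRead];   (* stop the trace messages that drive the counters *)

(* one element per dispatch: the most link traffic *)
fine = AbsoluteTiming[ParallelTable[i, {i, 10^4}, Method -> "FinestGrained"];];

(* one large chunk per kernel: the least link traffic *)
coarse = AbsoluteTiming[ParallelTable[i, {i, 10^4}, Method -> "CoarsestGrained"];];

{First[fine], First[coarse]}   (* elapsed seconds for each method *)
```

With tracing off, the definitions attached to Message stay in place but should no longer be triggered.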
You can get some ideas from the package Spin`System`LoopControl` developed by Yuri Kandrashkin. The announcement of the Spin` package:
Hi group,
I have prepared the package Spin` that consists of several applications
which are designed for research in the area of magnetic resonance and
spin chemistry and physics.
The applications Unit` and LoopControl` can be useful to a broader
audience.
The package and short outline is available at:
http://sites.google.com/site/spinalgebra/.
Sincerely,
Yuri Kandrashkin.