I'm parallelizing a fairly complex program to make it faster. For this I mostly use the ExecutorService. Until now it worked pretty well, but then I noticed that just one line of code makes my program run half as fast as it could. It's the line with exactScore.get().
I don't know why, but it sometimes needs more than 0.1 s just to get the double value of the Future object.
Why is this? How can I make it run faster? Is there a way to write directly into the Double[] while multithreading?
Thanks
    int processors = Runtime.getRuntime().availableProcessors();
    ExecutorService service = Executors.newFixedThreadPool(processors);
    // initialize output
    Double[] presortedExScores = new Double[sortedHeuScores.length];
    for (int i = 0; i < sortedHeuScores.length; i++) {
        final int index = i;
        final Collection<MolecularFormula> formulas_for_exact_method = multimap.get(sortedHeuScores[i]);
        for (final MolecularFormula formula : formulas_for_exact_method) {
            Future<Double> exactScore = service.submit(new Callable<Double>() {
                @Override
                public Double call() throws Exception {
                    return getScore(computeTreeExactly(computeGraph(formula)));
                }
            });
            presortedExScores[index] = exactScore.get();
        }
    }
That is to be expected. The call isn't "slow"; it is just doing its job.
From the javadoc for get():
Waits if necessary for the computation to complete, and then retrieves its result.
Long story short: it seems that you do not understand the concepts you are using in your code. The idea of a Future is that it delivers its result at some point in the future.
And by calling get() you express: I don't mind waiting now until the result of the computation "behind" that Future becomes available. Thus: you have to step back and look into your code again, to understand how your different "threads of activity" really work, and how/when they come back together.
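To make the blocking behaviour concrete, here is a minimal sketch. The class name `BlockingGet`, the method `millisSpentInGet`, and the sleeping task are made up for illustration and merely stand in for an expensive computation; none of them come from the question.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingGet {

    // Returns how many milliseconds the calling thread spent inside get().
    static long millisSpentInGet(final long taskMillis) throws Exception {
        ExecutorService service = Executors.newSingleThreadExecutor();
        try {
            Future<Double> future = service.submit(new Callable<Double>() {
                @Override
                public Double call() throws Exception {
                    Thread.sleep(taskMillis); // hypothetical stand-in for real work
                    return 42.0;
                }
            });
            long start = System.nanoTime();
            future.get(); // blocks until call() has returned
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("get() took about " + millisSpentInGet(200) + " ms");
    }
}
```

The time measured around get() is roughly the run time of the task itself, because the caller has nothing else to do but wait for it.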
One idea that comes to mind: right now, you are creating your Future objects in a loop, and directly after you create a Future, you call get() on it. That completely contradicts the idea of creating multiple Futures. Instead of submitting one task and immediately waiting for its result, you could first submit all tasks and keep the resulting Futures in a list, and only then call get() on each of them in a second loop.
In other words: allow your futures to really do things in parallel; instead of enforcing sequential processing.
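A minimal sketch of the "submit everything first, collect afterwards" idea could look like this. It is not the asker's real code: `expensiveScore` is a hypothetical placeholder for `getScore(computeTreeExactly(computeGraph(formula)))`, and plain `int` inputs replace the `MolecularFormula` objects so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitThenCollect {

    // Hypothetical placeholder for the expensive exact computation.
    static double expensiveScore(int input) throws InterruptedException {
        Thread.sleep(100); // simulate work
        return input * 2.0;
    }

    static double[] scoreAll(int[] inputs) throws InterruptedException, ExecutionException {
        int processors = Runtime.getRuntime().availableProcessors();
        ExecutorService service = Executors.newFixedThreadPool(processors);
        try {
            // Phase 1: submit every task without waiting for any of them.
            List<Future<Double>> futures = new ArrayList<>();
            for (final int input : inputs) {
                futures.add(service.submit(new Callable<Double>() {
                    @Override
                    public Double call() throws Exception {
                        return expensiveScore(input);
                    }
                }));
            }
            // Phase 2: collect the results in submission order.
            // get() still blocks, but the other tasks keep running meanwhile.
            double[] results = new double[inputs.length];
            for (int i = 0; i < results.length; i++) {
                results[i] = futures.get(i).get();
            }
            return results;
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(scoreAll(new int[] {1, 2, 3, 4})));
    }
}
```

Because the futures are stored in submission order, each result ends up at the index of its input, so no task has to write into a shared array itself.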
If that doesn't help "enough", then as said: you have to look at your overall design, and determine if there are ways to further "pull apart" things. Right now all activity happens "closely" together; and surprise: when you do a lot of work at the same time, that takes time. But as you might guess: such a re-design could be a lot of work; and is close to impossible without knowing more about your problem/code base.
A more sophisticated approach would be to write code where each Future has a way of expressing "I am done"; then you would "only" start all Futures and wait until the last one comes back. But as said: I can't design a full solution for you here.
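One possible way to get such an "I am done" signal, assuming Java 8 or later is available (the question does not say which version is in use), is CompletableFuture together with CompletableFuture.allOf. This is only a sketch; `WaitForAll` and `expensiveScore` are hypothetical names, and the arithmetic stands in for the real computation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WaitForAll {

    // Hypothetical placeholder for the expensive exact computation.
    static double expensiveScore(int input) {
        return input * 2.0;
    }

    static double[] scoreAll(int[] inputs) {
        ExecutorService service =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            // Start all computations on the pool.
            List<CompletableFuture<Double>> futures = new ArrayList<>();
            for (final int input : inputs) {
                futures.add(CompletableFuture.supplyAsync(() -> expensiveScore(input), service));
            }
            // allOf completes once every future has completed; join() waits for that.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            // Every future is done now, so these join() calls return immediately.
            double[] results = new double[inputs.length];
            for (int i = 0; i < results.length; i++) {
                results[i] = futures.get(i).join();
            }
            return results;
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scoreAll(new int[] {1, 2, 3, 4})));
    }
}
```

Note that join() rethrows a failure of any task as an unchecked CompletionException, so a real program would need to decide how to handle tasks that fail.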
The other really important take-away here: don't just blindly use some code that "happens" to work. One essence of programming is to understand each and every concept used in your source code. You should have a pretty good idea of what your code is doing before running it and finding "oh, that get() makes things slow".