We are currently trying to implement a red-black Gauss-Seidel solver for a numerical simulation.
For that we divided our simulation area into equal sized subgrids. We are able to asynchronously perform the red-black cycles for the pressure equation on each subgrid with the correct dependencies and a hpx::dataflow
object.
But now we have the following problem: after every n-th cycle we have to perform a residual calculation to determine if we are converged already.
So the optimal solution would be, that we start every local residual calculation separately/asynchronously and then sum over a vector of hpx::future<double>
. With the idea of HPX futures that could lead to an optimal solution, i.e. we sum up all elements as soon as possible.
But so far we were only able to come up with the following code:
#include <hpx/hpx_main.hpp>
#include <hpx/hpx.hpp>
#include <memory>
#include <iostream>
class A {
public:
double residual() {
// Calculate actual local residual
return 1.0;
}
};
int main() {
// Create instances
std::vector<A> vec(3);
std::vector<hpx::shared_future<double>> res(vec.size());
// asynchronous launch resdiual calculation
for (size_t i = 0; i < res.size(); ++i) {
res[i] = hpx::async( &A::residual, &vec[i] );
}
double residual = 0.0;
for (size_t i = 0; i < res.size(); ++i) {
residual += res[i].get();
}
std::cout << "residual: " << residual << std::endl;
return 0;
}
That's far from optimal. In worst case it performance like a global barrier followed by a pure sequential sum over all elements.
So our question is how we could implement this "HPX" like in parallel?
Update 02.02.2019:
We already rewrote our code so that we can't start our residual calculation fully asynchronously, but based on data dependencies via a hpx::dataflow
object.
// asynchronous launch resdiual calculation
for (size_t i = 0; i < res.size(); ++i) {
res[i] = hpx::dataflow( hpx::util::unwrapping(&A::residual), &vec[i], *DEPENDENCIES* );
}
Is it possible to call @Mike van Dyke code with dataflow objects too or is there another solution?
(Hint: I didn't got your code working because of a template argument deduction/substitution failed
error)
You can use the
transform_reduce
pattern for what you want to achieve:This code will calculate the residual (3) of each
A
invec
(1), then will sum up all results (2).