I want to make this code parallel:
std::vector<float> res(n,0);
std::vector<float> vals(m);
std::vector<float> indexes(m);
// fill indexes with values in range [0,n)
// fill vals and indexes
for(size_t i=0; i<m; i++){
res[indexes[i]] += //something using vas[i];
}
In this article it's suggested to use:
#pragma omp parallel for reduction(+:myArray[:6])
In this question the same approach is proposed in the comments section.
I have two questions:
- I don't know
m
at compile time, and from these two examples it seems that's required. Is it so? Or if I can use it for this case, what do I have to replace ?
with in the following command #pragma omp parallel for reduction(+:res[:?])
? m
or n
?
- Is it relevant that the indexes of the
for
are relative to indexes
and vals
and not to res
, especially considering that reduction
is done on the latter one?
However, If so, how can I solve this problem?
It is fairly straight forward to do a user declared reduction for C++ vectors of a specific type:
#include <algorithm>
#include <vector>
#pragma omp declare reduction(vec_float_plus : std::vector<float> : \
std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), omp_out.begin(), std::plus<float>())) \
initializer(omp_priv = decltype(omp_orig)(omp_orig.size()))
std::vector<float> res(n,0);
#pragma omp parallel for reduction(vec_float_plus : res)
for(size_t i=0; i<m; i++){
res[...] += ...;
}
1a) Not knowing m
at compile time is not a requirement.
1b) You cannot use the array section reduction on std::vector
s, because they are not arrays (and std::vector::data
is not an identifier). If it were possible, you'd have to use n
, as this is the number of elements in the array section.
2) As long as you are only reading indexes
and vals
, there is no issue.
Edit: The original initializer
caluse was simpler: initializer(omp_priv = omp_orig)
. However, if the original copy is then not full of zeroes, the result will be wrong. Therefore, I suggest the more complicated initializer which always creates zero-element vectors.