I want to make this code parallel:
std::vector<float> res(n, 0);
std::vector<float> vals(m);
std::vector<size_t> indexes(m);
// fill indexes with values in range [0,n)
// fill vals
for (size_t i = 0; i < m; i++) {
    res[indexes[i]] += /* something using vals[i] */;
}
In this article it's suggested to use:
#pragma omp parallel for reduction(+:myArray[:6])
In this question the same approach is proposed in the comments section.
I have two questions:
- I don't know m at compile time, and from these two examples it seems that's required. Is it so? Or, if I can use it in this case, what do I have to replace ? with in the following command: #pragma omp parallel for reduction(+:res[:?]). Is it m or n?
- Is it relevant that the indexes of the for loop are relative to indexes and vals and not to res, especially considering that the reduction is done on the latter?
If so, how can I solve this problem?
It is fairly straightforward to write a user-declared reduction for C++ vectors of a specific type.
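A minimal sketch of such a user-declared reduction follows. The reduction name vec_float_plus and the helper scatter_add are illustrative names, the index vector is assumed to hold std::size_t values in [0, n), and adding vals[i] directly stands in for the "something using vals[i]" from the question. The initializer gives every thread a zero-filled private vector of the same size as the original.

```cpp
#include <algorithm>   // std::transform
#include <cassert>
#include <cstddef>
#include <functional>  // std::plus
#include <vector>

// User-declared reduction: element-wise sum of two std::vector<float>.
// omp_out accumulates omp_in; each private copy (omp_priv) starts as a
// zero-filled vector with the same size as the original (omp_orig).
#pragma omp declare reduction(vec_float_plus : std::vector<float> : \
        std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), \
                       omp_out.begin(), std::plus<float>())) \
    initializer(omp_priv = decltype(omp_orig)(omp_orig.size()))

// Hypothetical helper: adds vals[i] into res[indexes[i]] for every i.
std::vector<float> scatter_add(std::size_t n,
                               const std::vector<std::size_t>& indexes,
                               const std::vector<float>& vals) {
    assert(indexes.size() == vals.size());
    std::vector<float> res(n, 0.0f);
    const std::size_t m = vals.size();

    // Each thread updates its own private copy of res; the copies are
    // combined element-wise into res when the loop ends.
    #pragma omp parallel for reduction(vec_float_plus : res)
    for (std::size_t i = 0; i < m; i++) {
        assert(indexes[i] < n);
        res[indexes[i]] += vals[i];  // stand-in for "something using vals[i]"
    }
    return res;
}
```

Compiled without OpenMP support, the pragmas are ignored and the loop simply runs serially, which is a convenient way to check the parallel result.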
1a) Knowing m at compile time is not a requirement.
1b) You cannot use the array section reduction on std::vectors, because they are not arrays (and std::vector::data is not an identifier). If it were possible, you'd have to use n, as this is the number of elements in the array section.
2) As long as you are only reading indexes and vals, there is no issue.
Edit: The original initializer clause was simpler: initializer(omp_priv = omp_orig). However, if the original copy is not full of zeroes, the result will be wrong. Therefore, I suggest the more complicated initializer, which always creates vectors filled with zeroes.
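As an aside to 1b: the array section syntax does accept a plain pointer, so one possible alternative is to take the pointer from the vector first and reduce over that. The sketch below assumes a compiler that supports OpenMP 4.5 array section reductions; the helper name scatter_add_section and the pointer name out are illustrative, and the index vector is again assumed to hold std::size_t values in [0, n). The section length is n, matching the answer to the first question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: same scatter-add, using an array section
// reduction over a raw pointer into the vector's storage.
std::vector<float> scatter_add_section(std::size_t n,
                                       const std::vector<std::size_t>& indexes,
                                       const std::vector<float>& vals) {
    std::vector<float> res(n, 0.0f);
    float* out = res.data();  // a plain pointer is a valid identifier
    const std::size_t m = vals.size();

    // The section covers n elements (the size of res), not m.
    // Private copies are zero-initialized by the built-in + reduction.
    #pragma omp parallel for reduction(+ : out[:n])
    for (std::size_t i = 0; i < m; i++) {
        out[indexes[i]] += vals[i];  // stand-in for "something using vals[i]"
    }
    return res;
}
```

One caveat with this variant: implementations may place the private copies of the section on each thread's stack, so a very large n could be a problem there, whereas the private std::vector copies of the user-declared reduction allocate on the heap.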