C++ standard wording: Does “through all iterators

Published 2019-01-14 21:07

Question:

This SO question sparked a discussion about std::generate and the guarantees made by the standard. In particular, can you use function objects with internal state and rely on generate(it1, it2, gen) to call gen(), store the result in *it, call gen() again, store in *(it + 1) etc., or can it start at the back, for example?

The standard (n3337, §25.3.7/1) says this:

Effects: The first algorithm invokes the function object gen and assigns the return value of gen through all the iterators in the range [first,last). The second algorithm invokes the function object gen and assigns the return value of gen through all the iterators in the range [first,first + n) if n is positive, otherwise it does nothing.

It seems like no ordering is guaranteed, especially since other paragraphs have stronger wording, for example std::for_each (Effects: Applies f to the result of dereferencing every iterator in the range [first,last), starting from first and proceeding to last - 1. If we're taking this literally, it only guarantees to start at first and end at last though - no guarantees on the ordering in between).

But: the documentation pages for both Microsoft's and Apache's C++ standard libraries give examples that require the evaluation to be sequential. And both libc++ (in algorithm) and libstdc++ (in bits/stl_algo.h) implement it that way. Moreover, you lose a lot of potential applications for generate without this guarantee.
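To make the concern concrete, here is a minimal sketch of the kind of code at stake. The names next_square and first_squares are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stateful generator: each call returns the next square.
struct next_square {
    int n = 0;
    int operator()() { ++n; return n * n; }
};

// Fills a vector by calling the generator once per element.
// The result is 1, 4, 9, ... only if std::generate calls gen() in iterator
// order, which libc++ and libstdc++ do but the quoted wording does not spell out.
std::vector<int> first_squares(std::size_t count) {
    std::vector<int> out(count);
    std::generate(out.begin(), out.end(), next_square{});
    return out;
}
```

With the sequential implementations mentioned above, first_squares(4) yields 1, 4, 9, 16; an implementation that started at the back would produce the same values in reverse.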

Does the current wording imply sequentiality? If not, was this an oversight by the members of the committee or intentional?

(I am well aware that there aren't many people who can provide insightful answers to this question without merely speculating or discussing, but in my humble opinion, this does not make this question 'not constructive' as per SO guidelines.)


Thanks to @juanchopanza for pointing out this issue and referring me to the paragraph about for_each.

Answer 1:

In the discussion of LWG475, std::for_each is compared with std::transform. It's noted that "transform does not guarantee the order in which its function object is called". So, yes, the committee is aware of the lack of sequential guarantees in the standard.

There is no opposite requirement for non-sequential behavior either, so Microsoft and Apache are free to use sequential evaluation.
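If the order matters, one option is to drive the stateful generator through std::for_each, whose Effects clause (quoted in the question) does name a direction. A minimal sketch, assuming mutable forward iterators; fill_in_order and count_up are hypothetical names, not standard facilities:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: assigns gen() to each element via std::for_each,
// which is specified to start at first and proceed to last - 1.
template <typename ForwardIt, typename Generator>
void fill_in_order(ForwardIt first, ForwardIt last, Generator gen) {
    using reference = typename std::iterator_traits<ForwardIt>::reference;
    std::for_each(first, last, [&gen](reference x) { x = gen(); });
}

// Example use: a counter whose output depends on the call order.
std::vector<int> count_up(std::size_t count) {
    std::vector<int> out(count);
    int i = 0;
    fill_in_order(out.begin(), out.end(), [&i] { return ++i; });
    return out;
}
```

This trades the intent-revealing name generate for wording that states the order explicitly.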



Answer 2:

Anywhere the standard doesn't specify an ordering on an algorithm, you should assume that an implementation can exploit that for parallelism. The paper n3408 discusses options for parallelisation, and points to the Thrust library, which is both a usable parallel-enabled reimplementation of the standard algorithms and a proof-of-concept for future standardisation of parallelism in the algorithms.

Looking at Thrust's implementation of generate, it calls gen in a parallel loop whenever the iterator category is random access. As you've observed, this is consistent with the standard, so you should not assume that generate will always be sequential. (For example, a thread-safe std::rand can be efficiently used with generate and does not require sequential invocation.)
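To illustrate the parenthetical: a generator whose results are interchangeable does not care in which order it is called. The sketch below uses the engines from the random header rather than std::rand (a substitution made for this example), and roll_dice is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical example: fills a vector with die rolls. Any call order gives an
// equally valid result, so the code does not depend on sequential invocation.
std::vector<int> roll_dice(std::size_t count, unsigned seed) {
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> die(1, 6);
    std::vector<int> rolls(count);
    std::generate(rolls.begin(), rolls.end(), [&] { return die(engine); });
    return rolls;
}
```

Note that this sketch is only order-independent, not thread-safe: under a parallel implementation the generator would also have to tolerate concurrent calls.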

The algorithms in <numeric> do specify in-order evaluation; if your code depends on sequential invocation, you can use std::iota in place of generate. Adapting an existing generator:

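#include <utility>  // std::declval, std::forward

// Wraps a nullary generator so std::iota can drive it: each assignment
// *it = value converts the adapter by calling f(), and ++value is a no-op.
template<typename F> struct iota_adapter {
   using result_type = decltype(std::declval<F&>()());  // avoids std::result_of, removed in C++20
   F f;
   operator result_type() { return f(); }
   void operator++() {}
};
template<typename F> iota_adapter<F> iota_adapt(F &&f) { return {std::forward<F>(f)}; }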

Use as:

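#include <iostream>
#include <iterator>  // std::begin, std::end
#include <numeric>   // std::iota

int main() {
   int v[5], i = 0;
   // std::iota assigns *it = value and then does ++value, element by element in order.
   std::iota(std::begin(v), std::end(v), iota_adapt([&i]() { return ++i; }));
   for (auto x : v) std::cout << x << '\n';  // prints 1 to 5, one per line
}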