I wonder whether for long loops we can take advantage of tail recursion for constexpr in C++11?
问题:
回答1:
By the rules in [implimits]
, an implementation is permitted to put a recursion depth limit on constexpr
calculations. The two compilers which have complete constexpr
implementations (gcc and clang) both apply such a limit, using the default of 512 recursive calls as suggested by the standard. For both of these compilers, as well as any other implementation which follows the standard's suggestion, a tail recursion optimization would be essentially undetectable (unless the compiler would otherwise crash before reaching its recursion limit).
An implementation could instead choose to only count calls for which it could not apply a tail-recursion optimization in its recursion depth limit, or to not provide such a limit. However, such an implementation would probably be doing a disservice to its users, since it would be likely to either crash (due to a stack overflow) or fail to terminate on constexpr
evaluations which recurse deeply or infinitely.
With regard to what happens when the recursion depth limit is reached, Pubby's example raises an interesting point. [expr.const]p2
specifies that
an invocation of a constexpr function or a constexpr constructor that would exceed the implementation-defined recursion limits (see Annex B);
is not a constant expression. Therefore, if the recursion limit is reached in a context which requires a constant expression, the program is ill-formed. If a constexpr
function is called in a context which does not require a constant expression, the implementation is not generally required to attempt to evaluate it at translation time, but if it chooses to, and the recursion limit is reached, it is required to instead perform the call at runtime. On a complete, compilable test program:
constexpr unsigned long long f(unsigned long long n, unsigned long long s=0) {
return n ? f(n-1,s+n) : s;
}
constexpr unsigned long long k = f(0xffffffff);
GCC says:
depthlimit.cpp:4:46: in constexpr expansion of ‘f(4294967295ull, 0ull)’
depthlimit.cpp:2:23: in constexpr expansion of ‘f((n + -1ull), (s + n))’
depthlimit.cpp:2:23: in constexpr expansion of ‘f((n + -1ull), (s + n))’
[... over 500 more copies of the previous message cut ...]
depthlimit.cpp:2:23: in constexpr expansion of ‘f((n + -1ull), (s + n))’
depthlimit.cpp:4:46: error: constexpr evaluation depth exceeds maximum of 512 (use -fconstexpr-depth= to increase the maximum)
and clang says:
depthlimit.cpp:4:30: error: constexpr variable 'k' must be initialized by a constant expression
constexpr unsigned long long k = f(0xffffffff);
^ ~~~~~~~~~~~~~
depthlimit.cpp:2:14: note: constexpr evaluation exceeded maximum depth of 512 calls
return n ? f(n-1,s+n) : s;
^
depthlimit.cpp:2:14: note: in call to 'f(4294966784, 2194728157440)'
depthlimit.cpp:2:14: note: in call to 'f(4294966785, 2190433190655)'
depthlimit.cpp:2:14: note: in call to 'f(4294966786, 2186138223869)'
depthlimit.cpp:2:14: note: in call to 'f(4294966787, 2181843257082)'
depthlimit.cpp:2:14: note: in call to 'f(4294966788, 2177548290294)'
depthlimit.cpp:2:14: note: (skipping 502 calls in backtrace; use -fconstexpr-backtrace-limit=0 to see all)
depthlimit.cpp:2:14: note: in call to 'f(4294967291, 17179869174)'
depthlimit.cpp:2:14: note: in call to 'f(4294967292, 12884901882)'
depthlimit.cpp:2:14: note: in call to 'f(4294967293, 8589934589)'
depthlimit.cpp:2:14: note: in call to 'f(4294967294, 4294967295)'
depthlimit.cpp:4:34: note: in call to 'f(4294967295, 0)'
constexpr unsigned long long k = f(0xffffffff);
^
If we modify the code so that the evaluation is not required to occur at translation time:
constexpr unsigned long long f(unsigned long long n, unsigned long long s=0) {
return n ? f(n-1,s+n) : s;
}
int main(int, char *[]) {
return f(0xffffffff);
}
then both compilers accept it, and generate code which computes the result at runtime. When building with -O0
, this code fails due to stack overflow. When building with -O2
, the compilers' optimizers transform the code to use tail recursion and the code functions correctly (but note that this tail recursion is unrelated to constexpr
evaluation).
回答2:
I don't see why it could not be possible, however it is a quality of implementation detail.
It has been traditional to use memoization for templates for example, so that compilers no longer choke on:
template <size_t N>
struct Fib { static size_t const value = Fib <N-1>::value + Fib<N-2>::value; };
template <>
struct Fib<1> { static size_t const value = 1; }
template <>
struct Fib<0> { static size_t const value = 0; }
but instead memoize the already computed value to reduce the complexity of its evaluation to O(N).
Tail-recursion (and pseudo tail-recursion) are optimizations, and like most optimizations are not subjected to the Standard, so there is no reason it would not be possible. Whether a particular compiler uses it or not, however, is hard to predict.
The Standard says in 5.19 [expr.const]:
2/ A conditional-expression is a core constant expression unless it involves one of the following as a potentially evaluated subexpression (3.2) [...]:
- an invocation of a constexpr function or a constexpr constructor that would exceed the implementationdefined recursion limits (see Annex B);
And reading Annex B:
2/ The limits may constrain quantities that include those described below or others. The bracketed number following each quantity is recommended as the minimum for that quantity. However, these quantities are only guidelines and do not determine compliance.
- Recursive constexpr function invocations [512].
Tail recursion is not brooched.
回答3:
I'm not sure I understand what you're asking. If it concerns whether
the compiler will convert tail recursion into a loop, it's unspecified,
whether the function is a constexpr
or not. If it is whether a
recursive function can be a constexpr
, then I don't think tail
recursion is relevant. If I read the standard correctly:
constexpr unsigned ack( unsigned m, unsigned n )
{
return m == 0
? n + 1
: n == 0
? ack( m - 1, 1 )
: ack( m - 1, ack( m, n - 1 ) );
}
is a valid constexpr (although I expect the compiler will complain about
a lack of resources for all but the smallest n
and m
, at least if
the function is used in a context which requires a constant expression).
回答4:
I've seen GCC perform this optimization. Here's an example:
constexpr unsigned long long fun1(unsigned long long n, unsigned long long sum = 0) {
return (n != 0) ? fun1(n-1,sum+n) : sum;
}
fun1(0xFFFFFFFF);
Works on -O2, crashes otherwise.
Surprisingly, it is also optimizing this:
constexpr unsigned long long fun2(unsigned long long n) {
return (n != 0) ? n + fun2(n-1) : 0;
}
I've check the disassembly of the non-conspexpr form and I can confirm that it is being optimized into a loop.
But not this:
constexpr unsigned long long fun3(unsigned long long n) {
return (n != 0) ? n + fun3(n-1) + fun3(n-1) : 0;
}
So in conclusion, GCC will optimize into a loop the same it does for non-consexpr functions. Use at least -O2 and up.
回答5:
"Tail call" is probably a misnomer to start with. constexpr
functions are much closer to mathematical functions. For mathematical functions, the following two functions are identical:
constexpr unsigned long long fun1(unsigned long long n) {
if (n == 0) return 0 ;
return n + fun1(n-1);
}
constexpr unsigned long long fun2(unsigned long long n) {
if (n != 0) return n + fun2(n-1);
return 0;
}
yet from a procedural programming viewpoint they're definitely not. Only the first would appear to lend itself to tail call optimization.