How can compile time be (exponentially) faster than run time?

Published 2019-02-02 02:38

Question:

The below code calculates Fibonacci numbers by an exponentially slow algorithm:

#include <cstdlib>
#include <iostream>

#define DEBUG(var) { std::cout << #var << ": " << (var) << std::endl; }

constexpr auto fib(const size_t n) -> long long
{
    return n < 2 ? 1 : fib(n - 1) + fib(n - 2);
}

int main(int argc, char *argv[])
{
    const long long fib91 = fib(91);

    DEBUG( fib91 );
    DEBUG( fib(45) );

    return EXIT_SUCCESS;
}

I am calculating the 45th Fibonacci number at run time, and the 91st one at compile time.

The interesting fact is that GCC 4.9 compiles the code and computes fib91 in a fraction of a second, but it takes a while to print fib(45) at run time.

My question: If GCC is smart enough to optimize the fib(91) computation and not take the exponentially slow path, what stops it from doing the same for fib(45)?

Does the above mean GCC produces two compiled versions of fib function where one is fast and the other exponentially slow?

The question is not how the compiler optimizes the fib(91) calculation (yes, it does use a sort of memoization), but: if it knows how to optimize the fib function, why does it not do the same for fib(45)? And are there two separate compiled versions of the fib function, one slow and one fast?

Answer 1:

GCC is likely memoizing constexpr functions (enabling a Θ(n) computation of fib(n)). That is safe for the compiler to do because constexpr functions are purely functional.

Compare the Θ(n) "compiler algorithm" (using memoization) to your Θ(φⁿ) run-time algorithm (where φ is the golden ratio), and suddenly it makes perfect sense that the compiler is so much faster.

From the constexpr page on cppreference (emphasis added):

The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time.

The constexpr specifier does not declare that it is required to evaluate the value of the function or variable at compile time. So one can only guess what heuristics GCC uses to choose between compile-time and run-time evaluation when a compile-time computation is not required by the language rules. It can choose either, on a case-by-case basis, and still be correct.

If you want to force the compiler to evaluate your constexpr function at compile time, here's a simple trick that will do it.

#include <cstddef>

constexpr auto compute_fib(const std::size_t n) -> long long
{
    return n < 2 ? n : compute_fib(n - 1) + compute_fib(n - 2);
}

template <std::size_t N>
struct fib
{
    // F(93) overflows a signed 64-bit long long.
    static_assert(N <= 92, "fib(N) overflows long long for N > 92.");
    static const long long value = compute_fib(N);
};

In the rest of your code you can then access fib<45>::value or fib<91>::value with the guarantee that they'll be evaluated at compile time.



Answer 2:

At compile time the compiler can memoize the result of the function. This is safe, because the function is constexpr and hence will always return the same result for the same inputs.

At run time it could in theory do the same. However, most C++ programmers would frown at optimization passes that result in hidden memory allocations.
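For contrast, here is a sketch of what such memoization looks like when the programmer writes it explicitly (fib_memo is a hypothetical helper, not something the compiler emits); the cache allocation that an automatic pass would hide is visible here:

```cpp
#include <cstddef>
#include <unordered_map>

// Hand-written memoized version with the same base case as the
// question's fib (fib(0) == fib(1) == 1). The static cache makes the
// "hidden allocation" explicit and visible to the programmer.
long long fib_memo(std::size_t n)
{
    static std::unordered_map<std::size_t, long long> cache;
    if (n < 2) return 1;
    auto it = cache.find(n);
    if (it != cache.end()) return it->second;  // cache hit: O(1)
    long long result = fib_memo(n - 1) + fib_memo(n - 2);
    cache.emplace(n, result);
    return result;
}
```

With the cache, fib_memo(45) performs a linear number of recursive calls instead of an exponential one.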



Answer 3:

When you ask for fib(91) to initialize your const fib91 in the source code, the compiler is forced to compute that value from your constant expression. It does not run the compiled function (as you seem to think); it just sees that to compute fib91 it needs fib(90) and fib(89), to compute those it needs fib(88) and fib(87), and so on until it reaches fib(1) and fib(0), which are given. Since intermediate results are reused, this is an O(n) computation and the result is produced quickly.

However, when you ask to evaluate fib(45) at run time, the compiler has to choose between calling the actual function and precomputing the result. Eventually it decides to use the compiled function. Now, the compiled function must execute exactly the exponential algorithm that you wrote: there is no way the compiler could transparently add memoization to a recursive function (think about the need to allocate a cache, decide how many values to keep, and manage them between function calls).
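If you want the run-time computation to be fast without relying on the compiler, you can write the linear algorithm by hand. A minimal sketch (fib_iter is a hypothetical name, matching the question's base case fib(0) == fib(1) == 1):

```cpp
#include <cstddef>

// Iterative O(n) Fibonacci: no cache allocation, just two rolling
// variables holding the previous two values.
long long fib_iter(std::size_t n)
{
    long long a = 1, b = 1;  // fib(0), fib(1)
    for (std::size_t i = 2; i <= n; ++i) {
        long long next = a + b;
        a = b;
        b = next;
    }
    return b;
}
```

This runs in microseconds for n = 45, with no recursion and no hidden state.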