static pre-calculation optimization in clang

Posted 2019-07-19 06:25

Question:

I have

#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <stdio.h>
#include <math.h>

int fib(int n) {
    return n < 2 ? n : fib(n-1) + fib(n-2);
}

double clock_now()
{
    struct timeval now;

    gettimeofday(&now, NULL);
    return (double)now.tv_sec + (double)now.tv_usec/1.0e6;
}

#define NITER 5

and in my main(), I'm doing a simple benchmark like this:

printf("hi\n");
double t = clock_now();
int f = 0;
double tmin = INFINITY;
for (int i=0; i<NITER; ++i) {
    printf("run %i, %f\n", i, clock_now()-t);
    t = clock_now();
    f += fib(40);
    t = clock_now()-t;
    printf("%i %f\n", f, t);
    if (t < tmin) tmin = t;
    t = clock_now();
}
printf("fib,%.6f\n", tmin*1000);

When I compile with clang -O3 (LLVM 5.0 from Xcode 5.0.1), it always prints a time of zero, except for the startup delay before the first loop iteration, i.e. this:

hi
run 0, 0.866536
102334155 0.000000
run 1, 0.000001
204668310 0.000000
run 2, 0.000000
307002465 0.000000
run 3, 0.000000
409336620 0.000000
run 4, 0.000001
511670775 0.000000
fib,0.000000

It seems that it statically precalculates fib(40) and stores the result somewhere. Right? And the strange lag at the beginning (0.8 secs) is probably because it loads that cache?

I'm doing this for benchmarking. The C compiler should optimize fib() itself as much as it can. However, I don't want it to precalculate the result at compile time. So basically I want all code optimized as heavily as possible, just not main() (or at least not with this specific optimization). Can I do that somehow?

What optimization is this in this specific case, anyway? It's somewhat surprising, and quite nice.

Answer 1:

I found a solution by marking certain data volatile. Specifically, what I did was:

volatile int f = 0;
//...
    volatile int FibArg = 40;
    f += fib(FibArg);

That way, the compiler is forced to actually read FibArg when calling the function and cannot assume it is a constant. Thus it has to call the function at run time to calculate the result.
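Putting it together, a complete version of the benchmark with those two volatile qualifiers applied might look like this (just a sketch based on the code from the question; the only changes are the two volatile declarations):

#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <stdio.h>
#include <math.h>

int fib(int n) {
    return n < 2 ? n : fib(n-1) + fib(n-2);
}

double clock_now()
{
    struct timeval now;

    gettimeofday(&now, NULL);
    return (double)now.tv_sec + (double)now.tv_usec/1.0e6;
}

#define NITER 5

int main()
{
    printf("hi\n");
    double t = clock_now();
    volatile int f = 0;              /* result must be kept and re-read */
    double tmin = INFINITY;
    for (int i=0; i<NITER; ++i) {
        printf("run %i, %f\n", i, clock_now()-t);
        t = clock_now();
        volatile int FibArg = 40;    /* argument must be loaded at run time */
        f += fib(FibArg);
        t = clock_now()-t;
        printf("%i %f\n", f, t);
        if (t < tmin) tmin = t;
        t = clock_now();
    }
    printf("fib,%.6f\n", tmin*1000);
    return 0;
}

Compiled with clang -O3, fib() itself is still optimized as aggressively as the compiler likes, but the call can no longer be folded into a constant, so each iteration should report a real, non-zero time.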

The volatile int f was not necessary for my compiler at the moment, but it might become necessary in the future, once the compiler figures out that fib has no side effects and that neither its result nor f is ever used.


Note that this is still not the end. A future compiler could be advanced enough to guess that 40 is a likely argument for fib. Maybe it builds a database of likely values and, for the most likely ones, generates a small cache. When fib is called, it then does a fast runtime check to see whether that value is already cached. Of course, the runtime check adds some overhead, but the compiler might estimate that for some particular code this overhead is minor compared to the speed gained by the cache.
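To make that idea concrete, here is a purely hypothetical, hand-written sketch of what such a generated value cache around fib() could look like. No current compiler is claimed to emit anything like this; the names, the number of slots, and the slot choice are made up for illustration only:

#include <stdbool.h>

int fib(int n);   /* the plain recursive fib from the question */

/* Hypothetical: a tiny cache for "likely" arguments of fib(),
 * filled lazily the first time such an argument shows up. */
#define FIB_CACHE_SLOTS 4

static int  fib_cache_arg[FIB_CACHE_SLOTS];
static int  fib_cache_val[FIB_CACHE_SLOTS];
static bool fib_cache_set[FIB_CACHE_SLOTS];

static int fib_cached(int n) {
    int slot = (unsigned)n % FIB_CACHE_SLOTS;      /* the fast runtime check ... */
    if (fib_cache_set[slot] && fib_cache_arg[slot] == n)
        return fib_cache_val[slot];                /* ... hit: skip the recursion */

    int v = fib(n);                                /* miss: compute as usual */
    fib_cache_arg[slot] = n;
    fib_cache_val[slot] = v;
    fib_cache_set[slot] = true;
    return v;
}

The check costs only a comparison and a branch per call, which is exactly the kind of overhead the compiler would have to weigh against the cost of recomputing fib(40) from scratch.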

I'm not sure whether a compiler will ever do such an optimization, but it could. Profile Guided Optimization (PGO) already goes in that direction.
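As a closing aside, unrelated to PGO but another common answer to the original question: for micro-benchmarks, people often keep the compiler from folding the call simply by taking the argument from the command line, so its value cannot be known at compile time. A minimal sketch (again using the fib from the question):

#include <stdio.h>
#include <stdlib.h>

int fib(int n) {
    return n < 2 ? n : fib(n-1) + fib(n-2);
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <n>\n", argv[0]);
        return 1;
    }
    int n = atoi(argv[1]);               /* only known at run time, e.g. ./a.out 40 */
    printf("fib(%d) = %d\n", n, fib(n));
    return 0;
}

Since n comes from argv, no amount of compile-time analysis can replace fib(n) with a constant, while fib() itself remains fully optimized.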