I am currently working on a project, where every cycle counts. While profiling my application I discovered that the overhead of some inner loop is quite high, because they consist of just a few machine instruction. Additionally the number of iterations in these loops is known at compile time.
So I thought instead of manually unrolling the loop with copy & paste I could use macros to unroll the loop at compile time so that it can be easily modified later.
What I image is something like this:
#define LOOP_N_TIMES(N, CODE) <insert magic here>
So that I can replace for (int i = 0; i < N, ++i) { do_stuff(); }
with:
#define INNER_LOOP_COUNT 4
LOOP_N_TIMES(INNER_LOOP_COUNT, do_stuff();)
And it unrolls itself to:
do_stuff(); do_stuff(); do_stuff(); do_stuff();
Since the C preprocessor is still a mystery to me most of the time, I have no idea how to accomplish this, but I know it must be possible because Boost seems to have a BOOST_PP_REPEAT
macros. Unfortunately I can't use Boost for this project.
You can use templates to unroll.
See the disassembly for the sample Live on Godbolt
But -funroll-loops
has the same effect for this sample.
Live On Coliru
template <unsigned N> struct faux_unroll {
template <typename F> static void call(F const& f) {
f();
faux_unroll<N-1>::call(f);
}
};
template <> struct faux_unroll<0u> {
template <typename F> static void call(F const&) {}
};
#include <iostream>
#include <cstdlib>
int main() {
srand(time(0));
double r = 0;
faux_unroll<10>::call([&] { r += 1.0/rand(); });
std::cout << r;
}
You can use the pre-processor and play some tricks with token concatenation and multiple macro expansion, but you have to hard-code all possibilities:
#define M_REPEAT_1(X) X
#define M_REPEAT_2(X) X X
#define M_REPEAT_3(X) X X X
#define M_REPEAT_4(X) X X X X
#define M_REPEAT_5(X) X M_REPEAT_4(X)
#define M_REPEAT_6(X) M_REPEAT_3(X) M_REPEAT_3(X)
#define M_EXPAND(...) __VA_ARGS__
#define M_REPEAT__(N, X) M_EXPAND(M_REPEAT_ ## N)(X)
#define M_REPEAT_(N, X) M_REPEAT__(N, X)
#define M_REPEAT(N, X) M_REPEAT_(M_EXPAND(N), X)
And then expand it like this:
#define THREE 3
M_REPEAT(THREE, three();)
M_REPEAT(4, four();)
M_REPEAT(5, five();)
M_REPEAT(6, six();)
This method requires literal numbers as counts, you can't do something like this:
#define COUNT (N + 1)
M_REPEAT(COUNT, stuff();)
There's no standard way of doing this.
Here's a slightly bonkers approach:
#define DO_THING printf("Shake it, Baby\n")
#define DO_THING_2 DO_THING; DO_THING
#define DO_THING_4 DO_THING_2; DO_THING_2
#define DO_THING_8 DO_THING_4; DO_THING_4
#define DO_THING_16 DO_THING_8; DO_THING_8
//And so on. Max loop size increases exponentially. But so does code size if you use them.
void do_thing_25_times(void){
//Binary for 25 is 11001
DO_THING_16;//ONE
DO_THING_8;//ONE
//ZERO
//ZERO
DO_THING;//ONE
}
It's not too much to ask of an optimizer to eliminate dead code.
In which case:
#define DO_THING_N(N) if(((N)&1)!=0){DO_THING;}\
if(((N)&2)!=0){DO_THING_2;}\
if(((N)&4)!=0){DO_THING_4;}\
if(((N)&8)!=0){DO_THING_8;}\
if(((N)&16)!=0){DO_THING_16;}
You can't use a #define construct to calculate the "unroll-count". But with sufficient macros you can define this:
#define LOOP1(a) a
#define LOOP2(a) a LOOP1(a)
#define LOOP3(a) a LOOP2(a)
#define LOOPN(n,a) LOOP##n(a)
int main(void)
{
LOOPN(3,printf("hello,world"););
}
Tested with VC2012
You can't write real recursive statements with macros and I'm pretty sure you can't have real iteration in macros as well.
However you can take a look at Order. Although it is entirely built atop the C preprocessor it "implements" iteration-like functionalities. It actually can have up-to-N iterations, where N is some large number. I'm guessing it's similar for "recursive" macros. Any way, it is such a borderline case that few compilers support it (GCC is one of them, though).