I am currently working on a project where every cycle counts. While profiling my application I discovered that the overhead of some inner loops is quite high, because their bodies consist of just a few machine instructions. Additionally, the number of iterations in these loops is known at compile time.
So I thought instead of manually unrolling the loop with copy & paste I could use macros to unroll the loop at compile time so that it can be easily modified later.
What I imagine is something like this:
#define LOOP_N_TIMES(N, CODE) <insert magic here>
So that I can replace for (int i = 0; i < N; ++i) { do_stuff(); }
with:
#define INNER_LOOP_COUNT 4
LOOP_N_TIMES(INNER_LOOP_COUNT, do_stuff();)
And it unrolls itself to:
do_stuff(); do_stuff(); do_stuff(); do_stuff();
Since the C preprocessor is still a mystery to me most of the time, I have no idea how to accomplish this, but I know it must be possible because Boost seems to have a BOOST_PP_REPEAT macro. Unfortunately I can't use Boost for this project.
You can use the pre-processor and play some tricks with token concatenation and multiple macro expansion, but you have to hard-code all possibilities:
And then expand it like this:
This method requires literal numbers as counts; you can't do something like this:
There's no standard way of doing this.
Here's a slightly bonkers approach:
It's not too much to ask of an optimizer to eliminate dead code. In which case:
You can't use a #define construct to calculate the "unroll-count". But with sufficient macros you can define this:
Tested with VC2012
You can use templates to unroll. See the disassembly for the sample (Live on Godbolt).
But -funroll-loops has the same effect for this sample (Live On Coliru).
You can't write truly recursive macros, and I'm pretty sure you can't have real iteration in macros either.
However, you can take a look at Order. Although it is built entirely on top of the C preprocessor, it "implements" iteration-like functionality. It can actually run up to N iterations, where N is some large number. I'm guessing it's similar for "recursive" macros. Anyway, it is such a borderline case that few compilers support it (GCC is one of them, though).