For example: I have a function consisting of a while loop (this one would check for primes)
function isprime(int number){
int i=2;
int max= (int) sqrt(number)+1;
while(i<max){
if(number%i==0){
return 0;
}
i++;
}
return 1;
}
I know this would be a very poor algorithm for testing primes, but my question focuses more on loops anyway. Currently a forth of the function's operations are "just" checking the condition. For larger numbers, this can be an awful lot.
(Quick Example 1: If "number" is 1009, that would be checking the while condition 30 times, 30 operations for the index, and 29*2 operations for the if condition. That's 118 operations)
I realize that I can just cut&paste within the while condition and having the the index pass the maximum, while resulting in additional operations, doesn't falsify the return value. So if I cut everything starting from "if..." to "i++;" and paste it three (or n) times, checking the while condition would only take up 1/10 (or 1/(1+3n) of the operations, while creating a maximum of +2*3(or +(n-1)*3) unnecessary operations.
(Quick Example 2: If "number" is 1009, that would mean checking the while condition 11 times, 33 operations for the index, and 33*2 operations for the if condition. That's 100 operations, 13 less)
As I am currently experimenting with very big numbers (in layman's terms: "the condition will be false for a very, very, very long time") so pasting the if condition and the increment thousands of times would be useful, but very impractical - so my question is:
Is there a tool (or a technique I am missing) to do that for me, but keeps the code clearly and easy to modify?
Thanks in advance!
There is a weird snippet called Duff's Device which is applicable to the case where you know in advance how many iterations should occur.
So for example, let's suppose you have this loop:
With Duff's Device, you can transform it to only check every 4 iterations like so:
It thus combines manual unrolling with compensating for the fact that the number of iterations might not be a multiple of the number of times you unroll. There are more explanations here.
Though personally, I would advise using a slightly clearer format:
Note that you can process the leftover either before or after actually, it does not matter.
With that being said, Loop Unrolling is a basic compiler optimization; so before using such a weird implement, do check whether your compiler would not be already doing it for you...
Your question is a bit unclear.
First, you could change slightly your algorithm; e.g. incrementing by 2, not by 1 (since every prime number above 2 is odd).
Then, when asked to optimize (e.g. with
g++ -Wall -O2
), a compiler generally do some loop unrolling, as if it "copied" (and constant-folded) a loop's body several times.With GCC, you might help the optimizer by e.g. using __builtin_expect, e.g. with
and then you'll code
At last, manually optimizing your code might not worth the pain, you should benchmark. (The CPU cache is very important today, so unrolling too much might slow down the code, see also
__builtin_prefetch
in GCC and this).You could also consider meta-programming and multi-staged programming facilities; in languages like Meta Ocaml or Common Lisp meta-programming means much more than C++11
template
things. You could consider generating C code at runtime (thendlopen
ing it), or using JIT compilation libraries likelibjit
(e.g. do partial evaluation). Read also J.Pitrat's blog about meta-combinatorial search, and wikipage on eval. My MELT system shows that these techniques can (painfully) be used in relation with C++ (MELT generates C++ code at runtime so do meta-programming this way).