Is there a straightforward way to extract the exponent from a power of 2 using bitwise operations only?
EDIT: Although the question was originally about bitwise operations, the thread is a good read also if you are wondering "What's the fastest way to find X given Y = 2**X in Python?"
I am currently trying to optimize a routine (Rabin-Miller primality test) that reduces an even number N into the form 2**s * d. I can get the 2**s part by:
    two_power_s = N & -N
but I can't find a way to extract just "s" with a bitwise operation. Workarounds I am currently testing without too much satisfaction (they are all pretty much slow; rough sketches follow the list) are:
- using the logarithm function
- manipulating the binary representation of 2**s (i.e. counting the trailing zeroes)
- looping on a division by 2 until the result is 1
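Roughly, the three approaches look like this (illustrative sketches, not my exact test code):

    import math

    def s_by_log(two_power_s):
        # 1) logarithm function (floating point, so round the result)
        return int(round(math.log(two_power_s, 2)))

    def s_by_string(two_power_s):
        # 2) binary representation: count the trailing zeroes of bin(2**s)
        b = bin(two_power_s)
        return len(b) - len(b.rstrip('0'))

    def s_by_division(two_power_s):
        # 3) loop on a division by 2 until the result is 1
        s = 0
        while two_power_s > 1:
            two_power_s //= 2
            s += 1
        return s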
I am using python, but the answer to this question should be language agnostic, I suppose.
There is a page with a lot of these types of tricks and hacks. It's written for C, but many of them should work in Python too (though the performance will obviously be different). The bit you want is here and onwards.
You could try this for example:
That looks like it could be converted to Python quite easily.
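For instance, the O(lg N) "find the log base 2" trick from that page might come out like this in Python (a sketch for 32-bit values, not a literal translation of any one snippet):

    def log2_32bit(v):
        # Binary search on the bits of a 32-bit value: test progressively
        # smaller masks, shifting the value down and accumulating the shift.
        b = (0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000)
        s = (1, 2, 4, 8, 16)
        r = 0
        for i in range(4, -1, -1):
            if v & b[i]:
                v >>= s[i]
                r |= s[i]
        return r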
"language agnostic" and worrying about performance are pretty much incompatible concepts.
Most modern processors have a CLZ instruction, "count leading zeros". In GCC you can get to it with __builtin_clz(x) (which also produces reasonable, if not the fastest, code for targets that lack clz). Note that this CLZ is undefined for zero, so you'll need an extra branch to catch that case if it matters in your application.
In CELT ( http://celt-codec.org ) the branchless CLZ we use for compilers lacking a CLZ was written by Timothy B. Terriberry:
(The comments indicate that this was found to be faster than a branching version and a lookup table based version)
But if performance is that critical you probably shouldn't be implementing this part of your code in python.
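That said, on the Python side newer interpreters (2.7 / 3.1 and later) expose the same bit-counting information as int.bit_length(), so for an exact power of two the exponent is one method call away. A minimal sketch:

    def exponent_via_bit_length(two_power_s):
        # For an exact power of two, (1 << 10).bit_length() == 11,
        # so the exponent is simply bit_length() - 1.
        return two_power_s.bit_length() - 1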
This is actually a comment to the performance test posted by mac. I post this as an answer to have proper code formatting and indenting.
mac, could you try an unrolled implementation of the bitsearch suggested by Mark Byers? Maybe it's just the array access that slows it down. In theory this approach should be faster than the others.
It would look something like this, although I'm not sure whether the formatting is right for python; I guess you can see what it is supposed to do.
If python shares Java's lack of unsigned integers it would need to be something like this:
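For concreteness, an unrolled version in Python could look like the following (a sketch for 32-bit powers of two; since Python integers are arbitrary precision and effectively unsigned, the Java-style sign-bit variant shouldn't be needed):

    def exponent_unrolled(v):
        # Unrolled binary search over the bit positions of a 32-bit power
        # of two: each test halves the range of possible positions.
        s = 0
        if v & 0xFFFF0000:
            v >>= 16
            s |= 16
        if v & 0xFF00:
            v >>= 8
            s |= 8
        if v & 0xF0:
            v >>= 4
            s |= 4
        if v & 0xC:
            v >>= 2
            s |= 2
        if v & 0x2:
            s |= 1
        return s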
Short answer
As far as python is concerned:
Preliminary notes
All the functions have been timed with timeit.Timer.repeat(testn, cycles), where testn was set to 3 and cycles was automatically adjusted by the script to obtain times in the range of seconds (note: there was a bug in this auto-adjusting mechanism that has been fixed on 18/02/2010).
Results
Each function was timed on func(2**5), func(2**31), func(2**128) and func(2**1024).
Code
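Roughly, the timing call described in the preliminary notes amounts to something like this (a simplified sketch: cycles is a fixed parameter here rather than being auto-adjusted, and func_name stands for whichever candidate is being measured):

    import timeit

    def time_func(func_name, exponent, testn=3, cycles=10000):
        # Time func_name(2**exponent) with timeit.Timer.repeat(testn, cycles)
        # and keep the best of the repeats.
        stmt = '%s(2**%d)' % (func_name, exponent)
        setup = 'from __main__ import %s' % func_name
        return min(timeit.Timer(stmt, setup).repeat(testn, cycles))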
You can do it in O(lg s) time for arbitrary length integers using a binsearch.
For fixed size integers, a lookup table should be the fastest solution, and probably best overall.
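A sketch of the idea for arbitrary-length Python integers (an illustration, not necessarily the exact code that was benchmarked): grow an upper bound for s by doubling, then binary-search within it.

    def exponent_binsearch(two_power_s):
        # Find s such that two_power_s == 2**s, for arbitrarily large ints.
        lo, hi = 0, 1
        while (1 << hi) < two_power_s:
            lo, hi = hi, hi * 2
        while lo < hi:
            mid = (lo + hi) // 2
            if (1 << mid) < two_power_s:
                lo = mid + 1
            else:
                hi = mid
        return lo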
It seems like the range is known. Let's assume it goes up to 1<<20, just to make it more interesting:
So make a list that (in effect) maps an integer to its base 2 logarithm. The following will do the trick:
(This doesn't do anything useful for numbers that aren't powers of two; the problem statement suggests they don't need to be handled. Would be easy enough to fix that though.)
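Concretely, such a table might be built along these lines (a sketch, with max_log2 = 20 as assumed above):

    max_log2 = 20

    # table[x] is the base-2 logarithm of x for every power of two
    # up to 1 << max_log2; all other entries are left as 0.
    table = [0] * ((1 << max_log2) + 1)
    for s in range(max_log2 + 1):
        table[1 << s] = s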
The function to get the logarithm is very simple, and could easily be inlined:
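Presumably something as simple as this (a sketch using the table built above):

    def log2_lookup(two_power_s):
        # A single list indexing operation; valid for powers of two
        # up to 1 << max_log2.
        return table[two_power_s]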
I can't guarantee that the test code I wrote is exactly the same as the one being used to get the example timings, but this is rather quicker than the stringcount code.
Since all the values in the table are less than 256, I wondered whether using a string instead of a list would be quicker, or maybe an array.array of bytes, but no dice.
Using a dict to do the lookup is another possibility, taking advantage of the way only powers of two are being checked. The results for this weren't as good, though.
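For illustration, the dict variant might be built like this (a sketch, reusing max_log2 from above):

    # Only powers of two are ever looked up, so the dict needs just
    # max_log2 + 1 entries instead of a dense table.
    log2_dict = dict((1 << s, s) for s in range(max_log2 + 1))

    def log2_dict_lookup(two_power_s):
        return log2_dict[two_power_s]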
And just for fun one could also use the hex method on float objects to get a string that includes (as its last part) the base 2 exponent of the number. This is a bit slow to extract in general, but if the exponent is only ever going to be one digit it could be done straightforwardly enough. This is purely for entertainment value though, as it was uncompetitive -- though, amazingly, still quicker than the bitwise approach.
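One way that extraction could look (a sketch using rpartition, which also copes with multi-digit exponents):

    def log2_float_hex(two_power_s):
        # float.hex() gives e.g. '0x1.0000000000000p+5' for 32.0;
        # the part after 'p+' is the base-2 exponent.  Only usable while
        # the value converts exactly to a float.
        return int(float(two_power_s).hex().rpartition('p+')[2])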
So it looks like using a list is the way to go.
(This approach won't scale indefinitely, due to limited memory, but to make up for that the execution speed won't depend on max_log2, or the input values, in any way that you'll notice when running python code. Regarding the memory consumption, if I remember my python internals correctly, the table will take up about (1<<max_log2)*4 bytes, because the contents are all small integers that the interpreter will intern automatically. So, when max_log2 is 20, that's about 4MB.)