I'm looking for the fastest way to determine if a long
value is a perfect square (i.e. its square root is another integer):
- I've done it the easy way, by using the built-in
Math.sqrt()
function, but I'm wondering if there is a way to do it faster by restricting yourself to integer-only domain. - Maintaining a lookup table is impractical (since there are about 231.5 integers whose square is less than 263).
Here is the very simple and straightforward way I'm doing it now:
public final static boolean isPerfectSquare(long n)
{
if (n < 0)
return false;
long tst = (long)(Math.sqrt(n) + 0.5);
return tst*tst == n;
}
Note: I'm using this function in many Project Euler problems. So no one else will ever have to maintain this code. And this kind of micro-optimization could actually make a difference, since part of the challenge is to do every algorithm in less than a minute, and this function will need to be called millions of times in some problems.
I've tried the different solutions to the problem:
- After exhaustive testing, I found that adding
0.5
to the result of Math.sqrt() is not necessary, at least not on my machine. - The fast inverse square root was faster, but it gave incorrect results for n >= 410881. However, as suggested by BobbyShaftoe, we can use the FISR hack for n < 410881.
- Newton's method was a good bit slower than
Math.sqrt()
. This is probably becauseMath.sqrt()
uses something similar to Newton's Method, but implemented in the hardware so it's much faster than in Java. Also, Newton's Method still required use of doubles. - A modified Newton's method, which used a few tricks so that only integer math was involved, required some hacks to avoid overflow (I want this function to work with all positive 64-bit signed integers), and it was still slower than
Math.sqrt()
. - Binary chop was even slower. This makes sense because the binary chop will on average require 16 passes to find the square root of a 64-bit number.
- According to John's tests, using
or
statements is faster in C++ than using aswitch
, but in Java and C# there appears to be no difference betweenor
andswitch
. - I also tried making a lookup table (as a private static array of 64 boolean values). Then instead of either switch or
or
statement, I would just sayif(lookup[(int)(n&0x3F)]) { test } else return false;
. To my surprise, this was (just slightly) slower. This is because array bounds are checked in Java.
Project Euler is mentioned in the tags and many of the problems in it require checking numbers >> 2^64. Most of the optimizations mentioned above don't work easily when you are working with an 80 byte buffer.
I used java BigInteger and a slightly modified version of Newton's method, one that works better with integers. The problem was that exact squares n^2 converged to (n-1) instead of n because n^2-1 = (n-1)(n+1) and the final error was just one step below the final divisor and the algorithm terminated. It was easy to fix by adding one to the original argument before computing the error. (Add two for cube roots, etc.)
One nice attribute of this algorithm is that you can immediately tell if the number is a perfect square - the final error (not correction) in Newton's method will be zero. A simple modification also lets you quickly calculate floor(sqrt(x)) instead of the closest integer. This is handy with several Euler problems.
It ought to be possible to pack the 'cannot be a perfect square if the last X digits are N' much more efficiently than that! I'll use java 32 bit ints, and produce enough data to check the last 16 bits of the number - that's 2048 hexadecimal int values.
...
Ok. Either I have run into some number theory that is a little beyond me, or there is a bug in my code. In any case, here is the code:
and here are the results:
(ed: elided for poor performance in prettify.js; view revision history to see.)
I don't know if this has been mentioned before. But i found a solution here:
The following simplification of maaartinus's solution appears to shave a few percentage points off the runtime, but I'm not good enough at benchmarking to produce a benchmark I can trust:
It would be worth checking how omitting the first test,
would affect performance.
Regarding the Carmac method, it seems like it would be quite easy just to iterate once more, which should double the number of digits of accuracy. It is, after all, an extremely truncated iterative method -- Newton's, with a very good first guess.
Regarding your current best, I see two micro-optimizations:
I.e:
Even better might be a simple
Obviously, it would be interesting to know how many numbers get culled at each checkpoint -- I rather doubt the checks are truly independent, which makes things tricky.
"I'm looking for the fastest way to determine if a long value is a perfect square (i.e. its square root is another integer)."
The answers are impressive, but I failed to see a simple check :
check whether the first number on the right of the long it a member of the set (0,1,4,5,6,9) . If it is not, then it cannot possibly be a 'perfect square' .
eg.
4567 - cannot be a perfect square.