[There are a few questions on this but none of the answers are particularly definitive and several are out of date with the current C++ standard].
My research shows these are the principal methods used to check if a floating point value f can be converted to an integral type T:
1. if (f >= std::numeric_limits<T>::min() && f <= std::numeric_limits<T>::max() && f == (T)f)
2. using std::fmod to extract the remainder and testing equality to 0.
3. using std::remainder and testing equality to 0.
The first test assumes that a cast from f to a T instance is defined. Not true for std::int64_t to float, for example.
With C++11, which one is best? Is there a better way?
Conclusion:
The answer is to use std::trunc(f) == f. The time difference is insignificant when comparing all these methods. Even if the specific IEEE unwinding code we write in the example below is technically twice as fast, we are only talking about 1 nanosecond faster. The maintenance costs in the long run, though, would be significantly higher. So a solution that is easier for the maintainer to read and understand is better.
Time in microseconds to complete 12,000,000 operations on a random set of numbers:
std::trunc(f) == f               32
std::floor(val) - val == 0       35
((uint64_t)f - f) == 0.0         38
std::fmod(val, 1.0) == 0         87
The working out of the conclusion.
A floating point number has two parts: a mantissa (the data part of the value) and an exponent (a power of two to multiply it by), so that value = mantissa * 2^exponent.
So the exponent is basically how many binary digits we are going to shift the "binary point" along the mantissa. A positive value shifts it right; a negative value shifts it left. If all the digits to the right of the binary point are zero then we have an integer.
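To make the role of the exponent concrete, here is a small sketch built on std::ilogb from <cmath>, which returns the unbiased binary exponent of its argument. The wrapper name binary_exponent is hypothetical and exists only for illustration.

```cpp
#include <cmath>

// Hypothetical helper: the unbiased binary exponent of x, i.e. how many
// places the binary point sits to the right (positive) or left (negative)
// of the leading 1 bit.
inline int binary_exponent(double x) {
    return std::ilogb(x);
}
```

For example, 6.5 is 110.1 in binary, or 1.101 x 2^2, so binary_exponent(6.5) is 2 and one fractional bit remains to the right of the point. 0.75 is 0.11 in binary, or 1.1 x 2^-1, so binary_exponent(0.75) is -1 and the value cannot be a whole number.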
If we assume IEEE 754, we should note that in this representation the value is normalized so that the most significant bit in the mantissa is shifted to be 1. Since this bit is always set it is not actually stored (the processor knows it's there and compensates accordingly).
So:
- If the exponent < 0 then you definitely do not have an integer, as it can only be representing a fractional value (zero, which is stored with an all-zero exponent field, is the one exception).
- If the exponent >= <Number of bits In Mantissa> then there is definitely no fractional part and it is an integer (though you may not be able to hold it in an int).
- Otherwise we have to do some work. If the exponent >= 0 && exponent < <Number of bits In Mantissa> then you may be representing an integer if the mantissa is all zero in the bottom half (defined below).
Additionally, as part of the normalization 127 is added to the exponent (so that there are no negative values stored in the 8 bit exponent field).
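The three cases above can be sketched for a 32-bit float as follows. This is an illustrative reconstruction, not the answer's original code: the function name is_whole_ieee754 is hypothetical, it assumes float is an IEEE 754 binary32 value (23 stored mantissa bits, 8 exponent bits, bias 127), and the handling of zero, subnormals, infinity and NaN is an addition that the text above does not spell out.

```cpp
#include <cstdint>
#include <cstring>

// Sketch only: assumes float is a 32-bit IEEE 754 binary32 value.
inline bool is_whole_ieee754(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);   // well-defined way to read the bit pattern

    const std::uint32_t mantissa = bits & 0x007FFFFFu;                  // 23 stored bits
    const int stored_exponent = static_cast<int>((bits >> 23) & 0xFFu); // 8-bit field

    if (stored_exponent == 0xFF) return false;        // infinity or NaN (added assumption)
    if (stored_exponent == 0) return mantissa == 0;   // +/-0 is whole, subnormals are not

    const int exponent = stored_exponent - 127;       // remove the bias
    if (exponent < 0) return false;                   // magnitude is below 1
    if (exponent >= 23) return true;                  // no fractional bits remain

    // The low (23 - exponent) mantissa bits lie to the right of the binary point.
    const std::uint32_t fraction_mask = (1u << (23 - exponent)) - 1u;
    return (mantissa & fraction_mask) == 0;
}
```

With this sketch, 6.0f (1.1 x 2^2 in binary) has only zero bits below the binary point and is reported as whole, while 6.5f (1.101 x 2^2) is not.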
Results:
Running Some Timing tests.
Test data was generated like this:
Modified main() to run tests:
The tests show:
The IEEE code (in this simple test) seems to beat the truncate method and generates the same result. BUT the amount of time is insignificant: over 12 million calls we saw a difference of 14 milliseconds.
I'd go deep into the IEEE 754 standard and keep thinking only in terms of this type, and I'll be assuming 64-bit integers and doubles.
The number is a whole number iff:
I made the following function:
As a test of this function:
The result is the following:
The condition in the method is not very clear, thus I'm posting the less obfuscated version with commented if/else structure.
Personally I would recommend using the trunc function introduced in C++11 to check if f is integral, i.e. testing std::trunc(f) == f.
It involves no casting and no floating point arithmetic, both of which can be a source of error. The truncation of the decimal places can surely be done without introducing a numerical error by setting the corresponding bits of the mantissa to zero, at least if the floating point values are represented according to the IEEE 754 standard.
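A minimal sketch of the trunc-based check might look like this. The name is_integral_value is hypothetical, and the static_assert is only there to reject non-floating-point arguments at compile time.

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical helper: true when f has no fractional part.
template <typename F>
bool is_integral_value(F f) {
    static_assert(std::is_floating_point<F>::value,
                  "F must be a floating point type");
    // NaN compares unequal to everything, so it yields false.
    // +/-infinity truncates to itself, so it yields true.
    return std::trunc(f) == f;
}
```

Because infinity passes this test, a caller who only wants finite whole numbers would probably add a std::isfinite(f) check as well. Note that this says nothing yet about whether the value fits into a particular integer type.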
Personally I would hesitate to use fmod or remainder for checking whether f is integral, because I am not sure whether the result can underflow to zero and thus fake an integral value. In any case it is easier to show that trunc works without numerical error.
None of the three above methods actually checks whether the floating point number f can be represented as a value of type T. An extra check is necessary.
The first option actually does exactly that: it checks whether
f is integral and can be represented as a value of type T. It does so by evaluating f == (T)f. This check involves a cast. Such a cast is undefined according to §1 in section 4.9 of the C++11 standard "if the truncated value cannot be represented in the destination type". Thus if f is, e.g., greater than or equal to std::numeric_limits<T>::max()+1, the cast will certainly have undefined behavior.
That is probably why the first option has an additional range check (
f >= std::numeric_limits<T>::min() && f <= std::numeric_limits<T>::max()) before performing the cast. This range check could also be used for the other methods (trunc, fmod, remainder) in order to determine whether f can be represented as a value of type T. However, the check is flawed since it can run into undefined behavior: in this check the limits std::numeric_limits<T>::min/max() get converted to the floating point type for applying the comparison operators. For example, if T = uint32_t and f is a float, std::numeric_limits<T>::max() is not representable as a floating point number. The C++11 standard then states in section 4.9 §2 that the implementation is free to choose the next lower or higher representable value. If it chooses the higher representable value and f happens to be equal to that value, the subsequent cast is undefined according to §1 in section 4.9, since the (truncated) value cannot be represented in the destination type (uint32_t).
Consequently, the first option would establish that f is integral and representable as uint32_t even though it is not.
Fixing the range check in general is not easy. The fact that signed integers and floating point numbers do not have a fixed representation (such as two's complement or IEEE 754) according to the standard does not make things easier. One possibility is to write non-portable code for the specific compiler, architecture and types you use. A more portable solution is to use Boost's NumericConversion library:
Then you can finally perform the cast safely:
Here is what I would try:
First of all, I want to see if I got your question right. From what I've read, it seems that you want to determine if a floating-point is actually simply a representation of an integral type in floating-point.
As far as I know, performing == on a floating-point value is not safe due to floating-point inaccuracies. Therefore I am proposing the following solution.
The idea is to simply find the absolute difference between the original floating-point value and the same value cast to the integral type, and then determine if it is smaller than the epsilon of the floating-point type. I'm assuming here that if it is smaller than epsilon, the difference is of no importance to us.
Thank you for reading.
Use modf(), which breaks the value into integral and fractional parts. From this direct test, it is known if the double is a whole number or not. After this, limit tests against the min/max of the target integer type can be done.
Note modf() differs from the similarly named fmod().
Of the 3 methods OP posted, the cast to/from an integer may perform a fair amount of work doing the casts and compare. The other 2 are marginally the same. They work, assuming no unexpected rounding mode effects from dividing by 1.0. But they do an unnecessary divide.
Which is fastest likely depends on the mix of doubles used.
OP's first method has a singular advantage: since the goal is to test if a FP value may convert exactly to some integer, and likely then, if the result is true, the conversion needs to occur, OP's first method has already done the conversion.
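The modf() approach with a subsequent limit test could be sketched as below. The function name to_integer_exact and its out-parameter are hypothetical. Instead of comparing against std::numeric_limits<T>::max(), which may round when converted to double, this sketch compares against exact powers of two built with std::ldexp. It assumes a binary floating point type and, for signed T, a two's complement range where min() equals -2^digits.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: returns true and stores the converted value in out
// only when f is a whole number that fits into T.
template <typename T>
bool to_integer_exact(double f, T& out) {
    static_assert(std::is_integral<T>::value, "T must be an integer type");

    double whole;
    // A nonzero fractional part means f is not whole. NaN also ends up here,
    // because modf returns NaN and NaN != 0.0 is true.
    if (std::modf(f, &whole) != 0.0) return false;

    // 2^digits equals max() + 1 and is exactly representable as a double.
    const double upper = std::ldexp(1.0, std::numeric_limits<T>::digits);
    // Assumes two's complement for signed types, i.e. min() == -2^digits.
    const double lower = std::is_signed<T>::value ? -upper : 0.0;

    if (!(whole >= lower && whole < upper)) return false;  // also rejects infinity

    out = static_cast<T>(whole);   // in range, so the cast is well defined
    return true;
}
```

With T = std::uint32_t the upper bound is exactly 2^32, so 4294967295.0 converts and 4294967296.0 is rejected, with no dependence on how max() would round when converted to a floating point type.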