This question already has an answer here:
From what I understood about strict aliasing rule, this code for fast inverse square root will result in undefined behavior in C++:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // type punning
i = 0x5f3759df - ( i >> 1 );
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) );
return y;
}
Does this code indeed cause UB? If yes, how can it be reimplemented in standard compliant way? If not, why not?
Assumptions: before calling this function we have somehow checked that floats are in IEEE 754 32-bit format, sizeof(long)==sizeof(float)
and the platform is little-endian.
I don't think you can do this without UB in the strict standardese sense, simply because it relies on things like a particular value representation of
float
andlong
. The standard does not specify these (on purpose), and to give implementations the freedom to do what they see fit in this regard, specifically applies the "UB" label to all behaviour which would depend on such things.That doesn't mean this will be a "Heisenbug" or "nasal demons" type of UB; most likely, the implementation will provide definitions for this behaviour. As long as these definitions satisfy the preconditions of the code, it's fine. The preconditions here being:
sizeof(float) == sizeof(long)
.float
is such that if the bits are interpreted as along
,0x5f3759df
and>> 1
have the meaning reuqired for this hack to function.However, it is impossible to re-write this in a fully portable way (i.e. without UB) - what would you do about platforms which use a different implementation of floating-point arithmetic, for example. As long as you're relying a particular platform's specifics, you're relying on behaviour which is undefined in the standard, but defined by the platform.
You should use
memcpy
. AFAIK this is the only standard compliant way and compilers are smart enough to replace the call with a single word move instruction. See this question for reasoning behind these claims.Not. The algorithm starts out by assuming IEEE754 layout, which isn't a valid Standard-compliant assumption. The particular constant
0x5f3759df
would be utterly wrong for any other layout. Even the assumption that such a constant exists is flawed. I think it breaks as soon as you'd reverse the order of the fields inside a float, for instance.I'm also not sure if
Q_rsqrt(-0.0f)
works correctly even assuming IEEE754. Both zeroes (positive and negative) should be equal andsqrt(x)
is only undefined forx<-0.0f
The standard compliant way would be
std::memcpy
. This should be sufficiently standard-compliant under your specified assumptions. Any reasonable compiler will turn that into a bunch of register moves if possible anyway. Furthermore, we can also alleviate (or at least check) some of your made assumptions using C++11'sstatic_assert
and fixed width integer types from<cstdint>
. Endianness is irrelevant anyway, since we're not working on any arrays here, if an integer type is little-endian, a floating point type is also.