This is the fast inverse square root implementation from Quake III Arena:
float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y = number;
    i = * ( long * ) &y; // evil floating point bit level hacking
    i = 0x5f3759df - ( i >> 1 ); // what?
    y = * ( float * ) &i;
    y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
    // y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

    return y;
}
I noticed that long int i takes the dereferenced value at the address (cast to a long *) of float y. The code then performs operations on i before storing the dereferenced value at the address (cast to a float *) of i into y.
Would this break the strict aliasing rule since i is not the same type as y?
I suspect it doesn't, since the value is dereferenced and copied, so the operations are performed on a copy rather than the original.
Yes, this code is badly broken and invokes undefined behavior. In particular, notice these two lines:
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
Since the lvalue *(long *)&y has type long, the compiler is free to assume it cannot alias an object of type float; thus, the compiler could reorder these two operations with respect to one another.
To fix it, a union should be used.
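A minimal sketch of what the union fix could look like for the whole function. The names rsqrt_union and float_bits are hypothetical, and the sketch swaps long for uint32_t (an assumption, not part of the original code) because long is 64 bits on many platforms while float is typically 32. It also assumes a C11 compiler and a 32-bit IEEE 754 float. Reading a union member other than the one last written is defined in C, but not in C++.

```c
#include <stdint.h>

/* Hypothetical helper type: both members are assumed to be 32 bits wide. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Refuse to compile if the size assumption does not hold. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

float rsqrt_union(float number)
{
    union float_bits conv;
    const float x2 = number * 0.5F;

    conv.f = number;                      /* store as float ...            */
    conv.u = 0x5f3759df - (conv.u >> 1);  /* ... reinterpret bits as integer */

    /* One Newton-Raphson step, as in the original. */
    return conv.f * (1.5F - (x2 * conv.f * conv.f));
}
```

For example, rsqrt_union(4.0F) should return a value close to 0.5, within the usual fraction-of-a-percent error of this approximation.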
Yes, it breaks aliasing rules.
In modern C, you can change i = * (long *) &y; to:
i = (union { float f; long l; }) {y} .l;
and y = * (float *) &i; to:
y = (union { long l; float f; }) {i} .f;
Provided you have guarantees that, in the C implementation being used, long and float have suitable sizes and representations, then the behavior is defined by the C standard: the bytes of the object of one type will be reinterpreted as the other type.
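The size proviso can be turned into a compile-time check. The following is only a sketch: the helper names float_to_bits and bits_to_float are hypothetical, and int32_t stands in for long (an assumption of this sketch) so that the integer member is the same width as a 32-bit float. On platforms where long is 64 bits, the two union members would not have matching sizes.

```c
#include <stdint.h>

/* Fail the build if the two types do not have the same size. */
_Static_assert(sizeof(float) == sizeof(int32_t),
               "float and int32_t must have the same size");

/* Reinterpret the bytes of a float as a 32-bit integer. */
static int32_t float_to_bits(float y)
{
    return (union { float f; int32_t l; }) {y} .l;
}

/* Reinterpret the bytes of a 32-bit integer as a float. */
static float bits_to_float(int32_t i)
{
    return (union { int32_t l; float f; }) {i} .f;
}
```

Assuming IEEE 754 single precision, float_to_bits(1.0F) yields 0x3f800000, and bits_to_float(0x40000000) yields 2.0F.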
Yes, it breaks aliasing rules.
The cleanest fix for things like i = * ( long * ) &y; would be this:
memcpy(&i, &y, sizeof(i)); // assuming sizeof(i) == sizeof(y)
It avoids issues with alignment and aliasing. And with optimization enabled, the call to memcpy() should normally be replaced with just a few instructions.
Like the other methods suggested, this approach does not fix any problems related to trap representations. On most platforms, however, integers have no trap representations, and if you know your floating-point format you can avoid its trap representations, if there are any.
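Applied to the whole function, the memcpy() approach might look like the sketch below. The name rsqrt_memcpy is hypothetical, and uint32_t replaces long (an assumption of this sketch) so that sizeof(i) == sizeof(y) actually holds where long is 64 bits. It assumes a C11 compiler and a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

float rsqrt_memcpy(float number)
{
    uint32_t i;
    float y = number;
    const float x2 = number * 0.5F;

    /* Make the size assumption explicit instead of leaving it in a comment. */
    _Static_assert(sizeof i == sizeof y, "integer and float sizes must match");

    memcpy(&i, &y, sizeof i);     /* copy the float's bytes into the integer */
    i = 0x5f3759df - (i >> 1);    /* same magic constant as the original */
    memcpy(&y, &i, sizeof y);     /* copy the bytes back into the float */

    /* One Newton-Raphson step, as in the original. */
    return y * (1.5F - (x2 * y * y));
}
```

Both copies go through memcpy(), so neither direction of the conversion accesses an object through an lvalue of the wrong type.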
i = * ( long * ) &y;
This breaks aliasing rules and therefore invokes undefined behavior.
You are accessing the object y through an lvalue of type long, which is neither float (the type of y) nor a character type (char, signed char, or unsigned char).
y = * ( float * ) &i;
This statement also breaks aliasing rules, for the same reason.