This is the fast inverse square root implementation from Quake III Arena:

    float Q_rsqrt( float number )
    {
        long i;
        float x2, y;
        const float threehalfs = 1.5F;

        x2 = number * 0.5F;
        y = number;
        i = * ( long * ) &y; // evil floating point bit level hacking
        i = 0x5f3759df - ( i >> 1 ); // what?
        y = * ( float * ) &i;
        y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
        // y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

        return y;
    }

I noticed that `long int i` takes the dereferenced value at the address (cast to a `long *`) of `float y`. The code then performs operations on `i` before storing the dereferenced value at the address (cast to a `float *`) of `i` into `y`.

Would this break the strict aliasing rule, since `i` is not the same type as `y`?

I think that perhaps it doesn't since the value is dereferenced and copied; so the operations are performed on a copy rather than the original.

    i = * ( long * ) &y;

This breaks aliasing rules and therefore invokes undefined behavior. You are accessing the object `y` through a type that is neither `float` nor a character type (`char`, `signed char`, or `unsigned char`).

    y = * ( float * ) &i;

The statement above also breaks aliasing rules.

Yes, it breaks aliasing rules.
In modern C, you can change

    i = * (long *) &y;

to:

and

    y = * (float *) &i;

to:

Provided you have guarantees that, in the C implementation being used, `long` and `float`
have suitable sizes and representations, then the behavior is defined by the C standard: the bytes of the object of one type will be reinterpreted as the other type.

Yes, this code is badly broken and invokes undefined behavior. In particular, notice these two lines:

    y = number;
    i = * ( long * ) &y;

Since the object `*(long *)&y` has type `long`, the compiler is free to assume it cannot alias an object of type `float`; thus, the compiler could reorder these two operations with respect to one another.

To fix it, a union should be used.

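For illustration, here is a minimal sketch of what a union-based version might look like. It is not the original Quake III code: the names `float_to_bits`, `bits_to_float` and `rsqrt_union` are hypothetical, and it assumes `float` is a 32-bit IEEE 754 type. It also swaps `long` for `uint32_t`, because `long` is 64 bits wide on many current platforms and would not match the size of `float` there.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */

/* Assumption: float is exactly 32 bits wide (IEEE 754 single precision). */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: return the bit pattern of a float. */
static uint32_t float_to_bits(float f)
{
    union { float f; uint32_t u; } pun = { .f = f };
    return pun.u;   /* reading the other member reinterprets the stored bytes */
}

/* Hypothetical helper: build a float from a bit pattern. */
static float bits_to_float(uint32_t u)
{
    union { uint32_t u; float f; } pun = { .u = u };
    return pun.f;
}

/* Same arithmetic as Q_rsqrt, but without casting pointers between types. */
float rsqrt_union(float number)
{
    const float threehalfs = 1.5F;
    float x2 = number * 0.5F;

    uint32_t i = float_to_bits(number);
    i = 0x5f3759df - (i >> 1);            /* the magic-constant initial guess */
    float y = bits_to_float(i);

    y = y * (threehalfs - (x2 * y * y));  /* one Newton-Raphson iteration */
    return y;
}
```

Calling `rsqrt_union(4.0f)` should give a value close to 0.5. Note that this relies on C semantics; in C++, reading a union member other than the one last written is not defined.
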
Yes, it breaks aliasing rules.
The cleanest fix for things like `i = * ( long * ) &y;` would be to copy the bytes with `memcpy()`. It avoids issues with alignment and aliasing. And with optimization enabled, the call to `memcpy()` should normally be replaced with just a few instructions.

Like every other method suggested, this approach does not fix any problems related to trap representations. On most platforms, however, integers have no trap representations, and if you know your floating-point format you can avoid any floating-point trap representations that exist.
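
As a rough sketch of the `memcpy()` approach applied to the whole function (the name `rsqrt_memcpy` is hypothetical, and this is not the original source), assuming `float` is a 32-bit IEEE 754 type and using `uint32_t` in place of `long` so that the sizes are guaranteed to match:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Assumption: float is exactly 32 bits wide (IEEE 754 single precision). */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

float rsqrt_memcpy(float number)
{
    const float threehalfs = 1.5F;
    float x2 = number * 0.5F;
    float y = number;
    uint32_t i;

    memcpy(&i, &y, sizeof i);             /* copy the bytes of y into i */
    i = 0x5f3759df - (i >> 1);            /* the magic-constant initial guess */
    memcpy(&y, &i, sizeof y);             /* copy the bytes of i back into y */

    y = y * (threehalfs - (x2 * y * y));  /* one Newton-Raphson iteration */
    return y;
}
```

Because `memcpy()` only copies bytes between two distinct objects, neither object is ever accessed through an lvalue of the wrong type, so the aliasing rule is not involved.
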