I've been using std::memcpy to circumvent strict aliasing for a long time.
For example, inspecting a float, like this:
float f = ...;
uint32_t i;
static_assert(sizeof(f)==sizeof(i));
std::memcpy(&i, &f, sizeof(i));
// use i to extract f's sign, exponent & significand
However, this time I checked the standard, and I haven't found anything that validates this. All I found is this:
For any object (other than a potentially-overlapping subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes ([intro.memory]) making up the object can be copied into an array of char, unsigned char, or std::byte ([cstddef.syn]). If the content of that array is copied back into the object, the object shall subsequently hold its original value. [ Example:
#define N sizeof(T)
char buf[N];
T obj;                          // obj initialized to its original value
std::memcpy(buf, &obj, N);      // between these two calls to std::memcpy, obj might be modified
std::memcpy(&obj, buf, N);      // at this point, each subobject of obj of scalar type holds its original value
— end example ]
and this:
For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a potentially-overlapping subobject, if the underlying bytes ([intro.memory]) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1. [ Example:
T* t1p;
T* t2p;
// provided that t2p points to an initialized object ...
std::memcpy(t1p, t2p, sizeof(T));
// at this point, every subobject of trivially copyable type in *t1p contains
// the same value as the corresponding subobject in *t2p
— end example ]
So, std::memcpying a float to/from char[] is allowed, and std::memcpying between the same trivial types is allowed too.
Is my first example (and the linked answer) well defined? Or is the correct way to inspect a float to std::memcpy it into an unsigned char[] buffer, and then use shifts and ors to build a uint32_t from it?
Note: looking at std::memcpy's guarantees may not answer this question. As far as I know, I could replace std::memcpy with a simple byte-copy loop, and the question would be the same.
The standard may fail to say properly that this is allowed, but it's almost certainly supposed to be, and to the best of my knowledge, all implementations will treat this as defined behaviour.
In order to facilitate the copying into an actual char[N] object, the bytes making up the f object can be accessed as if they were a char[N]. This part, I believe, is not in dispute.
Bytes from a char[N] that represent a uint32_t value may be copied into a uint32_t object. This part, I believe, is also not in dispute.
Equally undisputed, I believe, is that e.g. fwrite may have written the bytes in one run of the program, and fread may have read them back in another run, or even another program entirely.
Because of that last part, I believe it does not matter where the bytes came from, as long as they form a valid representation of some uint32_t object. You could have cycled through all float values, using memcmp on each until you got the representation you wanted, that you knew would be identical to that of the uint32_t value you're interpreting it as. You could even have done that in another program, a program that the compiler has never seen. That would have been valid.
If, from the implementation's perspective, your code is indistinguishable from unambiguously valid code, your code must be seen as valid.
Your example is well-defined and does not break strict aliasing. std::memcpy clearly states:
The standard allows aliasing any type through a (signed/unsigned) char* or std::byte, and thus your example doesn't exhibit UB. Whether the resulting integer is of any use is another question, though.
This, however, is not guaranteed by the standard, as the value representation of a float is implementation-defined (in the case of IEEE 754 it will work, though).
The behaviour isn't undefined (unless the target type has trap representations† that aren't shared by the source type), but the resulting value of the integer is implementation-defined. The standard makes no guarantees about how floating point numbers are represented, so there is no way to extract the mantissa etc. from the integer in a portable way. That said, limiting yourself to systems that use IEEE 754 doesn't limit you much these days.
Problems for portability:
You can use std::numeric_limits<float>::is_iec559 to verify whether your assumption about the representation is correct.
† Although, it appears that uint32_t can't have traps (see comments), so you needn't be concerned. By using uint32_t, you've already ruled out portability to esoteric systems: standard-conforming systems are not required to define that alias.
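To tie the points above together, here is a minimal sketch of guarding the std::memcpy approach with is_iec559 before pulling the fields out of the integer. The names FloatParts and decompose are hypothetical, introduced only for this illustration. The sketch assumes float is an IEEE 754 binary32 value and that float and uint32_t use the same byte order, which the standard does not promise.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical type and function names, for illustration only.
struct FloatParts {
    std::uint32_t sign;         // 1 bit
    std::uint32_t exponent;     // 8 bits, biased by 127
    std::uint32_t significand;  // 23 explicitly stored bits
};

inline FloatParts decompose(float f) {
    // Refuse to compile where the representation assumptions do not hold.
    static_assert(std::numeric_limits<float>::is_iec559,
                  "this sketch assumes IEEE 754 floats");
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copy the object representation
    // Assumes float and uint32_t share the same byte order.
    return FloatParts{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For instance, under these assumptions decompose(1.0f) gives sign 0, biased exponent 127 and significand 0. Putting the checks in static_assert turns a wrong assumption into a compile error instead of a silently wrong result.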