I need a safe way to alias between arbitrary POD types, conforming to ISO-C++11 explicitly considering 3.10/10 and 3.11 of n3242 or later. There are a lot of questions about strict aliasing here, most of them regarding C and not C++. I found a "solution" for C which uses unions, probably using this section
union type that includes one of the aforementioned types among its elements or nonstatic data members
From that I built this.
#include <iostream>
template <typename T, typename U>
T& access_as(U* p)
{
union dummy_union
{
U dummy;
T destination;
};
dummy_union* u = (dummy_union*)p;
return u->destination;
}
struct test
{
short s;
int i;
};
int main()
{
int buf[2];
static_assert(sizeof(buf) >= sizeof(double), "");
static_assert(sizeof(buf) >= sizeof(test), "");
access_as<double>(buf) = 42.1337;
std::cout << access_as<double>(buf) << '\n';
access_as<test>(buf).s = 42;
access_as<test>(buf).i = 1234;
std::cout << access_as<test>(buf).s << '\n';
std::cout << access_as<test>(buf).i << '\n';
}
My question is, just to be sure, is this program legal according to the standard?*
It doesn't give any warnings whatsoever and works fine when compiling with MinGW/GCC 4.6.2 using:
g++ -std=c++0x -Wall -Wextra -O3 -fstrict-aliasing -o alias.exe alias.cpp
* Edit: And if not, how could one modify this to be legal?
I think that at the most fundamental level, this is impossible and violates strict aliasing. The only thing you've achieved is tricking the compiler into not noticing.
This will never be legal, no matter what kind of contortions you perform with weird casts and unions and whatnot.
The fundamental fact is this: two objects of different type may never alias in memory, with a few special exceptions (see further down).
Example
Consider the following code:
Let's break that out into local register variables to model actual execution more closely:
Suppose that (1), (2) and (3) represent a memory read, read and write, respectively, which can be very expensive operations in such a tight inner loop. A reasonable optimization for this loop would be the following:
This optimization reduces the number of memory reads needed by half and the number of memory writes to 1. This can have a huge impact on the performance of the code and is a very important optimization for all optimizing C and C++ compilers.
Now, suppose that we don't have strict aliasing. Suppose that a write to an object of any type can affect any other object. Suppose that writing to a double can affect the value of a float somewhere. This makes the above optimization suspect, because it's possible the programmer has in fact intended for out and in to alias so that the sum function's result is more complicated and is affected by the process. Sounds stupid? Even so, the compiler cannot distinguish between "stupid" and "smart" code. The compiler can only distinguish between well-formed and ill-formed code. If we allow free aliasing, then the compiler must be conservative in its optimizations and must perform the extra store (3) in each iteration of the loop.
Hopefully you can see now why no such union or cast trick can possibly be legal. You cannot circumvent fundamental concepts like this by sleight of hand.
Exceptions to strict aliasing
The C and C++ standards make special provision for aliasing any type with
char
, and with any "related type" which among others includes derived and base types, and members, because being able to use the address of a class member independently is so important. You can find an exhaustive list of these provisions in this answer.Furthermore, GCC makes special provision for reading from a different member of a union than what was last written to. Note that this kind of conversion-through-union does not in fact allow you to violate aliasing. Only one member of a union is allowed to be active at any one time, so for example, even with GCC the following would be undefined behavior:
Workarounds
The only standard way to reinterpret the bits of one object as the bits of an object of some other type is to use an equivalent of
memcpy
. This makes use of the special provision for aliasing withchar
objects, in effect allowing you to read and modify the underlying object representation at the byte level. For example, the following is legal, and does not violate strict aliasing rules:This is semantically equivalent to the following code:
GCC makes a provision for reading from an inactive union member, implicitly making it active. From the GCC documentation:
Placement new
(Note: I'm going by memory here as I don't have access to the standard right now). Once you placement-new an object into a storage buffer, the lifetime of the underlying storage objects ends implicitly. This is similar to what happens when you write to a member of a union:
Now, let's look at something similar with placement-new:
This kind of implicit end-of-lifetime can only occur for types with trivial constructors and destructors, for obvious reasons.
Aside from the error when
sizeof(T) > sizeof(U)
, the problem there could be, that the union has an appropriate and possibly higher alignment thanU
, because ofT
. If you don't instantiate this union, so that its memory block is aligned (and large enough!) and then fetch the member with destination typeT
, it will break silently in the worst case.For example, an alignment error occurs, if you do the C-style cast of
U*
, whereU
requires 4 bytes alignment, todummy_union*
, wheredummy_union
requires alignment to 8 bytes, becausealignof(T) == 8
. After that, you possibly read the union member with typeT
aligned at 4 instead of 8 bytes.Alias cast (alignment & size safe reinterpret_cast for PODs only):
This proposal does explicitly violate strict aliasing, but with static assertions:
Usage with pointer type argument ( like standard cast operators such as reinterpret_cast ):
Another approach (circumvents alignment and aliasing issues using a temporary union):
Usage:
The disadvantage of this is, that the union may require more memory than actually needed for the ReturnType
Wrapped memcpy (circumvents alignment and aliasing issues using memcpy):
For dynamic sized arrays of any POD type:
More efficient approaches may do explicit unaligned reads with intrinsics, like the ones from SSE, to extract primitives.
Examples:
EDITS:
No. The alignment may be unnatural using the alias you have provided. The union you wrote just moves the point of the alias. It may appear to work, but that program may fail when CPU options, ABI, or compiler settings change.
Create natural temporary variables and treat your storage as a memory blob (moving in and out of the blob to/from temporaries), or use a union which represents all your types (remember, one active element at a time here).