As previously established, a union of the form
union some_union {
type_a member_a;
type_b member_b;
...
};
with n members comprises n + 1 objects in overlapping storage: One object for the union itself and one object for each union member. It is clear, that you may freely read and write to any union member in any order, even if reading a union member that was not the last one written to. The strict aliasing rule is never violated, as the lvalue through which you access the storage has the correct effective type.
This is further supported by footnote 95, which explains how type punning is an intended use of unions.
A typical example of the optimizations enabled by the strict aliasing rule is this function:
int strict_aliasing_example(int *i, float *f)
{
*i = 1;
*f = 1.0;
return (*i);
}
which the compiler may optimize to something like
int strict_aliasing_example(int *i, float *f)
{
*i = 1;
*f = 1.0;
return (1);
}
because it can safely assume that the write to *f
does not affect the value of *i
.
However, what happens when we pass two pointers to members of the same union? Consider this example, assuming a typical platform where float
is an IEEE 754 single precision floating point number and int
is a 32 bit two's complement integer:
int breaking_example(void)
{
union {
int i;
float f;
} fi;
return (strict_aliasing_example(&fi.i, &fi.f));
}
As previously established, fi.i
and fi.f
refer to an overlapping memory region. Reading and writing them is unconditionally legal (writing is only legal once the union has been initialized) in any order. In my opinion, the previously discussed optimization performed by all major compilers yields incorrect code as the two pointers of different type legally point to the same location.
I somehow can't believe that my interpretation of the strict aliasing rule is correct. It doesn't seem plausible that the very optimization the strict aliasing was designed for is not possible due to the aforementioned corner case.
Please tell me why I'm wrong.
A related question turned up during research.
Please read all existing answers and their comments before adding your own to make sure that your answer adds a new argument.
The strict aliasing rule forbids access to the same object by two pointers that do not have compatible types, unless one is a pointer to a character type:
In your example,
*f = 1.0;
is modifyingfi.i
, but the types are not compatible.I think the mistake is in thinking that a union contains n objects, where n is the number of members. A union contains only one active object at any point during program execution by §6.7.2.1 ¶16
Support for this interpretation that a union does not simultaneously contain all of its member objects can be found in §6.5.2.3:
Finally, an almost identical issue was raised in defect report 236 in 2006.
Here is note 95 and its context:
Note 95 clearly applies to an access via a union member. Your code does not do that. Two overlapping objects are accessed via pointers to 2 separate types, none of which is a character type, and none of which is a postfix expression pertinent for type punning.
This is not a definitive answer...
Prior to the C89 Standard, the vast majority of implementations defined the behavior of write-dereferencing to pointer of a particular type as setting the bits of the underlying storage in the fashion defined for that type, and defined the behavior of read-dereferencing a pointer of a particular type as reading the bits of the underlying storage in the fashion defined for that type. While such abilities would not have been useful on all implementations, there were many implementations where the performance of hot loops could be greatly improved by e.g. using 32-bit loads and stores to operate on groups of four bytes at once. Further, on many such implementations, supporting such behaviors didn't cost anything.
The authors of the C89 Standard state that one of their objectives was to avoid irreparably breaking existing code, and there are two fundamental ways the rules could have been interpreted consistent with that:
The C89 rules could have been intended to be applicable only in the cases similar to the one given in the rationale (accessing an object with declared type both directly via that type and indirectly via pointer), and where compilers would not have reason to expect aliasing. Keeping track for each variable whether it is currently cached in a register is pretty simple, and being able to keep such variables in registers while accessing pointers of other types is a simple and useful optimization and would not preclude support for code which uses the more common aliasing patterns (having a compiler interpret a
float*
toint*
cast as necessitating a flush of any register-cachedfloat
values is simple and straightforward; such casts are rare enough that such an approach would be unlikely to adversely affect performance).Given that the Standard is generally agnostic with regard to what makes a good-quality implementation for a given platform, the rules could be interpreted as allowing implementations to break code which uses aliasing in ways that would be both useful and obvious, without suggesting that good quality implementations shouldn't try to avoid doing so.
If the Standard defines a practical way of allowing in-place aliasing which is not in any way significantly inferior to other approaches, then approaches other than the defined way might reasonably be regarded as deprecated. If no Standard-defined means exists, then quality implementations for platforms where aliasing is necessary to achieve good performance should endeavor to efficiently support common aliasing platforms on those platforms whether or not the Standard requires them to do so.
Unfortunately, the lack of clarity as to what the Standard requires has resulted in a situation where some people regard as deprecated constructs for which no replacements exist. Having the existence of a complete union type definition involving two primitive types be interpreted as an indication that any access via pointer of one type should be regarded as a likely access to the other would make it possible to adjust programs which rely upon in-place aliasing to do so without Undefined Behavior--something which is not achievable any other practical way given the present Standard. Unfortunately, such an interpretation would also limit many optimizations in the 99% of cases where they would be harmless, thus making it impossible for compilers which interpret the Standard that way to run existing code as efficiently as would otherwise be possible.
As to whether the rule is correctly specified, that would depend upon what it is supposed to mean. Multiple reasonable interpretations are possible, but combining them yields some rather unreasonable results.
PS--the only interpretation of the rules regarding pointer-comparisons and
memcpy
that would make sense without giving the term "object" a meaning different from its meaning in the aliasing rules would suggest that no allocated region can be used to hold more than a single kind of object. While some kinds of code might be able to abide such a restriction, it would make it impossible for programs to use their own memory management logic to recycle storage without excessive numbers of malloc/free calls. The authors of the Standard may have intended to say that implementations are not required to let programmers create a large region and partition it into smaller mixed-type chunks themselves, but that doesn't mean that they intended general-purpose implementations would fail to do so.