Quote from the C99 standard:
6.5.2.3
5 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
The standard gives an example for this case:
// The following code is not a valid fragment because
// the union type is not visible within the function f.
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}
int g()
{
    union
    {
        struct t1 s1;
        struct t2 s2;
    } u;
    /* ... */
    return f(&u.s1, &u.s2);
}
I made a few changes:
#include <stdio.h>
struct t1 { int m; };
struct t2 { int m; };
union u
{
    struct t1 s1;
    struct t2 s2;
};
int foo(struct t1 *p1, struct t2 *p2)
{
    if (p1->m)
        p2->m = 2;
    return p1->m;
}
int main(void)
{
    union u u;
    u.s1.m = 1;
    printf("%d\n", foo(&u.s1, &u.s2));
}
As you can see, I moved the union declaration outside so that it is visible in foo(). According to the comment in the standard, this should have made my code correct, but strict aliasing still appears to break it with clang 3.4 and gcc 4.8.2.
Output with -O0:
2
Output with -O2:
1
for both compilers.
So my question is:
Does C really rely on the union declaration to decide whether some structures are an exception to the strict aliasing rule? Or do both gcc and clang have a bug?
This seems really broken to me, because even if the function and the union are both declared in the same header, that does not guarantee that the union is visible in the translation unit containing the body of the function.
The most important point is that your change (moving the union up) does not change the definition of the function foo at all. It is still a function that receives unrelated pointers. In your example the passed pointers are related, while elsewhere this might be different. The goal of the compiler is to serve the most general case. The body of the function is different after the change and it is not clear why.
The question you are asking is about how carefully optimization is implemented in your particular compiler for certain command-line options. It has nothing to do with memory layout. With a correct compiler the result should be the same: the compiler should handle the case where two different pointers in fact point to the same place in memory.
The set of circumstances in which a compiler recognizes that an access to an aggregate member is an access to the aggregate itself is purely a Quality of Implementation issue, and the Standard makes no effort to recognize any cases where use of a non-character lvalue of the form aggregate.member or pointerToAggregate->member would not violate 6.5p7. A compiler which couldn't handle at least some cases as defined would be of such low quality as to be pretty useless, but the Standard makes no effort to forbid conforming-but-useless implementations.
If a common initial sequence member has a character type, then 6.5p7 would define the behavior of accessing it, regardless of whether it is a member of a common initial sequence of a union whose complete declaration is visible. If it doesn't have a character type, then access would only be defined under 6.5p7 if performed through an lvalue of character type or memcpy/memmove, or in cases where the destination has heap duration and the ultimate type used for a read matches the type used for a write.
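As a rough illustration of the memcpy route mentioned above, here is a minimal sketch of the question's foo() rewritten so that every access to m goes through memcpy. The names foo_memcpy and demo_memcpy are hypothetical and not from the question; the struct definitions are copied from it. Because memcpy accesses are treated like character-type accesses, a compiler should not assume the two pointers refer to distinct objects.

```c
#include <string.h>

struct t1 { int m; };
struct t2 { int m; };
union u
{
    struct t1 s1;
    struct t2 s2;
};

/* Hypothetical variant of foo(): reads and writes m only via memcpy,
   so no int lvalue of the "wrong" struct type is ever used. */
int foo_memcpy(struct t1 *p1, struct t2 *p2)
{
    int m;
    const int two = 2;

    memcpy(&m, &p1->m, sizeof m);         /* read p1->m */
    if (m)
        memcpy(&p2->m, &two, sizeof two); /* write p2->m */
    memcpy(&m, &p1->m, sizeof m);         /* re-read p1->m after the write */
    return m;
}

/* Hypothetical helper mirroring main() from the question. */
int demo_memcpy(void)
{
    union u u;
    u.s1.m = 1;
    return foo_memcpy(&u.s1, &u.s2);
}
```

With this version, demo_memcpy() should return 2 regardless of optimization level, since the second read cannot be assumed independent of the write.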
There are a number of indications a quality compiler should recognize that would suggest a pointer to one structure type might be used to access a common initial sequence (CIS) member of another. A compiler that is unable to recognize any of the other indications might benefit from treating the existence of a complete union declaration containing both types as such an indication. Doing so might needlessly block some otherwise-useful optimizations, but would still allow more optimizations than disabling type-based aliasing analysis altogether.
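One way for the programmer to make the relationship explicit, rather than relying on the compiler to notice the union declaration, is to pass a pointer to the union and perform every access through it. The sketch below assumes the same types as the question; foo_via_union and demo_via_union are hypothetical names. GCC's documentation for -fstrict-aliasing states that type punning is allowed when the memory is accessed through the union type, which is the pattern used here.

```c
struct t1 { int m; };
struct t2 { int m; };
union u
{
    struct t1 s1;
    struct t2 s2;
};

/* Hypothetical variant of foo(): takes the union itself, so both
   member accesses are visibly made through the union type. */
int foo_via_union(union u *p)
{
    if (p->s1.m)
        p->s2.m = 2;
    return p->s1.m; /* inspects the common initial sequence via the union */
}

/* Hypothetical helper mirroring main() from the question. */
int demo_via_union(void)
{
    union u u;
    u.s1.m = 1;
    return foo_via_union(&u);
}
```

Here demo_via_union() should return 2 at both -O0 and -O2, because the compiler can see that s1.m and s2.m are members of the same union object.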