I have a question about the strict-aliasing rules, unions, and the standard. Assume we have the following code:
#include <stdio.h>

union
{
    int f1;
    short f2;
} u = {0x1};

int * a = &u.f1;
short * b = &u.f2;

int main()
{
    u.f1 = 1;
    *a += 1;
    u.f2 = 2;
    *b *= 2;
    printf( "%d %hd\n", *a, *b);
    return 0;
}
Now let's look at how it behaves:
$ gcc-5.1.0-x86_64 t.c -O3 -Wall && ./a.out
2 4
$ gcc-5.1.0-x86_64 t.c -O3 -Wall -fno-strict-aliasing && ./a.out
4 4
We can see that strict aliasing breaks the dependencies between the accesses. Moreover, this appears to be correct code that does not violate the strict-aliasing rule.
- Does it follow that, for union members, the object located at that address is compatible with the types of all the union's members?
- If the first point is true, what should the compiler do with pointers to union members? Is it a defect in the standard that it allows such compiler behavior? If not, why not?
- Generally speaking, a compiler must never behave differently on correct code. So this looks like a compiler bug as well (especially since, if the addresses of the union members are taken inside a function, strict aliasing does not break the dependency).
C99 Technical Corrigendum 3 clarifies union-based type punning in section 6.5.2.3:
See here from 1042 through 1044
The C standard says that aliasing via unions is explicitly permitted.
However, consider the following code:
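The function in question can be sketched roughly as follows. This is a hypothetical reconstruction: the name `func` and its two pointer parameters `a` and `b` are taken from the discussion that follows, while the body and the `short` return value are assumptions made only for illustration.

```c
/* Hypothetical sketch: the body is an assumption, not the original code. */
short func(int *a, short *b)
{
    *b = 2;     /* store through the short pointer */
    *a = 1;     /* store through the int pointer */
    return *b;  /* under strict aliasing the compiler may assume this is still 2 */
}
```

Called with two unrelated objects, for example `int x; short y; func(&x, &y);`, this is well defined and returns 2. The interesting case is when both pointers refer to members of the same union, where an optimizer that assumes the two pointers cannot alias may still return 2.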
The intent of the strict aliasing rule is that `a` and `b` should be assumed to not alias. However you could call `func(&u.f1, &u.f2);`.
To resolve this dilemma, a common sense solution is to say that the 'bypass permit' that unions have to avoid the strict aliasing rule only applies to when the union members are accessed by name.
The Standard doesn't explicitly state this. It could be argued that "If the member used..." (6.5.2.3) actually is specifying that the 'bypass' only occurs when accessing the member by name, but it's not 100% clear.
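To illustrate the 'by name' reading, here is a minimal sketch of union-based punning that goes only through member names, next to a `memcpy` version that should give the same result. The union tag `pun` and both function names are made up for this example, and it assumes `short` has no trap representations, as on mainstream platforms.

```c
#include <string.h>

/* Hypothetical union mirroring the one in the question. */
union pun {
    int f1;
    short f2;
};

/* Reads the leading bytes of an int as a short, using member names only. */
short short_via_union(int value)
{
    union pun u;
    u.f1 = value;   /* store through member f1 */
    return u.f2;    /* read through member f2: the 6.5.2.3 case */
}

/* Same reinterpretation via memcpy, which is not subject to the aliasing rule. */
short short_via_memcpy(int value)
{
    short result;
    memcpy(&result, &value, sizeof result);
    return result;
}
```

Which bytes end up in the `short` depends on byte order: on a little-endian machine I would expect `short_via_union(1)` to give 1.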
However it is hard to come up with any alternative and self-consistent interpretation. One possible alternative interpretation goes along the lines that writing `func(&u.f1, &u.f2)` causes UB because overlapping objects were passed to a function that 'knows' it does not receive overlapping objects, sort of like a `restrict` violation.
If we apply this first interpretation to your example, we would say that the `*a` in your `printf` causes UB because the current object stored at that location is a `short`, and 6.5.2.3 doesn't kick in because we are not using the union member by name.
I'd guess based on your posted results that gcc is using the same interpretation.
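Under that interpretation, one way to sidestep the problem is to drop the pointers `a` and `b` and access the members by name throughout. A sketch, assuming a hypothetical helper `run_by_name` in place of the original `main`; the value read back from `f1` depends on byte order and the width of `int`, so no particular output is claimed in the code itself:

```c
#include <stdio.h>

static union {
    int f1;
    short f2;
} u = {0x1};

/* Same sequence of operations as in the question, but every access
   names the union member instead of going through a pointer. */
void run_by_name(int *out_f1, short *out_f2)
{
    u.f1 = 1;
    u.f1 += 1;
    u.f2 = 2;
    u.f2 *= 2;        /* f2 is now 4 */
    *out_f2 = u.f2;
    *out_f1 = u.f1;   /* reinterprets the bytes last written via f2 */
    printf("%d %hd\n", *out_f1, *out_f2);
}
```

With gcc on x86-64 I would expect this to print the same `4 4` as the `-fno-strict-aliasing` build regardless of optimization level, but that part is byte-order dependent.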
This has been discussed before here but I can't find the thread right now.