Does accessing union members via a pointer, as in the example below, result in undefined behavior in C99? The intent seems clear enough, but I know that there are some restrictions regarding aliasing and unions.
union { int i; char c; } u;
int *ip = &u.i;
char *ic = &u.c;
*ip = 0;
*ic = 'a';
printf("%c\n", u.c);
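It is unspecified (subtly different from undefined) behaviour to access a union by any element other than the one that was last written. That's detailed in C99 Annex J, which lists among the unspecified behaviours the value of a union member other than the last one stored into (6.2.6.1).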
However, since you are writing to c via the pointer, then reading c, this particular example is well defined. It does not matter how you write to the element, whether directly or through a pointer.
There is one issue that has been raised in comments which seems, at least on the surface, to contradict that. User davmac provides sample code which outputs different values on different compilers. However, I believe that this is because it is actually violating the rules here: it writes to member f then reads member i and, as shown in Annex J, that's unspecified.
Footnote 82 in 6.5.2.3 states that if the member used to access the contents of a union is not the one last used to store a value, the relevant part of the object representation is reinterpreted in the new type (a process sometimes called "type punning"). However, since this seems to go against the Annex J comment and it's a footnote to the section dealing with expressions of the form x.y, it may not apply to accesses via a pointer.
One of the major reasons why aliasing is supposed to be strict is to allow the compiler more scope for optimisation. To that end, the standard dictates that treating memory of a different type to that written is unspecified.
By way of example, consider the function provided:
The implementation is free to assume that, because you're not supposed to alias memory, up->i and *fp are two distinct objects. So it's free to assume that you're not changing the value of up->i after you set it to 123, so it can simply return 123 without looking at the actual variable contents again.
If instead you changed the statement that stores through the pointer so that it stores to the union member directly (up->f rather than *fp), then that would make footnote 82 applicable and the returned value would be a re-interpretation of the float as an integer.
The reason why I don't think that's an issue for the question is because you're writing then reading the same type, hence aliasing rules don't come into play.
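It's interesting to note that the unspecified behaviour is caused not by the function itself, but by calling it with fp pointing at the float member of the very union that up points to. If you were instead to call it with a pointer to a separate float object, that would not be a problem.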
No, it won't, but you need to keep track of what the last type you put into the union was. If I were to reverse the order of your int and char assignments, writing the char first and then an int such as 123456, it would be a very different story.
EDIT: Some explanation on why it may have printed 64 ('@').
The binary representation of 123456 is 0001 1110 0010 0100 0000.
For 64 it is 0100 0000.
You can see that the low-order 8 bits are identical, and since printf with %c is given only the single byte that c occupies (the low-order byte of the int on a little-endian machine), it prints only that much.
The only reason it's not UB is because you were lucky/unlucky enough to choose char for one of the types, and character types can alias anything in C. If the types were, for example, int and float, the accesses via pointers would be aliasing violations and thus undefined behavior. For direct access via the union, the behavior was deemed well defined as part of the interpretation for Defect Report 283: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm
Of course, you still need to ensure that the representation of the type used for writing can also be interpreted as a valid (non-trap) representation for the type later used for reading.