I stumbled across some code based on unions in C. Here is the code:
union {
    struct {
        char ax[2];
        char ab[2];
    } s;
    struct {
        int a;
        int b;
    } st;
} u = {12, 1};
printf("%d %d", u.st.a, u.st.b);
I just couldn't understand how the output came to be 268 0. How were the values initialized?
How is the union functioning here? Shouldn't the output be 12 1? It would be great if anyone could explain in detail what exactly is happening here.
I am using a 32-bit processor on Windows 7.
It probably assigned {12, 1} to the first two chars, i.e. s.ax.
So read as a 32-bit int, that's 1*256 + 12 = 268.
The code sets u.s.ax[0] to 12 and u.s.ax[1] to 1. u.s.ax is overlaid onto u.st.a, so the least-significant byte of u.st.a is set to 12 and the next byte to 1 (so you must be running on a little-endian architecture), giving a value of 0x010C, or 268.
A union's size is at least the size of its largest member. So in this case, your union type has a size of 8 bytes on a 32-bit platform where int types are 4 bytes each. The first member of the union, s, though, only takes up 4 bytes, and the initializer fills only its first array, s.ax, which overlaps the first 2 bytes of the st.a member; the rest of s (s.ab) is zeroed. Since you are on a little-endian system, those are the two lower-order bytes of st.a. Thus, when you initialize the union with the values {12, 1}, you've only set the two lower-order bytes of st.a, which leaves st.b at 0 in your run. So when you print the struct containing the two int rather than char members of the union, you end up with your results of 268 and 0.
Your code uses the default initialization for the union, which goes through its first member. Both 12 and 1 go into the characters of ax, hence the result that you see (which is very much compiler-dependent).
If you wanted to initialize through the second member (st), you would use a designated initializer.
The code doesn't do what you think. Brace initializers initialize the first union member, i.e. u.s. However, the initializer is now incomplete and missing braces, since u.s contains two arrays. It should be something like: u = { { {'a', 'b'}, { 'c', 'd' } } };
You should always compile with all warnings enabled; a decent compiler should have told you that something was amiss. For instance, GCC says
missing braces around initialiser (near initialisation for ‘u.s’)
and
missing initialiser (near initialisation for ‘u.s.ab’)
Very helpful.
In C99 you can take advantage of named member initialization to initialize the second union member:
u = { .st = {12, 1} };
(This was not possible in C++ until C++20, by the way.) The corresponding syntax for the first case is u = { .s = { {'a', 'b'}, { 'c', 'd' } } };, which is arguably more explicit and readable!