Basically, I have the following:

    struct foo {
        /* variable denoting active member of union */
        enum whichmember w;
        union {
            struct some_struct my_struct;
            struct some_struct2 my_struct2;
            struct some_struct3 my_struct3;
            /* let's say that my_struct is the largest member */
        };
    };

    int main(void)
    {
        /*...*/
        /* earlier in main, we get some struct foo d with an */
        /* unknown union assignment; d.w is correct, however */
        struct foo f;
        f.my_struct = d.my_struct; /* my_struct isn't necessarily the */
                                   /* active member, but is the biggest */
        f.w = d.w;
        /* code that determines which member is active through f.w */
        /* ... */
        /* we then access the *correct* member that we just found */
        /* say, f.my_struct3 */
        f.my_struct3.some_member_not_in_mystruct = /* something */;
    }

The question "Accessing C union members via pointers" seems to say that accessing the members via pointers is okay. See comments.
But my question concerns directly accessing them. Basically, if I write all the information that I need to the largest member of the union and keep track of types manually, will accessing the manually specified member still yield the correct information every time?
I note that the code in the question uses an anonymous union, which means that it must be written for C11; anonymous unions were not a part of C90 or C99.
ISO/IEC 9899:2011, the current C11 standard, has this to say:
Italics as in the standard
And section §6.2.6 Representations of types says (in part):
My interpretation of what you're doing is that footnote 51 says "it might not work" because you may have assigned only part of the structure. You're treading on thin ice, at best. However, against that, you stipulate that the structure assigned in

    f.my_struct = d.my_struct;

is the largest member. The chances are moderately high that it won't go wrong, but if the padding bytes in the two structures (the active member of the union and the largest member of the union) are in different places, then things could go wrong, and if you reported a problem to the compiler writer, the compiler writer would simply tell you "don't contravene the standard".

So, to the extent I'm a language lawyer, this language lawyer's answer is "It is not guaranteed". In practice, you're unlikely to run into problems, but the possibility is there and you have no comeback on anyone.
To make your code safe, simply use

    f = d;

which copies the whole structure, including the union, in a single assignment.

Illustrative Example
Suppose that the machine requires double to be aligned on an 8-byte boundary and sizeof(double) == 8, that int must be aligned on a 4-byte boundary and sizeof(int) == 4, and that short must be aligned on a 2-byte boundary and sizeof(short) == 2. This is a plausible and even common set of sizes and alignment requirements.

Further, suppose that you have a two-structure union variant of the structure in the question:
Now, under the sizes and alignments specified, struct Type_A will occupy 16 bytes and struct Type_B will occupy 8 bytes, so the union will use 16 bytes too. The layout of the union will be like this:

The w element would also mean that there are 8 bytes in struct foo before the (anonymous) union, of which it is likely that w only occupies 4. The size of struct foo is therefore 24 on this machine. That's not particularly relevant to the discussion, though.

Now suppose we have code like this:
Now, under the ruling of footnote 51, the structure assignment f.s1 = d.s1; does not have to copy the padding bits. I know of no compiler that behaves like this, but the standard says that a compiler need not copy the padding bits. That means that the value of f.s1 could be:

The garbage is because those 7 bytes need not have been copied (footnote 51 says that is an option, even though it is not likely to be an option exercised by any current compiler). The rubbish is because the initialization of d never set any values in those bytes; the contents of that part of the structure are unspecified.

If you now go ahead and try to treat f as a copy of d, you might be a little surprised to find that only 1 byte of the 8 relevant bytes of f.s2 is actually initialized.

I'll reemphasize: I know of no compiler that would do this. But the question is tagged 'language lawyer' so the issue is 'what does the language standard state' and this is my interpretation of the quoted sections of the standard.
Yes, you can access them directly. You can assign a value to a union member and read it back through a different union member. The result will be deterministic and correct.
Yes, your code will work, because with a union the compiler shares the same memory for all the members.
For example, if &f.my_struct == 100, then &f.my_struct2 == 100 and &f.my_struct3 == 100.
If my_struct is the largest one, then it will work all the time.