I'm reading paragraph 7 of 6.5 in ISO/IEC 9899:TC2.
It condones lvalue access to an object through:
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
Please refer to the document for what 'aforementioned' types are but they certainly include the effective type of the object.
It is in a section noted as:
The intent of this list is to specify those circumstances in which an object may or may not be aliased.
I read this as saying (for example) that the following is well defined:
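    #include <stdlib.h>
    #include <stdio.h>

    typedef struct {
        unsigned int x;
    } s;

    int main(void){
        unsigned int array[3] = {73,74,75};
        s* sp=(s*)&array;          /* lay the struct over the start of the array */
        sp->x=80;                  /* write through the struct member */
        printf("%u\n",array[0]);   /* %u, since array[0] is an unsigned int */
        return EXIT_SUCCESS;
    }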
This program should output 80.
I'm not advocating this as a good (or very useful) idea and concede I'm in part interpreting it that way because I can't think what else that means and can't believe it's a meaningless sentence!
That said, I can't see a very good reason to forbid it. What we do know is that the alignment and memory contents at that location are compatible with sp->x, so why not?
It seems to go so far as to say that if I add (say) a double y; to the end of the struct I can still access array[0] through sp->x in this way.
However, even if the array is larger than sizeof(s), any attempt to access sp->y is 'all bets off' undefined behaviour.
Might I politely ask for people to say what that sentence condones rather than go into a flat spin shouting 'strict aliasing UB strict aliasing UB' as seems to be all too often the way of these things.
I confess that the idea that I can lay a struct over a locally defined array in this way is frankly exotic. I still maintain that C99 and all subsequent standards permit it. In fact it's very arguable that, members being objects in themselves, the first bullet point in 6.5 paragraph 7 (a type compatible with the effective type of the object) allows it. I think that's M.M's point.
Looking at the problem the other way, let's notice that it's absolutely legitimate (in a strictly conforming environment) to alias the member sp->x as an object in its own right.
In the context of the code in my OP, consider a function with prototype
    void doit(int* ip,s* sp);
A call that passes sp together with a pointer to one of its own members is expected to behave logically. NB: Program logic may (of course) not behave as desired. For example, if doit increments sp->x until it exceeds *ip then there's a problem! However, what is not allowed in a conformant compiler is for the outcome to be corrupted by artifacts due to the optimizer ignoring aliasing potential.
I maintain that C would be all the weaker if the language required me to code:
Imagine all the cases where any call to any function has to be policed for the potential aliasing access to any part of the structures being passed. Such a language would probably be unusable.
Obviously a hard-optimizing (i.e. non-compliant) compiler might make a complete hash of doit() if it doesn't recognize that ip might be an alias of a member in the middle of sp. That's irrelevant to this discussion.
Setting out when a compiler can (and cannot) make such assumptions is understood to be the reason why the standard needs to set very precise parameters around aliasing. That is, to give the optimizer some conditions to discount. In a low level language such as 'C' it could be reasonable (even desirable) to say that a suitably aligned pointer to an accessible valid bit pattern can be used to access a value.
It is absolutely established that sp->x in my OP is pointing to a properly aligned location holding a valid unsigned int.
The intelligent concerns are whether the compiler/optimizer agree that's then a legitimate way to access that location or ignorable as undefined behavior.
As the doit() example shows, it's absolutely established that a structure can be broken down and treated as individual objects which merely happen to have a special relationship.
This question appears to be about the circumstances when a set of members that happen to have that special relationship can have a structure 'laid over them'.
I think most people will agree that the program at the bottom of this answer performs valid, worthwhile functionality that, if associated with some I/O library, could 'abstract' a great deal of the work required to read and write structures. You might think there's a better way of doing it, but I'm not expecting many people to think it's an unreasonable approach.
It operates by exactly that means - it builds a structure member by member then accesses it through that structure.
I suspect some of the people who object to the code in the OP are more relaxed about this. Firstly, it operates on memory allocated from the free-store as 'un-typed' universally aligned storage. Secondly, it builds a whole structure. In the OP I'm pointing out that the rules (at least appear to) permit you to line up bits of a structure, and so long as you only dereference those bits everything is OK.
I somewhat share that attitude. I think the OP is slightly perverse and language stretching in a poorly written corner of the standard. Not something to put your shirt on.
However, I absolutely think it would be a mistake to forbid the techniques below, as that would rule out a logically very valid technique that recognizes structures can be built up from objects just as much as broken down into them.
However, I will say that something like this is the only thing I could come up with where this sort of approach seems worthwhile. But on the other hand, if you can't pull data apart AND/OR put it together then you quickly start to break the notion that C structures are POD: the possibly padded sum of their parts, nothing more, nothing less.
I think it's an interesting tension. C is intended to be that low level high level language and give the programmer almost direct access to machine operations and memory. That means the programmer can fulfill the arbitrary demands of hardware devices and write highly efficient code. However, if the programmer is given absolute control, such as my point about an 'if it fits it's OK' approach to aliasing, then the optimizer gets its game spoilt. So, weirdly, it's worth holding a little bit of performance back to return a dividend from the optimizer.
Section 6.5 of the C99 standard tries (and doesn't entirely succeed) to set that boundary out.
I think this text does not apply: sp->x has type unsigned int, which is not an aggregate or union type.
In your code there is no strict aliasing violation: it is OK to read unsigned int as unsigned int.
The struct might have different alignment requirements to the array, but other than that there is no problem.
Accessing via "an aggregate or union type" would be:
The answer to this question is covered in the proposal Fixing the rules for type-based aliasing, which, as we will see, was unfortunately not resolved in 2010 when the proposal was made; this is covered in the Hedquist, Batavia, November 2010 minutes. Therefore C11 does not contain a resolution to N1520, so this is an open issue.
N1520 opens saying (emphasis mine going forward):
And the relevant example that applies to this situation would be example 3, which is as follows:
The following discussion and proposed solutions are very long and hard to summarize, but they seem to end with a removal of the aforementioned bullet five and resolve the issue with adjustments to other parts of 6.5. But as noted above, the issues involved were not resolvable and I don't see a follow-up proposal.
So it would seem that the standard wording does permit the scenario the OP demonstrates, although my understanding is that this was unintentional. I would therefore avoid it, and it could potentially change in later standards to be non-conforming.