Please consider the following code:
typedef struct {
    int type;
} object_t;

typedef struct {
    object_t object;
    int age;
} person_t;

int age(object_t *object) {
    if (object->type == PERSON) {
        return ((person_t *)object)->age;
    } else {
        return 0;
    }
}
Is this legal code or is it violating the C99 strict aliasing rule? Please explain why it is legal/illegal.
The strict aliasing rule is about two pointers of different types referencing the same location in memory (ISO/IEC 9899:TC2). Although your example reinterprets the address of object_t object as an address of person_t, it does not reference a memory location inside object_t through the reinterpreted pointer, because age is located past the boundary of object_t. Since the memory locations referenced through the pointers are not the same, I'd say that it does not violate the strict aliasing rule. FWIW, gcc -fstrict-aliasing -Wstrict-aliasing=2 -O3 -std=c99 seems to agree with that assessment and does not produce a warning.
This is not enough to decide that it's legal code, though: your example makes an assumption that the address of a nested structure is the same as that of its outer structure. Incidentally, this is a safe assumption to make according to the C99 standard:
6.7.2.1-13. A pointer to a structure object, suitably converted, points to its initial member
The two considerations above make me think that your code is legal.
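To make the reasoning above concrete, here is a minimal sketch of the question's code in compilable form. The enum values and the helper demo_person_age are assumptions added for illustration (the question never defines PERSON); the asserts check the layout guarantee from 6.7.2.1-13 on which the cast relies.

```c
#include <assert.h>
#include <stddef.h>

/* PERSON is not defined in the question; these tag values are assumptions. */
enum { OTHER = 0, PERSON = 1 };

typedef struct {
    int type;
} object_t;

typedef struct {
    object_t object;   /* must remain the first member */
    int age;
} person_t;

int age(object_t *object) {
    if (object->type == PERSON) {
        return ((person_t *)object)->age;
    } else {
        return 0;
    }
}

/* Hypothetical helper: builds a person and queries it through the base pointer. */
int demo_person_age(void) {
    person_t p = { { PERSON }, 42 };

    /* The initial member sits at offset 0, so both addresses compare equal. */
    assert(offsetof(person_t, object) == 0);
    assert((void *)&p == (void *)&p.object);

    return age(&p.object);
}
```

Passing &p.object rather than a cast pointer keeps the call site free of casts while still handing age() the address of the enclosing person_t.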
The strict aliasing rule limits the types through which you may access an object (a region of memory). There are two places in the code where the rule might crop up: within age() and when calling age().
Within age, you have object to consider. ((person_t *)object)->age is an lvalue expression because it has an object type and it designates an object (a region of memory). However, the branch is only reached if object->type == PERSON, so (presumably) the effective type of the pointed-to object is person_t, hence the cast doesn't violate strict aliasing. In particular, strict aliasing allows:
- a type compatible with the effective type of the object,
When calling age(), you will presumably be passing an object_t* or a pointer to a type that descends from object_t: a struct that has an object_t as its first member. This is allowed as:
- an aggregate or union type that includes one of the aforementioned types among its members
Furthermore, the point of strict aliasing is to let the compiler avoid reloading values into registers. If an object is mutated via one pointer, anything pointed to by a pointer of an incompatible type is assumed to remain unchanged, and thus doesn't need to be reloaded. The code doesn't modify anything, so it shouldn't be affected by the optimization.
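The optimization described above can be sketched roughly as follows. The function names store_both and demo_store_both are hypothetical, and the comment describes what a compiler is permitted to assume, not what any particular compiler is guaranteed to do.

```c
#include <assert.h>

/* int and float are incompatible types, so the compiler may assume
   that *i and *f never refer to the same object. */
int store_both(int *i, float *f) {
    *i = 1;
    *f = 2.0f;
    return *i;   /* may be folded to "return 1" without reloading *i */
}

/* Calls store_both with two distinct objects, which is well defined. */
int demo_store_both(void) {
    int i = 0;
    float f = 0.0f;
    return store_both(&i, &f);
}
```

Calling store_both with both pointers aimed at the same storage would be the kind of access the rule forbids; the sketch deliberately avoids doing so.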
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
As an add-on to the accepted answer, here is the full citation from the standard, with the important part highlighted that the other answer omitted, and one more:
6.7.2.1-13: Within a structure object, the non-bit-field members and the units in
which bit-fields reside have addresses that increase in the order in
which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa.
There may be unnamed padding within a structure object, but not at its
beginning.
6.3.2.3-7: A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the
resulting pointer is not correctly aligned for the pointed-to type,
the behavior is undefined. Otherwise, when converted back again, the
result shall compare equal to the original pointer. [...]
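The two paragraphs quoted above can be checked with a small sketch, assuming the struct definitions from the question. The function round_trip_ok is a hypothetical name introduced here for illustration.

```c
#include <assert.h>

typedef struct { int type; } object_t;
typedef struct { object_t object; int age; } person_t;

/* Returns 1 if person_t* -> object_t* -> person_t* yields the original pointer. */
int round_trip_ok(void) {
    person_t p = { { 1 }, 30 };

    object_t *base = (object_t *)&p;     /* points to the initial member (6.7.2.1-13) */
    person_t *back = (person_t *)base;   /* "and vice versa" */

    return base == &p.object && back == &p && back->age == 30;
}
```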
I find your example to be a perfect place for a void pointer:
int age(void *object) {
Why? Because your obvious intention is to pass different "object" types to such a function, and it gets the information according to the encoded type. In your version, you need a cast each time you call the function: age((object_t*)person);. The compiler will not complain when you give the wrong pointer to it, so there is no type safety involved anyway. Then you can just as well use a void pointer and avoid the cast when calling the function.
Alternatively you could call the function with age(&person->object), of course. Each time you call it.
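A minimal sketch of the void pointer variant might look like this. It assumes the question's struct definitions, an assumed PERSON tag value, and that every pointer passed in refers to a struct whose first member is an object_t; demo_void_age is a hypothetical helper.

```c
#include <assert.h>

enum { PERSON = 1 };   /* assumed tag value, not defined in the question */

typedef struct { int type; } object_t;
typedef struct { object_t object; int age; } person_t;

/* Accepts any struct whose first member is an object_t. */
int age(void *object) {
    object_t *base = object;   /* void* converts implicitly in C */
    if (base->type == PERSON) {
        return ((person_t *)object)->age;
    }
    return 0;
}

/* Hypothetical helper showing a call site without a cast. */
int demo_void_age(void) {
    person_t p = { { PERSON }, 27 };
    return age(&p);
}
```

The trade-off is the one described above: the call site needs no cast, but the compiler cannot reject a pointer to an unrelated type.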
One acceptable way that is explicitly condoned by the standard is to make a union of structs with identical initial segment, like so:
struct tag { int value; };
struct obj1 { int tag; Foo x; Bar y; };
struct obj2 { int tag; Zoo z; Car w; };

typedef union object_
{
    struct tag tag;    /* union members need names to be accessible */
    struct obj1 o1;
    struct obj2 o2;
} object_t;
Now you can pass an object_t * p and examine p->tag.value with impunity, and then access the desired union member.
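As a rough illustration of this pattern, the sketch below replaces the undefined Foo, Bar, Zoo and Car payloads with plain int and double fields. The member names o1 and o2, the tag constants, and the functions describe and demo_union are all assumptions made for the example.

```c
#include <assert.h>

enum { TAG_OBJ1 = 1, TAG_OBJ2 = 2 };   /* hypothetical tag values */

struct tag  { int value; };
struct obj1 { int tag; int x; };       /* int x stands in for the Foo/Bar payload */
struct obj2 { int tag; double z; };    /* double z stands in for the Zoo/Car payload */

typedef union object_ {
    struct tag  tag;
    struct obj1 o1;
    struct obj2 o2;
} object_t;

/* Inspects the shared leading int, then reads the matching union member. */
int describe(const object_t *p) {
    switch (p->tag.value) {
    case TAG_OBJ1: return p->o1.x;
    case TAG_OBJ2: return (int)p->o2.z;
    default:       return -1;
    }
}

/* Hypothetical helper: stores an obj1 and reads it back through the tag. */
int demo_union(void) {
    object_t o;
    o.o1.tag = TAG_OBJ1;
    o.o1.x = 7;
    return describe(&o);
}
```

Reading p->tag.value after writing o.o1.tag relies on the common initial sequence shared by the structs in the union, which is why every struct starts with an int.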