I'm writing a C library that uses some simple object-oriented inheritance much like this:
struct Base {
    int x;
};
struct Derived {
    struct Base base;
    int y;
};
And now I want to pass a Derived* to a function that takes a Base* much like this:
int getx(struct Base *arg) {
    return arg->x;
}
int main() {
    struct Derived d;
    return getx(&d);
}
This works, and is typesafe of course, but the compiler doesn't know this. Is there a way to tell the compiler that this is typesafe? I'm focusing just on GCC and clang here, so compiler-specific answers are welcome. I have vague memories of seeing some code that did this using __attribute__((inherits(Base))) or something of the sort, but my memory could be lying.
This is safe in C, except that you should cast the argument to struct Base *. The rule that prohibits aliasing (or, more precisely, that excludes it from being supported in standard C) is in C 2011 6.5, where paragraph 7 states:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
…
This rule prevents us from taking a pointer to float, converting it to pointer to int, and dereferencing the pointer to int to access the float as an int. (More precisely, it does not prevent us from trying, but it makes the behavior undefined.)
It might seem that your code violates this, since it accesses a Derived object using a Base lvalue. However, converting a pointer to Derived to a pointer to Base is supported by C 2011 6.7.2.1, where paragraph 15 states:
… A pointer to a structure object, suitably converted, points to its initial member…
So, when we convert the pointer to Derived to a pointer to Base, what we actually have is not a pointer to the Derived object using a different type than it is (which is prohibited) but a pointer to the first member of the Derived object using its actual type, Base, which is perfectly fine.
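A minimal sketch of that explicit conversion, reusing the question's struct Base, struct Derived and getx. The helper name getx_of_derived is hypothetical and exists only for illustration.

```c
struct Base {
    int x;
};

struct Derived {
    struct Base base;   /* must stay the first member for the cast below */
    int y;
};

int getx(struct Base *arg) {
    return arg->x;
}

/* Hypothetical helper: the explicit cast yields a pointer to the initial
   member of the Derived object, which really has type struct Base. */
int getx_of_derived(struct Derived *d) {
    return getx((struct Base *)d);
}
```

The cast is only meaningful while base remains the first member of struct Derived; if the member is moved, the cast still compiles but points at the wrong data.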
About the edit: Originally I stated function arguments would be converted to the parameter types. However, C 2011 6.5.2.2 paragraph 2 requires that each argument have a type that may be assigned to an object with the type of its corresponding parameter (with any qualifications like const removed), and 6.5.16.1 requires that, when assigning one pointer to another, they have compatible types (or meet other conditions not applicable here). Thus, passing a pointer to Derived to a function that takes a pointer to Base violates standard C constraints. However, if you perform the conversion yourself, it is legal. If desired, the conversion could be built into a preprocessor macro that calls the function, so that the code still looks like a simple function call.
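One way such a wrapper macro might look is sketched below. The macro name GETX is an assumption made for this example; it is not part of the question's code.

```c
struct Base {
    int x;
};

struct Derived {
    struct Base base;   /* first member, so the cast in GETX is valid */
    int y;
};

int getx(struct Base *arg) {
    return arg->x;
}

/* Hypothetical wrapper: performs the pointer conversion at the call site
   so callers can write GETX(&d) like an ordinary function call. */
#define GETX(p) getx((struct Base *)(p))
```

Note the trade-off: because the macro casts unconditionally, the compiler will no longer complain if a pointer to an unrelated type is passed, so this buys convenience rather than type safety.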
Pass the address of the base member (the truly type-safe option):
getx(&d.base);
Or use a void pointer:
int getx(void *arg) {
    struct Base *temp = arg;
    return temp->x;
}
int main() {
    struct Derived d;
    return getx(&d);
}
It works because C requires that there is never any padding before the first struct member. This won't increase type safety, but it removes the need for casting.
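Since the void pointer version silently depends on the base being the first member, a compile-time check can guard that layout. The following is a sketch assuming a C11 compiler (for _Static_assert); the assertion message is illustrative.

```c
#include <stddef.h>  /* offsetof */

struct Base {
    int x;
};

struct Derived {
    struct Base base;
    int y;
};

/* Fails the build if someone later moves base away from offset 0,
   which would break the void pointer conversion in getx. */
_Static_assert(offsetof(struct Derived, base) == 0,
               "struct Base must be the first member of struct Derived");

int getx(void *arg) {
    struct Base *temp = arg;   /* void * converts implicitly in C */
    return temp->x;
}
```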
As noted above by user694733, you are probably best off conforming to the standard and keeping type safety by using the address of the base field, as in (repeating for future reference)
struct Base {
    int x;
};
struct Derived {
    int y;
    struct Base b; /* look ma, not the first field! */
};
struct Derived d = {0}, *pd = &d;
void getx(struct Base *b);
and now despite the base not being the first field you can still do
getx (&d.b);
or if you are dealing with a pointer
getx(&pd->b).
This is a very common idiom. You have to be careful if the pointer is NULL, however, because &pd->b just computes
(struct Base*)((char*)pd + offsetof(struct Derived, b))
so &((struct Derived*)NULL)->b becomes
((struct Base*)offsetof(struct Derived, b)) != NULL.
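A small helper can make the NULL case explicit instead of relying on every caller to remember it. This is a sketch only; the function name derived_base is hypothetical.

```c
#include <stddef.h>  /* NULL */

struct Base {
    int x;
};

struct Derived {
    int y;
    struct Base b;   /* not the first member, so its offset is nonzero */
};

/* Hypothetical helper: maps a NULL Derived pointer to a NULL Base pointer
   rather than to a small non-NULL address derived from the member offset. */
struct Base *derived_base(struct Derived *pd) {
    return pd ? &pd->b : NULL;
}
```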
IMO it is a missed opportunity that C adopted anonymous structs but not the Plan 9 anonymous struct model, which is
struct Derived {
    int y;
    struct Base; /* look ma, no field name */
} d;
It allows you to just write getx(&d) and the compiler will adjust the Derived pointer to a base pointer i.e. it means exactly the same as getx(&d.b) in the example above. In other words it effectively gives you inheritance but with a very concrete memory layout model. In particular, if you insist on not embedding (== inheriting) the base struct at the top, you have to deal with NULL yourself. As you expect from inheritance it works recursively so for
struct TwiceDerived {
    struct Derived;
    int z;
} td;
you can still write getx(&td). Moreover, you may not need the getx as you can write d.x (or td.x or pd->x).
Finally, using the typeof GCC extension, you can write a little macro for downcasting (i.e. casting to a more derived struct)
#define TO(T, p) \
({ \
    typeof(p) nil = (T*)0; \
    (T*)((char*)(p) - ((char*)nil - (char*)0)); \
})
so you can do things like
struct Derived d = {0};
struct Base *pb = &d; /* adjusted to the embedded Base under -fplan9-extensions */
struct Derived *pd = TO(struct Derived, pb);
which is useful if you try to do virtual functions with function pointers.
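The TO macro above depends on GCC extensions. As a portable alternative (a different technique, not the author's macro), the same downcast can be sketched in standard C with offsetof, at the cost of naming the member explicitly. The names DOWNCAST, describe, derived_describe and call_describe are all hypothetical.

```c
#include <stddef.h>  /* offsetof */

struct Base {
    int x;
    int (*describe)(struct Base *self);   /* hypothetical "virtual" slot */
};

struct Derived {
    int y;
    struct Base b;   /* deliberately not the first member */
};

/* Step back from member M to the enclosing T object. Only valid when p
   really points at the M member of an existing T object. */
#define DOWNCAST(T, M, p) ((T *)((char *)(p) - offsetof(T, M)))

/* "Override" installed in the function pointer: recovers the Derived
   object from the Base pointer it receives. */
int derived_describe(struct Base *self) {
    struct Derived *d = DOWNCAST(struct Derived, b, self);
    return d->y + self->x;
}

/* Dispatches through the function pointer, knowing only about Base. */
int call_describe(struct Base *pb) {
    return pb->describe(pb);
}
```

With a struct Derived whose b.describe points at derived_describe, calling call_describe(&d.b) reaches the derived implementation even though the caller only sees a struct Base pointer.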
On GCC you can use and experiment with the Plan 9 extensions via -fplan9-extensions. Unfortunately, it does not seem to have been implemented in clang.