If anybody answers my question, please don't tell me to use C++.
So, I'm making a small library in C that uses an object-oriented approach. I chose to use the less-common of the two main approaches to inheritance in C: copying the members of the base type to the beginning of the derived type. Something like this:
    struct base {
        int a;
        int b;
        char c;
    };

    struct derived {
        int a;
        int b;
        char c;
        unsigned int d;
        void (*virtual_method)(int, char);
    };
This approach is less popular than the other one (an instance of the base type as the first member of the derived type) because
- technically, there is no standardized guarantee that the first common members of the base and derived structs will have the same offsets. However, with the exception of cases when one of the structs is packed and the other is not, they will have the same offsets on most, if not all, known compilers.
- this approach's most serious flaw: it violates strict aliasing. Casting a pointer to a derived struct to its base type and then dereferencing the pointer is technically undefined behaviour.
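The offset concern in the first point can at least be checked at compile time. Below is a minimal sketch, assuming a C11 compiler (for `static_assert`); the `SAME_OFFSET` macro and `layouts_match` function are hypothetical names introduced only for illustration. It does nothing about the strict-aliasing problem in the second point.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */

struct base {
    int a;
    int b;
    char c;
};

struct derived {
    int a;
    int b;
    char c;
    unsigned int d;
    void (*virtual_method)(int, char);
};

/* Hypothetical helper: compilation fails if a shared member
 * sits at a different offset in the two structs. */
#define SAME_OFFSET(member) \
    static_assert(offsetof(struct base, member) == \
                  offsetof(struct derived, member), \
                  "offset of " #member " differs between base and derived")

SAME_OFFSET(a);
SAME_OFFSET(b);
SAME_OFFSET(c);

/* The same check at run time, for compilers without static_assert.
 * Returns 1 when every shared member lines up, 0 otherwise. */
static int layouts_match(void)
{
    return offsetof(struct base, a) == offsetof(struct derived, a)
        && offsetof(struct base, b) == offsetof(struct derived, b)
        && offsetof(struct base, c) == offsetof(struct derived, c);
}
```

If one of the structs were packed and the other not, the static assertions would stop the build instead of letting the mismatch go unnoticed.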
However, it also has its benefits compared to the other approach:
- Less verbosity: accessing a member of a derived struct that has been inherited is the same as accessing one that has not been inherited, instead of casting to the base type and then accessing the needed member;
- This is actually real inheritance and not composition;
- It is as easy to implement as the other approach, although a little preprocessor abuse may be needed;
- We can get a half-baked form of actual multiple inheritance, where we can inherit from several base types, but can cast to only one of them.
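The "preprocessor abuse" mentioned above could look roughly like this: the base members are written once in a macro and spliced into every struct that inherits them. This is only a sketch; `BASE_MEMBERS` and `sum_inherited` are hypothetical names, not part of any existing library.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* NULL */

/* Hypothetical macro: the single place where base's members are listed. */
#define BASE_MEMBERS \
    int a;           \
    int b;           \
    char c;

struct base {
    BASE_MEMBERS
};

struct derived {
    BASE_MEMBERS                        /* inherited members come first */
    unsigned int d;
    void (*virtual_method)(int, char);
};

/* Inherited members are accessed exactly like the struct's own members:
 * d->a rather than d->base.a. */
static int sum_inherited(const struct derived *d)
{
    assert(d != NULL);
    return d->a + d->b;
}
```

Keeping the member list in one macro means the base and derived declarations cannot drift apart when a member is added or reordered.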
I have been looking into possibilities for making my library compile and work correctly with compilers that enforce strict aliasing (like gcc) without the user needing to turn it off manually. Here are the possibilities that I've looked into:
Unions. These are, sadly, a no-no for several reasons:
- Verbosity returns! To follow the standard's rules for accessing the first common members of 2 structs via a union, one must (as of C99) explicitly use the union to access the first common members. We'd need special syntax to access members of each type in the union!
- Space. Consider an inheritance hierarchy. We have a type that we want to be able to cast to from each of its derived types. And we want to do it for every type. The only feasible union-employing solution I see is a union of the entire hierarchy that would have to be used to convert instances of a derived type to a base type. And it would have to be just as large as the most derived type in the entire hierarchy...
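For reference, this is roughly what the union route looks like for a two-type hierarchy. It is a sketch under the assumption that all access goes through the union object; `union any_object` and `read_a_via_base` are hypothetical names. It shows both drawbacks listed above: the extra `as_base`/`as_derived` step in every access, and a union as large as its largest member.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* NULL */

struct base {
    int a;
    int b;
    char c;
};

struct derived {
    int a;
    int b;
    char c;
    unsigned int d;
    void (*virtual_method)(int, char);
};

/* Hypothetical union covering the whole (two-type) hierarchy. */
union any_object {
    struct base as_base;
    struct derived as_derived;
};

/* Reads a member of the common initial sequence through the union,
 * which is the access pattern the standard permits while the union's
 * complete type is visible. */
static int read_a_via_base(const union any_object *u)
{
    assert(u != NULL);
    return u->as_base.a;
}
```

Every object that might ever be viewed as a `struct base` has to be declared as `union any_object`, so even a plain base instance occupies `sizeof(struct derived)` bytes.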
Using memcpy instead of direct dereferencing (like here). That looks like a nice solution. However, the function call incurs an overhead, and yes, once again, verbosity. As I understand, what memcpy does can also be done manually by casting a pointer to a struct to a pointer to char and then dereferencing it, something like this:
    (member_type)(*((char*)(&struct_pointer->member))) = new_value;
Gah, verbosity again. Well, this can be wrapped with a macro. But will that still work if we've cast our pointer to a pointer to an incompatible type, and then cast it to char* and dereferenced it? Like this:
    (member_type)(*((char*)(&((struct incompatible_type*)struct_pointer)->member))) = new_value;
Declaring all instances of types that we're going to cast as volatile. I wonder why this doesn't come up often. volatile is, as I understand, used to tell the compiler that the memory pointed to by a pointer may change unexpectedly, thus cancelling optimizations based on the assumption that a segment of pointed-to memory is not going to change, which is the cause of all strict-aliasing problems. This is, of course, still undefined behaviour; but can't it be a feasible cross-platform solution for "hackishly" disabling strict aliasing optimisations for certain instances of certain types?
Aside from the questions above, here are two more:
- Is something I said above erroneous?
- Have I missed something that could help in my case?
I don't think your idea about casting via char* is valid. The rule is:
A sub-expression of your expression is compatible but the overall expression isn't compatible.
I think the only realistic approach is composition:
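A minimal sketch of what composition could look like here, assuming the function pointer lives in the base struct and takes a `this` parameter; the helper names `derived_method`, `derived_init` and `call_virtual` are hypothetical and only illustrate the pattern.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* NULL */

struct base {
    int a;
    int b;
    char c;
    void (*virtual_method)(struct base *this, int x, char y);
};

struct derived {
    struct base base;   /* must be the first member */
    unsigned int d;
};

/* Hypothetical override: converts the base pointer back to derived.
 * That conversion is valid because base is derived's first member. */
static void derived_method(struct base *this, int x, char y)
{
    struct derived *self = (struct derived *)this;
    self->base.a = x;
    self->base.c = y;
    self->d += 1u;
}

static void derived_init(struct derived *obj)
{
    assert(obj != NULL);
    obj->base.a = 0;
    obj->base.b = 0;
    obj->base.c = '\0';
    obj->base.virtual_method = derived_method;
    obj->d = 0u;
}

/* Code that only knows about struct base can still dispatch. */
static void call_virtual(struct base *obj, int x, char y)
{
    assert(obj != NULL && obj->virtual_method != NULL);
    obj->virtual_method(obj, x, y);
}
```

Passing `&obj.base` to base-level code needs no cast at all, and the cast back inside the override stays within what the standard allows.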
I realize that's an intellectually unappealing way to achieve inheritance.
PS: I haven't put your virtual member function pointer in my derived class. It needs to be accessible from base so needs to be declared there (assuming it's a polymorphic function that exists for both base and derived). I've also added a this parameter to flesh out the model a touch.
memcpy should be the way to go. Don't worry about function call overhead. More often than not, there's none. memcpy is usually a compiler intrinsic, which means the compiler should inline the most efficient possible code for it, and it should know where it can optimize memcpies out.
Don't cast pointers to incompatible pointers and then dereference. That's a road towards undefined behavior.
If you accept statement expressions and gcc's ##__VA_ARGS__, you could have a MC_base_method(BaseType,BaseMethod,Derived_ptr,...) macro that calls a BaseMethod with Derived_ptr and ... correctly, as long as you can work with a copy of a struct as if it was the original (e.g., no pointers to the struct's own members).
Here's an example with some additional OOP-supporting macro sugar:
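One possible reading of such a macro is sketched below. It is not the answer's original listing: the macro body, the `base_add` method and the temporaries `mc_copy_`/`mc_result_` are assumptions made for illustration. It relies on GNU extensions (statement expressions, `__typeof__`, `##__VA_ARGS__`), assumes the base method returns a non-void value, and evaluates `Derived_ptr` twice.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcpy */

struct base {
    int a;
    int b;
    char c;
};

struct derived {
    int a;
    int b;
    char c;
    unsigned int d;
};

/* Copying sizeof(struct base) bytes back must not touch derived's own
 * members, so none of them may live in base's tail padding. */
static_assert(offsetof(struct derived, d) >= sizeof(struct base),
              "derived member overlaps base's tail padding");

/* Hypothetical base method taking a this pointer. */
static int base_add(struct base *this, int x)
{
    this->a += x;
    return this->a + this->b;
}

/* Copy the base part out, call the method on the copy, copy it back,
 * and yield the method's return value. */
#define MC_base_method(BaseType, BaseMethod, Derived_ptr, ...)           \
    ({                                                                    \
        BaseType mc_copy_;                                                \
        memcpy(&mc_copy_, (Derived_ptr), sizeof mc_copy_);                \
        __typeof__(BaseMethod(&mc_copy_, ##__VA_ARGS__)) mc_result_ =     \
            BaseMethod(&mc_copy_, ##__VA_ARGS__);                         \
        memcpy((Derived_ptr), &mc_copy_, sizeof mc_copy_);                \
        mc_result_;                                                       \
    })
```

A call would then look like `MC_base_method(struct base, base_add, &obj, 10)`, with `obj` being a `struct derived`.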
I consider it the compiler's job to optimize the memcpies out. However, if it doesn't and your structs are huge, you're screwed. Same if your structs contain pointers to their own members (i.e., if you can't work with a byte-for-byte copy as if it was the original).