I am trying to understand how COM specifies the layout of its objects so that a client that wants to use a COM object knows how to do it.
I've read that a COM object that implements multiple interfaces can do so in different ways, including nested classes or multiple inheritance.
My understanding is that both techniques would have to produce the same memory layout (conforming to the COM spec) so that a client that wants to use the COM object (for example in C) knows how to do it.
So my specific question is: is there a difference in memory layout between C++ objects implemented using multiple inheritance and those implemented using nested classes?
And could somebody point me to where a COM object layout is specified?
COM is completely agnostic of the memory layout of your object. All that it wants and needs is a table of function pointers when it calls IUnknown::QueryInterface(). How you implement it is completely up to you. MFC uses nested classes; just about anything else leverages the built-in support for multiple inheritance in the C++ compiler. The way the MSVC++ compiler implements it is completely compatible with what COM needs. This is no accident. Use the boilerplate code you see listed in books about COM that shows how to properly implement IUnknown.
The only "layout" specified in COM is the vtable (virtual function pointer table) associated with each interface. Every interface derives from IUnknown, so that whatever interface of the object a client has a pointer to, he can call QueryInterface to obtain a different interface on the same object.
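To make the vtable contract concrete, here is a minimal sketch of what a C-style client sees: an interface pointer points at a structure whose first field points at a table of function pointers, with the three IUnknown slots first. Everything named here (`IFoo`, `IFooVtbl`, `FooImpl`, `CreateFoo`, `UseFoo`, the integer IIDs) is a hypothetical stand-in so the example builds anywhere; real code would use `IUnknown`, `REFIID` and `HRESULT` from the Windows SDK, and the `__stdcall` calling convention on 32-bit x86, which is omitted here.

```cpp
struct IFoo;

// The table: the three IUnknown slots come first, then the interface's own methods.
struct IFooVtbl {
    long          (*QueryInterface)(IFoo* self, int iid, void** out);
    unsigned long (*AddRef)(IFoo* self);
    unsigned long (*Release)(IFoo* self);
    int           (*GetValue)(IFoo* self);
};

// All a client ever sees: a pointer to something whose first field is the table pointer.
struct IFoo {
    const IFooVtbl* lpVtbl;
};

// Implementation detail, invisible to the client and not constrained by COM.
struct FooImpl {
    IFoo          iface;  // first member, so IFoo* and FooImpl* share an address
    unsigned long refs;
    int           value;
};

// Simplified interface IDs (assumption: real COM uses 128-bit GUIDs).
const int IID_IUnknown_demo = 0;
const int IID_IFoo_demo     = 1;

static long Foo_QueryInterface(IFoo* self, int iid, void** out) {
    if (iid == IID_IUnknown_demo || iid == IID_IFoo_demo) {
        *out = self;
        self->lpVtbl->AddRef(self);
        return 0;   // stand-in for S_OK
    }
    *out = nullptr;
    return -1;      // stand-in for E_NOINTERFACE
}

static unsigned long Foo_AddRef(IFoo* self) {
    return ++reinterpret_cast<FooImpl*>(self)->refs;
}

static unsigned long Foo_Release(IFoo* self) {
    FooImpl* impl = reinterpret_cast<FooImpl*>(self);
    unsigned long left = --impl->refs;
    if (left == 0) delete impl;
    return left;
}

static int Foo_GetValue(IFoo* self) {
    return reinterpret_cast<FooImpl*>(self)->value;
}

static const IFooVtbl kFooVtbl = {
    Foo_QueryInterface, Foo_AddRef, Foo_Release, Foo_GetValue
};

// Hypothetical factory: returns an interface pointer holding one reference.
IFoo* CreateFoo(int value) {
    FooImpl* impl = new FooImpl{ { &kFooVtbl }, 1, value };
    return &impl->iface;
}

// A client call in the C style: go through the table, pass the pointer back in.
int UseFoo(IFoo* p) {
    return p->lpVtbl->GetValue(p);
}
```

The client code in `UseFoo` never touches `refs` or `value` directly, which is why their placement is the implementer's business.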
There is no mandated layout for objects. Indeed, the whole idea of an object in COM is very different from a class instance in an OO language: the only way to know if two interfaces are exposed by the same COM object is to call QueryInterface for the IUnknown interface on both of them - if and only if they return the same interface pointer, they are interfaces to the same object.
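The identity rule can be sketched as a small helper. This is only an illustration under simplifying assumptions: `IUnknownLike`, `Iid`, `IA`, `IB`, `Widget` and `SameObject` are made-up names standing in for the real `IUnknown` machinery, and the integer return codes stand in for `HRESULT` values.

```cpp
// Hypothetical stand-ins; on Windows use IUnknown, REFIID and HRESULT instead.
enum Iid { IID_Unknown, IID_A, IID_B };

struct IUnknownLike {
    virtual long          QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct IA : IUnknownLike { virtual int A() = 0; };
struct IB : IUnknownLike { virtual int B() = 0; };

// One object exposing two interfaces through multiple inheritance.
class Widget final : public IA, public IB {
    unsigned long refs_ = 1;
public:
    long QueryInterface(Iid iid, void** out) override {
        if (iid == IID_Unknown) {
            // Always hand out the same pointer for the "unknown" identity.
            *out = static_cast<IUnknownLike*>(static_cast<IA*>(this));
        } else if (iid == IID_A) {
            *out = static_cast<IA*>(this);
        } else if (iid == IID_B) {
            *out = static_cast<IB*>(this);
        } else {
            *out = nullptr;
            return -1;  // stand-in for E_NOINTERFACE
        }
        AddRef();
        return 0;       // stand-in for S_OK
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    int A() override { return 1; }
    int B() override { return 2; }
};

// Two interface pointers belong to the same object if and only if
// asking each for the "unknown" interface yields the same pointer.
bool SameObject(IUnknownLike* x, IUnknownLike* y) {
    void* ux = nullptr;
    void* uy = nullptr;
    if (x->QueryInterface(IID_Unknown, &ux) != 0) return false;
    if (y->QueryInterface(IID_Unknown, &uy) != 0) {
        static_cast<IUnknownLike*>(ux)->Release();
        return false;
    }
    bool same = (ux == uy);
    static_cast<IUnknownLike*>(ux)->Release();
    static_cast<IUnknownLike*>(uy)->Release();
    return same;
}
```

Comparing the two interface pointers directly would not work here, because an `IA*` and an `IB*` to the same `Widget` generally hold different addresses.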
This is quite a flexible idea:
- It is possible, for example, to have COM objects with only part of their internal state loaded into memory: other parts of their state may be lazily loaded or allocated as further interfaces are requested.
- A COM object's state may be spread across several non-contiguous memory regions.
I don't believe a single COM interface can have multiple inheritance, but a class may implement multiple interfaces through multiple inheritance. So the multiple inheritance layout is irrelevant - each interface will have a unique layout, and it is up to the compiler to provide a pointer to the proper layout.
For single inheritance, compilers in practice place the parent class's members at the front, followed by the child's. The standard leaves the placement of base class subobjects largely unspecified, but this is irrelevant here anyway since interfaces don't have data. The standard also says nothing about the existence or layout of vtables, but for polymorphism to work through a parent pointer the vtable has to be laid out the same way: parent entries first, child entries second.
You will discover a surprising fact if you implement multiple interfaces through multiple inheritance. As you cast a pointer to your class object from one interface to another, the address will change! This is because the different interfaces (vtables) must match the interface declaration, so there must be different layouts. These layouts are all contained within the same object, but the compiler does pointer manipulations when casting to get it to the proper subset.
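The address change is easy to observe. The following is a minimal sketch with made-up names (`IFirst`, `ISecond`, `Both`, `ByteOffset`); the IUnknown methods are left out to keep it short, and the exact offset depends on the compiler and platform.

```cpp
#include <cstddef>

// Two unrelated interfaces: pure virtual functions only, no data.
struct IFirst  { virtual int First() = 0; };
struct ISecond { virtual int Second() = 0; };

// One class implementing both through multiple inheritance.
struct Both final : IFirst, ISecond {
    int First() override  { return 1; }
    int Second() override { return 2; }
};

// Byte distance between two views of the same object.
std::ptrdiff_t ByteOffset(const void* from, const void* to) {
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}
```

With `Both obj; IFirst* f = &obj; ISecond* s = &obj;`, the value of `ByteOffset(f, s)` is nonzero (typically the size of one pointer), yet `static_cast<Both*>(s) == &obj` still holds because the compiler adjusts the pointer back on the way down.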
If there are virtual functions involved in the mix, in particular if the most derived class adds any of its own, then the memory layout of both approaches will differ.
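For comparison, here is a rough sketch of the nested-class style, assuming made-up names (`IShape`, `IName`, `Square`, `AsShape`, `AsName`) and again leaving out the IUnknown methods. The interface pointers handed out are addresses of member objects rather than of base subobjects, so the object as a whole is laid out differently, but each pointer still leads to a vtable matching its interface, which is all a client relies on.

```cpp
struct IShape { virtual int Area() = 0; };
struct IName  { virtual const char* Name() = 0; };

class Square {
    int side_;

    // Each interface is implemented by a nested class that forwards to its owner.
    struct ShapePart : IShape {
        Square* owner = nullptr;
        int Area() override { return owner->side_ * owner->side_; }
    } shape_;

    struct NamePart : IName {
        Square* owner = nullptr;
        const char* Name() override { return "square"; }
    } name_;

public:
    explicit Square(int side) : side_(side) {
        shape_.owner = this;
        name_.owner  = this;
    }
    // Copying would leave the owner pointers aimed at the source object.
    Square(const Square&) = delete;
    Square& operator=(const Square&) = delete;

    // Interface pointers are addresses of members, not results of a cast.
    IShape* AsShape() { return &shape_; }
    IName*  AsName()  { return &name_; }
};
```

Note that the outer `Square` has no virtual functions of its own here, so it carries no vtable pointer besides the ones inside its two members.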