I'm reading the article "Virtual method table".
The article gives this example:
class B1 {
public:
void f0() {}
virtual void f1() {}
int int_in_b1;
};
class B2 {
public:
virtual void f2() {}
int int_in_b2;
};
class D : public B1, public B2 {
public:
void d() {}
void f2() {} // override B2::f2()
int int_in_d;
};
B2 *b2 = new B2();
D *d = new D();
The article describes the memory layout of object d as follows:
d:
D* d--> +0: pointer to virtual method table of D (for B1)
+4: value of int_in_b1
B2* b2--> +8: pointer to virtual method table of D (for B2)
+12: value of int_in_b2
+16: value of int_in_d
Total size: 20 Bytes.
virtual method table of D (for B1):
+0: B1::f1() // B1::f1() is not overridden
virtual method table of D (for B2):
+0: D::f2() // B2::f2() is overridden by D::f2()
The question is about d->f2(). The call to d->f2() passes a B2 pointer as the this pointer, so we have to do something like:
(*(*(d[+8]/*pointer to virtual method table of D (for B2)*/)[0]))(d+8) /* Call d->f2() */
Why should we pass a B2 pointer as the this pointer rather than the original D pointer? We are actually calling D::f2(). As I understand it, we should pass a D pointer as this to D::f2().
___update____
If we pass a B2 pointer as this to D::f2(), what happens when we want to access the members of class B1 in D::f2()? I believe the B2 pointer (this) looks like this:
d:
D* d--> +0: pointer to virtual method table of D (for B1)
+4: value of int_in_b1
B2* b2--> +8: pointer to virtual method table of D (for B2)
+12: value of int_in_b2
+16: value of int_in_d
It is already offset from the start of this contiguous memory layout. For example, if we want to access int_in_b1 inside D::f2(), I guess that at runtime it will do something like *(this+4) (this points to the same address as b2), which would point to int_in_b2 in B2 instead?
We cannot pass the D pointer to a virtual function overriding B2::f2(), because all overrides of the same virtual function must accept an identical memory layout.
Since B2::f2() expects B2's memory layout for the object passed to it as its this pointer, i.e.
b2:
+0: pointer to virtual method table of B2
+4: value of int_in_b2
the overriding function D::f2() must expect the same layout as well. Otherwise, the functions would no longer be interchangeable.
To see why interchangeability matters, consider this scenario:
class B2 {
public:
void test() { f2(); }
virtual void f2() {}
int int_in_b2;
};
...
B2 b2;
b2.test(); // Scenario 1
D d;
d.test(); // Scenario 2
B2::test() needs to call f2() in both scenarios. It has no additional information to tell it how the this pointer has to be adjusted when making these calls*. That is why the compiler passes the fixed-up pointer, so that test()'s call of f2() works with both D::f2() and B2::f2().
* Other implementations may very well pass this information; however, the multiple inheritance implementation discussed in the article does not.
Given your class hierarchy, an object of type B2 will have the following memory footprint.
+------------------------+
| pointer for B2 vtable |
+------------------------+
| int_in_b2 |
+------------------------+
An object of type D will have the following memory footprint.
+------------------------+
| pointer for B1 vtable |
+------------------------+
| int_in_b1 |
+------------------------+
| pointer for B2 vtable |
+------------------------+
| int_in_b2 |
+------------------------+
| int_in_d |
+------------------------+
When you use:
D* d = new D();
d->f2();
That call is the same as:
B2* b = new D();
b->f2();
f2() can be called using a pointer of type B2 or a pointer of type D. Given that the runtime must be able to work correctly with a pointer of type B2, it has to be able to dispatch the call to D::f2() using the appropriate function pointer in B2's vtable. However, when the call is dispatched to D::f2(), the original pointer of type B2 must somehow be offset properly so that in D::f2(), this points to a D, not a B2.
Here's your example code, altered a little to print pointer values and member data, which helps show how the value of this changes in the various functions.
#include <iostream>
struct B1
{
void f0() {}
virtual void f1() {}
int int_in_b1;
};
struct B2
{
B2() : int_in_b2(20) {}
void test_f2()
{
std::cout << "In B::test_f2(), B*: " << (void*)this << std::endl;
this->f2();
}
virtual void f2()
{
std::cout
<< "In B::f2(), B*: " << (void*)this
<< ", int_in_b2: " << int_in_b2 << std::endl;
}
int int_in_b2;
};
struct D : B1, B2
{
D() : int_in_d(30) {}
void d() {}
void f2()
{
// ======================================================
// If "this" is not adjusted properly to point to the D
// object, accessing int_in_d will lead to undefined
// behavior.
// ======================================================
std::cout
<< "In D::f2(), D*: " << (void*)this
<< ", int_in_d: " << int_in_d << std::endl;
}
int int_in_d;
};
int main()
{
std::cout << "sizeof(void*) : " << sizeof(void*) << std::endl;
std::cout << "sizeof(int) : " << sizeof(int) << std::endl;
std::cout << "sizeof(B1) : " << sizeof(B1) << std::endl;
std::cout << "sizeof(B2) : " << sizeof(B2) << std::endl;
std::cout << "sizeof(D) : " << sizeof(D) << std::endl << std::endl;
B2 *b2 = new B2();
D *d = new D();
b2->test_f2();
d->test_f2();
return 0;
}
Output of the program:
sizeof(void*) : 8
sizeof(int) : 4
sizeof(B1) : 16
sizeof(B2) : 16
sizeof(D) : 32
In B::test_f2(), B*: 0x1f50010
In B::f2(), B*: 0x1f50010, int_in_b2: 20
In B::test_f2(), B*: 0x1f50040
In D::f2(), D*: 0x1f50030, int_in_d: 30
When the actual object used to call test_f2() is a D, the value of this changes from 0x1f50040 in test_f2() to 0x1f50030 in D::f2(). That matches the sizes of B1, B2, and D: the offset of the B2 sub-object within a D object is 16 (0x10). The value of this in B2::test_f2() (printed as B::test_f2() in the output), a B2*, is reduced by 0x10 before the call is dispatched to D::f2().
I am going to guess that the value of the offset from D to B2 is stored in B2's vtable. Otherwise, there is no way a generic function dispatch mechanism can change the value of this properly before dispatching the call to the right virtual function.