virtual method table for multiple-inheritance

2019-02-17 03:05发布

问题:

I'm reading this article "Virtual method table"

Example in the above article:

class B1 {
public:
  void f0() {}
  virtual void f1() {}
  int int_in_b1;
};

class B2 {
public:
  virtual void f2() {}
  int int_in_b2;
};

class D : public B1, public B2 {
public:
  void d() {}
  void f2() {}  // override B2::f2()
  int int_in_d;
};

B2 *b2 = new B2();
D  *d  = new D();

In the article, the author introduces that the memory layout of object d is like this:

          d:
D* d-->      +0: pointer to virtual method table of D (for B1)
             +4: value of int_in_b1
B2* b2-->    +8: pointer to virtual method table of D (for B2)
             +12: value of int_in_b2
             +16: value of int_in_d

Total size: 20 Bytes.

virtual method table of D (for B1):
  +0: B1::f1()  // B1::f1() is not overridden

virtual method table of D (for B2):
  +0: D::f2()   // B2::f2() is overridden by D::f2()

The question is about d->f2(). The call to d->f2() passes a B2 pointer as a this pointer so we have to do something like:

(*(*(d[+8]/*pointer to virtual method table of D (for B2)*/)[0]))(d+8) /* Call d->f2() */

Why should we pass a B2 pointer as the this pointer not the original D pointer??? We are actually calling D::f2(). Based on my understanding, we should pass a D pointer as this to D::f2() function.

___update____

If passing a B2 pointer as this to D::f2(), What if we want to access the members of B1 class in D::f2()?? I believe the B2 pointer (this) is shown like this:

          d:
D* d-->      +0: pointer to virtual method table of D (for B1)
             +4: value of int_in_b1
B2* b2-->    +8: pointer to virtual method table of D (for B2)
             +12: value of int_in_b2
             +16: value of int_in_d

It already has a certain offset of the beginning address of this contiguous memory layout. For example, we want to access b1 inside D::f2(), I guess in runtime, it will do something like: *(this+4) (this points to the same address as b2) which would points b2 in B????

回答1:

We cannot pass the D pointer to a virtual function overriding B2::f2(), because all overrides of the same virtual function must accept identical memory layout.

Since B2::f2() function expects B2's memory layout of the object being passed to it as its this pointer, i.e.

b2:
  +0: pointer to virtual method table of B2
  +4: value of int_in_b2

the overriding function D::f2() must expect the same layout as well. Otherwise, the functions would no longer be interchangeable.

To see why interchangeability matters consider this scenario:

class B2 {
public:
  void test() { f2(); }
  virtual void f2() {}
  int int_in_b2;
};
...
B2 b2;
b2.test(); // Scenario 1
D d;
d.test(); // Scenario 2

B2::test() needs to make a call of f2() in both scenarios. It has no additional information to tell it how this pointer has to be adjusted when making these calls*. That is why the compiler passes the fixed-up pointer, so test()'s call of f2 would work both with D::f2() and B2::f2().

* Other implementations may very well pass this information; however, multiple inheritance implementation discussed in the article does not do it.



回答2:

Given your class hierarchy, an object of type B2 will have the following memory footprint.

+------------------------+
| pointer for B2 vtable  |
+------------------------+
| int_in_b2              |
+------------------------+

An object of type D will have the following memory footprint.

+------------------------+
| pointer for B1 vtable  |
+------------------------+
| int_in_b1              |
+------------------------+
| pointer for B2 vtable  |
+------------------------+
| int_in_b2              |
+------------------------+
| int_in_d               |
+------------------------+

When you use:

D* d  = new D();
d->f2();

That call is the same as:

B2* b  = new D();
b->f2();

f2() can be called using a pointer of type B2 or pointer of type D. Given that the runtime must be able to correctly work with a pointer of type B2, it has to be able to correctly dispatch the call to D::f2() using the appropriate function pointer in B2's vtable. However, when the call is dispatched to D:f2() the original pointer of type B2 must somehow be offset properly so that in D::f2(), this points to a D, not a B2.

Here's your example code, altered a little bit to print useful pointer values and member data to help understand the changes to the value of this in various functions.

#include <iostream>

struct B1 
{
   void f0() {}
   virtual void f1() {}
   int int_in_b1;
};

struct B2 
{
   B2() : int_in_b2(20) {}
   void test_f2()
   {
      std::cout << "In B::test_f2(), B*: " << (void*)this << std::endl;
      this->f2();
   }

   virtual void f2()
   {
      std::cout
         << "In B::f2(), B*: " << (void*)this
         << ", int_in_b2: " << int_in_b2 << std::endl;
   }

   int int_in_b2;
};

struct D : B1, B2 
{
   D() : int_in_d(30) {}
   void d() {}
   void f2()
   {
      // ======================================================
      // If "this" is not adjusted properly to point to the D
      // object, accessing int_in_d will lead to undefined 
      // behavior.
      // ======================================================

      std::cout
         << "In D::f2(), D*: " << (void*)this
         << ", int_in_d: " << int_in_d << std::endl;
   }
   int int_in_d;
};

int main()
{
   std::cout << "sizeof(void*) : " << sizeof(void*) << std::endl;
   std::cout << "sizeof(int)   : " << sizeof(int) << std::endl;
   std::cout << "sizeof(B1)    : " << sizeof(B1) << std::endl;
   std::cout << "sizeof(B2)    : " << sizeof(B2) << std::endl;
   std::cout << "sizeof(D)     : " << sizeof(D) << std::endl << std::endl;

   B2 *b2 = new B2();
   D  *d  = new D();
   b2->test_f2();
   d->test_f2();
   return 0;
}

Output of the program:

sizeof(void*) : 8
sizeof(int)   : 4
sizeof(B1)    : 16
sizeof(B2)    : 16
sizeof(D)     : 32

In B::test_f2(), B*: 0x1f50010
In B::f2(), B*: 0x1f50010, int_in_b2: 20
In B::test_f2(), B*: 0x1f50040
In D::f2(), D*: 0x1f50030, int_in_d: 30

When the actual object used to call test_f2() is D, the value of this changes from 0x1f50040 in test_f2() to 0x1f50030 in D::f2(). That matches with sizeof B1, B2, and D. The offset of B2 sub-object of a D object is 16 (0x10). The value of this in B::test_f2(), a B*, is changed by 0x10 before the call is dispatched to D::f2().

I am going to guess that the value of the offset from D to B2 is stored in B2's vtable. Otherwise, there is no way a generic function dispatch mechanism can change the value of this properly before dispatching the call to the right virtual function.