typecasting with virtual functions

Posted 2019-04-26 14:45

Question:

In the code below, pC == pA:

class A
{
};

class B : public A
{
public:
    int i;
};

class C : public B
{
public:
    char c;
};

int main()
{
    C* pC = new C;
    A* pA = (A*)pC;

    return 0;
}

But when I add a pure virtual function to B and implement it in C, pA != pC:

class A
{
};

class B : public A
{
public:
    int i;
    virtual void Func() = 0;
};

class C : public B
{
public:
    char c;
    void Func() {}
};

int main()
{
    C* pC = new C;
    A* pA = (A*)pC;

    return 0;
}

Why is pA not equal to pC in this case? Don't they both still point to the same "C" object in memory?

Answer 1:

You're seeing a different value for your pointer because the new virtual function is causing the injection of a vtable pointer into your object. VC++ is putting the vtable pointer at the beginning of the object (which is typical, but purely an internal detail).

Let's add a new field to A so that it's easier to explain.

class A {
public:
    int a;
};
// other classes unchanged

Now, in memory, a standalone A and a pointer pA to it look something like this:

pA --> | a      |          0x0000000

Once you add B and C into the mix, you end up with this:

pC --> | vtable |          0x0000000
pA --> | a      |          0x0000004
       | i      |          0x0000008
       | c      |          0x000000C

As you can see, pA points to the data after the vtable pointer, because A doesn't know anything about the vtable or how to use it, or even that it's there. pC does know about the vtable, so it points directly at the vtable pointer at the start of the object, which simplifies its use.
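
The offset shown in the diagram can be measured directly. Below is a minimal sketch, assuming the modified A with the `int a` field from above; `offset_of_A_in_C` is a made-up helper name, and the number it returns depends on the compiler and target (the diagram corresponds to 4 on 32-bit VC++). Note that `pA == pC` still evaluates to true in code, because the comparison first converts pC to A*; the difference only shows up in the raw addresses, for example in a debugger.

```cpp
#include <cassert>
#include <cstddef>

class A {
public:
    int a;
};

class B : public A {
public:
    int i;
    virtual void Func() = 0;
};

class C : public B {
public:
    char c;
    void Func() {}
};

// Hypothetical helper: byte distance from the start of a C object
// to the A subobject inside it. The value is implementation-specific.
std::ptrdiff_t offset_of_A_in_C() {
    C obj;
    C* pC = &obj;
    A* pA = pC;  // derived-to-base conversion, may adjust the address

    // The comparison converts pC to A* first, so the two compare equal
    // even when the raw addresses differ.
    assert(pA == pC);

    return reinterpret_cast<const char*>(pA) -
           reinterpret_cast<const char*>(pC);
}
```

With a vtable-first layout the result is typically the size of one pointer. With the original empty A, a compiler may place the empty base at offset 0, in which case the result can be 0 and the two pointers show the same address.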



Answer 2:

A pointer to an object is convertible to a pointer to base object and vice versa, but the conversion doesn't have to be trivial. It's entirely possible, and often necessary, that the base pointer has a different value than the derived pointer. That's why you have a strong type system and conversions. If all pointers were the same, you wouldn't need either.
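
To illustrate a non-trivial conversion, here is a small sketch using the classes from the question plus the `int a` field introduced in the first answer. The function name is made up for this example. static_cast applies whatever adjustment the layout needs in both directions, while reinterpret_cast keeps the address as-is and so can yield a pointer that does not actually point at the A subobject.

```cpp
#include <cassert>

class A { public: int a; };
class B : public A { public: int i; virtual void Func() = 0; };
class C : public B { public: char c; void Func() {} };

// Hypothetical helper: returns true if the A subobject lives at a
// different raw address than the whole C object. Whether it does is
// up to the compiler's layout.
bool conversion_changes_address() {
    C obj;
    C* pC = &obj;

    A* pA = static_cast<A*>(pC);        // adjusted to point at the A subobject
    assert(static_cast<C*>(pA) == pC);  // the downcast undoes the adjustment

    // reinterpret_cast performs no adjustment. The result is only
    // compared here, never dereferenced.
    A* raw = reinterpret_cast<A*>(pC);
    assert(static_cast<void*>(raw) == static_cast<void*>(pC));

    return static_cast<void*>(pA) != static_cast<void*>(pC);
}
```

This is why the compiler needs to know both types involved in the cast: the same source pointer can map to different target addresses depending on which base is requested.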



Answer 3:

Here are my assumptions, based on the question.

1) You have a case where you cast from a C to an A and you get the expected behaviour.
2) You added a virtual function, and that cast no longer works: after the cast to A you can no longer pull data from A directly, and instead get data that makes no sense to you.

If these assumptions are true, the problem you are running into is the insertion of the virtual table pointer in B. The data in the class is no longer laid out exactly like the data in the base class, because the class has gained extra bytes (the virtual table pointer) that are hidden from you. A fun test would be to check sizeof and observe the growth caused by these hidden bytes.
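
The sizeof test could look like the sketch below. `BPlain`, `BVirtual`, and `hidden_bytes` are names invented for this example, and A is given an `int a` field for illustration. The exact sizes depend on the compiler and on padding, so no specific numbers are claimed; the only assumption is a vtable-based implementation, which is what mainstream compilers use.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a; };

// Same data members, without and with a virtual function.
struct BPlain : A { int i; };
struct BVirtual : A { int i; virtual void Func() = 0; };

// Hypothetical helper: number of extra bytes the virtual version carries.
std::size_t hidden_bytes() {
    // Assumes the compiler stores a vtable pointer in the object.
    assert(sizeof(BVirtual) > sizeof(BPlain));
    return sizeof(BVirtual) - sizeof(BPlain);
}
```

Printing `sizeof(BPlain)`, `sizeof(BVirtual)`, and `hidden_bytes()` on your own toolchain shows how much room the vtable pointer (plus any alignment padding) takes.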

To resolve this you should not cast directly from A to C to harvest data. You should add a getter function that is in A and inherited by B and C.
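
A minimal sketch of the getter approach, assuming A holds an `int a` field. `GetA`, `read_a`, and `demo` are hypothetical names, not something from the original code. Because the call goes through a properly converted A reference, the compiler handles any pointer adjustment and the caller never depends on the layout.

```cpp
#include <cassert>

class A {
public:
    int a = 0;
    int GetA() const { return a; }  // hypothetical getter, inherited by B and C
};

class B : public A {
public:
    int i = 0;
    virtual void Func() = 0;
};

class C : public B {
public:
    char c = 0;
    void Func() {}
};

// Works for any object derived from A. The derived-to-base conversion
// happens implicitly at the call site.
int read_a(const A& base) { return base.GetA(); }

int demo() {
    C obj;
    obj.a = 42;
    assert(read_a(obj) == 42);
    return read_a(obj);
}
```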

Given your update in the comments, I think you should read this; it explains virtual tables and the memory layout, and how both are compiler dependent. That link explains in more detail what I described above, and gives examples of the pointers having different values. Really, I had misunderstood WHY you were asking the question, but it seems the information is still what you wanted. The cast from C to A takes the virtual table into account at this point (note C-8 is 4, which on a 32 bit system would be the size of the address needed for the virtual table, I believe).