The following code calls a virtual function foo via a pointer to a derived object. Will this call go through the vtable, or will it call B::foo directly?
If it goes via the vtable, what would be an idiomatic C++ way of making it call B::foo directly? I know that in this case I am always pointing to a B.
class A
{
public:
    virtual void foo() {}
};
class B : public A
{
public:
    virtual void foo() {}
};
int main()
{
    B* b = new B();
    b->foo();
}
Yes, it will use the vtable (only non-virtual methods bypass the vtable). To call B::foo() on b directly, call b->B::foo().
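To make the difference between the two call forms visible, here is a minimal sketch. The class C and the helpers call_virtual and call_qualified are hypothetical additions that are not in the question, and foo returns a string here only so that the result can be observed.

```cpp
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string foo() { return "A::foo"; }
};

struct B : A {
    std::string foo() override { return "B::foo"; }
};

// Hypothetical further-derived class, present only to show the difference.
struct C : B {
    std::string foo() override { return "C::foo"; }
};

// Ordinary virtual call: the function is chosen by the dynamic type of *b.
std::string call_virtual(B* b) { return b->foo(); }

// Qualified call: always B::foo, regardless of the dynamic type of *b.
std::string call_qualified(B* b) { return b->B::foo(); }
```

Note that the qualified form changes behaviour if the pointer ever refers to a class derived from B that overrides foo, so it is only equivalent to the virtual call when the object really is a B.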
Most compilers will be smart enough to eliminate the indirect call in that scenario, if you have optimization enabled. But only because you just created the object and the compiler knows the dynamic type; there may be situations when you know the dynamic type and the compiler doesn't.
As usual, the answer to this question is "if it is important to you, take a look at the emitted code". This is what g++ produces with no optimisations selected:
18 b->foo();
0x401375 <main+49>: mov eax,DWORD PTR [esp+28]
0x401379 <main+53>: mov eax,DWORD PTR [eax]
0x40137b <main+55>: mov edx,DWORD PTR [eax]
0x40137d <main+57>: mov eax,DWORD PTR [esp+28]
0x401381 <main+61>: mov DWORD PTR [esp],eax
0x401384 <main+64>: call edx
which is using the vtable. A direct call, produced by code like:
B b;
b.foo();
looks like this:
0x401392 <main+78>: lea eax,[esp+24]
0x401396 <main+82>: mov DWORD PTR [esp],eax
0x401399 <main+85>: call 0x40b2d4 <_ZN1B3fooEv>
This is the compiled code from g++ (4.5) with -O3
_ZN1B3fooEv:
rep
ret
main:
subq $8, %rsp
movl $8, %edi
call _Znwm
movq $_ZTV1B+16, (%rax)
movq %rax, %rdi
call *_ZTV1B+16(%rip)
xorl %eax, %eax
addq $8, %rsp
ret
_ZTV1B:
.quad 0
.quad _ZTI1B
.quad _ZN1B3fooEv
The only optimization it did was that it knew which vtable to use (on the b object). Otherwise "call *_ZTV1B+16(%rip)" would have been "movq (%rax), %rax; call *(%rax)".
So g++ is actually quite bad at optimizing virtual function calls.
The compiler can optimize away virtual dispatch and call the virtual function directly, or inline it, if it can prove that the behavior is the same. In the provided example, the compiler can easily throw away every line of code, so all you'll get is this:
int main() {}
I changed the code up a bit to give it a go myself, and to me it looks like it's dropping the vtable, but I'm not expert enough in asm to tell. I'm sure some commentators will set me right though :)
struct A {
    virtual int foo() { return 1; }
};
struct B : public A {
    virtual int foo() { return 2; }
};
int useIt(A* a) {
    return a->foo();
}
int main()
{
    B* b = new B();
    return useIt(b);
}
I then converted this code to assembly like this:
g++ -g -S -O0 -fverbose-asm virt.cpp
as -alhnd virt.s > virt.base.asm
g++ -g -S -O6 -fverbose-asm virt.cpp
as -alhnd virt.s > virt.opt.asm
And the interesting bits look to me like the 'opt' version is dropping the vtable. It looks like it's creating the vtable but not using it.
In the opt asm:
9:virt.cpp **** int useIt(A* a) {
89 .loc 1 9 0
90 .cfi_startproc
91 .LVL2:
10:virt.cpp **** return a->foo();
92 .loc 1 10 0
93 0000 488B07 movq (%rdi), %rax # a_1(D)->_vptr.A, a_1(D)->_vptr.A
94 0003 488B00 movq (%rax), %rax # *D.2259_2, *D.2259_2
95 0006 FFE0 jmp *%rax # *D.2259_2
96 .LVL3:
97 .cfi_endproc
and the base.asm version of the same:
9:virt.cpp **** int useIt(A* a) {
88 .loc 1 9 0
89 .cfi_startproc
90 0000 55 pushq %rbp #
91 .LCFI6:
92 .cfi_def_cfa_offset 16
93 .cfi_offset 6, -16
94 0001 4889E5 movq %rsp, %rbp #,
95 .LCFI7:
96 .cfi_def_cfa_register 6
97 0004 4883EC10 subq $16, %rsp #,
98 0008 48897DF8 movq %rdi, -8(%rbp) # a, a
10:virt.cpp **** return a->foo();
99 .loc 1 10 0
100 000c 488B45F8 movq -8(%rbp), %rax # a, tmp64
101 0010 488B00 movq (%rax), %rax # a_1(D)->_vptr.A, D.2263
102 0013 488B00 movq (%rax), %rax # *D.2263_2, D.2264
103 0016 488B55F8 movq -8(%rbp), %rdx # a, tmp65
104 001a 4889D7 movq %rdx, %rdi # tmp65,
105 001d FFD0 call *%rax # D.2264
11:virt.cpp **** }
106 .loc 1 11 0
107 001f C9 leave
108 .LCFI8:
109 .cfi_def_cfa 7, 8
110 0020 C3 ret
111 .cfi_endproc
On line 93 we see in the comments: _vptr.A, which I'm pretty sure means it's doing a vtable lookup. However, in the actual main function, it seems to be able to predict the answer and doesn't even call that useIt code:
16:virt.cpp **** return useIt(b);
17:virt.cpp **** }
124 .loc 1 17 0
125 0015 B8020000 movl $2, %eax #,
which I think is just saying: we know we're going to return 2, let's just put it in eax. (I re-ran the program asking it to return 200, and that line got updated as I would expect.)
Extra bit
So I complicated the program a bit more:
struct A {
    int valA;
    A(int value) : valA(value) {}
    virtual int foo() { return valA; }
};
struct B : public A {
    int valB;
    // The base class is always initialized first, so list it first.
    B(int value) : A(0), valB(value) {}
    virtual int foo() { return valB; }
};
int useIt(A* a) {
    return a->foo();
}
int main()
{
    A* a = new A(100);
    B* b = new B(200);
    int valA = useIt(a);
    int valB = useIt(b);
    return valA + valB;
}
In this version, the useIt code definitely uses the vtable in the optimized assembly:
13:virt.cpp **** int useIt(A* a) {
89 .loc 1 13 0
90 .cfi_startproc
91 .LVL2:
14:virt.cpp **** return a->foo();
92 .loc 1 14 0
93 0000 488B07 movq (%rdi), %rax # a_1(D)->_vptr.A, a_1(D)->_vptr.A
94 0003 488B00 movq (%rax), %rax # *D.2274_2, *D.2274_2
95 0006 FFE0 jmp *%rax # *D.2274_2
96 .LVL3:
97 .cfi_endproc
This time, the main function inlines a copy of useIt, but does actually do the vtable lookup.
What about C++11 and the 'final' keyword?
So I changed one line to:
virtual int foo() override final { return valB; }
and the compiler line to:
g++ -std=c++11 -g -S -O6 -fverbose-asm virt.cpp
My thinking was that telling the compiler it is a final override might allow it to skip the vtable.
Turns out it still uses the vtable.
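One likely reason is that useIt takes an A*, so the object behind the pointer could still be a plain A, and 'final' on B::foo says nothing about that. The keyword can only help where the static type is already B. A minimal sketch of that distinction follows; useItB is a hypothetical helper that is not in the program above, and whether a given compiler actually emits a direct call for it depends on the compiler version and flags, which I have not measured here.

```cpp
struct A {
    int valA;
    explicit A(int value) : valA(value) {}
    virtual ~A() = default;
    virtual int foo() { return valA; }
};

struct B : A {
    int valB;
    explicit B(int value) : A(0), valB(value) {}
    // final: no class derived from B may override foo again.
    int foo() override final { return valB; }
};

// Static type A*: the object may be an A, a B, or another class derived
// from A, so the call has to be resolved by dynamic type.
int useIt(A* a) { return a->foo(); }

// Static type B* (hypothetical helper): because B::foo is final, the only
// function this call can reach is B::foo, so a compiler is permitted to
// call it directly or inline it.
int useItB(B* b) { return b->foo(); }
```

Both functions return the same value for a B object; the difference is only in what the compiler is allowed to assume when generating the call.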
So my theoretical answer would be:
- I don't think there are any explicit "don't use the vtable" optimizations. (I searched through the g++ manpage for vtable, virt and the like, and found nothing.)
- But g++ with -O6 can do a lot of optimization on a simple program with obvious constants, to the point where it can predict the result and skip the call altogether.
- However, once things get complex (read: real), it's definitely doing vtable lookups pretty much every time you call a virtual function.