Being asked to describe what a virtual function is seems to be one of the most common questions in interviews assessing basic C++ knowledge. However, after several years programming C++, I still have the uncomfortable feeling that I don't really understand how best to define what they are.
If I consult wikipedia, I see the definition of virtual function is:
"In object-oriented programming, a virtual function or virtual method is a function or method whose behaviour can be overridden within an inheriting class by a function with the same signature"
This definition seems simple and elegant, and not C++ specific. But to me it doesn't seem to capture the concept of a virtual function in C++, since surely a non-virtual function can also be overridden within an inheriting class by a function with the same signature.
If I'm asked to describe what a virtual function is informally, I say something about pointers like "it's a method such that when you call it via a base class pointer, the version of it defined in the derived class is called instead, if the pointer actually points to an instance of a derived class". This doesn't seem like a very elegant description of the concept. I know that people say this is how "polymorphism" is achieved in C++ (polymorphism, as far as I understand, roughly being the whole idea of organizing objects into hierarchies), but I don't know of a fancier way to understand or explain the mechanism than to go through the example with the pointers.
I guess I'm confused about whether the "pointer" description of virtual functions is something fundamental to their definition, or just something incidental to their implementation in C++.
The Idea
The basic difference between virtual and non virtual method is when you bind the name of the method to the actual method implementation. A virtual method is bound at runtime based on the type. A non virtual function is bound at compile time.
Slightly more formerly:
A virtual method is one that can be overridden in a derived type.
Such that when the virtual method is called (via pointer or reference) runtime binding is applied to select the version of the method that is defined in the most derived version; based on the type of the actual object (being pointed at or referenced).
C++ notes
Note: A method is virtual even if you do not use the virtual keyword if an ancestor declares a method with the same signature as virtual.
Let's say that Beta is a subclass of Alpha, and it can either create a new method area() or it can add a virtual one.
There isn't a difference if you are talking to a Beta pointer. But if you are talking to a Alpha* which points at a Beta object, you will get Alpha's method. Unless you declare the function as virtual.
The Beta subclass has its function dispatch table, which is a copy of Alpha's table but with extra Beta methods at the end. If Beta merely overrides a method, it will go in Beta's section of the dispatch table, so references to Alpha* will not see the new method. But, if the new method is virtual, it will go in the Alpha section of Beta's class table.
More to the point, let's say you have a Circle subclass of Shape. And let's say you have a pointer to a Shape object, x, which happens to be an instance of Circle. x->area() will only see the portion of the function table that relates to Shape. If Circle does a virtual area function, it will appear in the Shape section of the table. If Circle merely overrides area, then the area method will be placed in the Circle portion of the table and the Shape * x will not see the new function.
In C++ there is only one function table per class. This is a bit confusing for people who are used to scripting language where each object has it's own dispatch table. Scripting languages are extremely inefficient in that way. Imagine all the space taken up for each object.
A virtual function is one where the callee decides the behaviour. A non-virtual function is one where the caller decides the behaviour.
Is that concise enough?
Although pointers are necessary for using virtual functions in C++, I would argue that pointers are incidental to the idea, and explanations that don't depend on pointers are clearer. I think that templatetypedef and Charles_Bretana hit the main idea.
The reason that pointers seem to creep in to descriptions of virtual functions is that only pointers and references can have different run-time vs. compile-time types. A variable whose type is
class Foo
must contain aFoo
at run-time, so it doesn't matter whether a virtual function is used. But a variable whose type isclass Foo *
can point to any subclass of Foo, so this is the situation where virtual functions and non-virtual functions behave differently.Compiler aided mechanism of achieving dynamic polymorphism. Remember that C++ also supports static polymorphism via generic programming. And function pointers have always been there as the most primitive means of implementing dynamic polymorphism.