Pure virtual functions are those member functions that are virtual and have the pure-specifier ( = 0;
)
Clause 10.4 paragraph 2 of C++03 tells us what an abstract class is and, as a side note, the following:
[Note: a function declaration cannot provide both a pure-specifier and a definition —end note] [Example:
struct C {
virtual void f() = 0 { }; // ill-formed
};
—end example]
For those who are not very familiar with the issue, please note that pure virtual functions can have definitions but the above-mentioned clause forbids such definitions to appear inline (lexically in-class). (For uses of defining pure virtual functions you may see, for example, this GotW)
Now for all other kinds and types of functions it is allowed to provide an in-class definition, and this restriction seems at first glance absolutely artificial and inexplicable. Come to think of it, it seems such on second and subsequent glances :) But I believe the restriction wouldn't be there if there weren't a specific reason for that.
My question is: does anybody know those specific reasons? Good guesses are also welcome.
Notes:
- MSVC does allow PVF's to have inline definitions. So don't get surprised :)
- the word
inline
in this question does not refer to the inline keyword. It is supposed to mean lexically in-class
You shouldn't have so much faith in the standardization committee. Not everything has a deep reason to explain it. Something are so just because at first nobody thought otherwise and after nobody thought that changing it is important enough (I think it is the case here); for things old enough it could even be an artifact of the first implementation. Some are the result of evolution -- there was a deep reason at a time, but the reason was removed and the initial decision wasn't reconsidered again (it could be also the case here, where the initial decision was because any definition of the pure function was forbidden). Some are the result of negotiation between different POV and the result lacks coherence but this lack was deemed necessary to reach to consensus.
In the SO thread “Why pure virtual function is initialized by 0?” Jerry Coffin provided this quote from Bjarne Stroustrup’s The Design & Evolution of C++, section §13.2.3, where I've added some emphasis of the part I think is relevant:
So, when choosing the syntax Bjarne was thinking of a function body as a kind of initializer part of the declarator, and
=0
as an alternate form of initializer, one that indicated “no body” (or in his words, “not there”).It stands to reason that one cannot both indicate “not there” and have a body – in that conceptual picture.
Or, still in that conceptual picture, having two initializers.
Now, that's as far as my telepathic powers, google-foo and soft-reasoning goes. I surmise that nobody's been Interested Enough™ to formulate a proposal to the committee about having this purely syntactical restriction lifted, and following up with all the work that that entails. Thus it's still that way.
Good guesses... well, considering the situation:
My guess: the use for bodies for pure virtual functions was realised after the
= 0 | { ... }
grammar was formulated, and the grammar simply wasn't revised. It's worth considering that there are a lot of proposals for language changes / enhancements - including those to make things like this more logical and consistent - but the number that are picked up by someone and written up as formal proposals is much smaller, and the number of those the Committee has time to consider, and believes the compiler-vendors will be prepared to implement, is much smaller again. Things like this need a champion, and perhaps you're the first person to see an issue in it. To get a feel for this process, check out http://www2.research.att.com/~bs/evol-issues.html.Good guesses are welcome you say?
I think the
= 0
at the declaration comes from having the implementation in mind. Most likely this definition means, that you get aNULL
entry in the RTTI'svtbl
of the class information -- the location where at runtime addresses of the member functions of a class are stored.But actually, when put a definition of the function in your
*.cpp
file, you introduce a name into the object file for the linker: An address in the*.o
file where to find a specific function.The basic linker then does need to know about C++ anymore. It can just link together, even though you declared it as
= 0
.I think I read that it is possible what you described, although I forgot the behaviour :-)...
Leaving destructors aside, implementations of pure virtual functions are a strange thing, because they never get called in the natural way. i.e. if you have a pointer or reference to your Base class the underlying object will always be some Derived that overrides the function, and that will always get called.
The only way to actually get the implementation to be called is using the Base::func() syntax from one of the derived class's overloads.
This actually, in some ways, makes it a better target for inlining, as at the point where the compiler wants to invoke it, it is always clear which overload is being called.
Also, if implementations for pure virtual functions were forbidden, there would be an obvious workaround of some other (probably protected) non-virtual function in the Base class that you could just call in the regular way from your derived function. Of course the scope would be less limited in that you could call it from any function.
(By the way, I am under the assumption that
Base::f()
can only be called with this syntax fromDerived::f()
and not fromDerived::anyOtherFunc()
. Am I right with this assumption?).Pure virtual destructors are a different story, in a sense. It is used as a technique simply to prevent someone creating an instance of the derived class without there being any pure virtual functions elsewhere.
The answer to the actual question of "why" it is not permitted is really just because the standards committee said so, but my answer sheds some light on what we are trying to achieve anyway.