Pure virtual functions may not have an inline defi

Posted 2020-01-25 05:37

Question:

Pure virtual functions are those member functions that are virtual and have the pure-specifier (= 0).

Clause 10.4 paragraph 2 of C++03 tells us what an abstract class is and, as a side note, the following:

[Note: a function declaration cannot provide both a pure-specifier and a definition —end note] [Example:

struct C {
virtual void f() = 0 { }; // ill-formed
};

—end example]

For those who are not very familiar with the issue: pure virtual functions can have definitions, but the above-mentioned clause forbids such definitions from appearing inline (lexically in-class). (For reasons to define a pure virtual function, see, for example, this GotW.)
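To make the distinction concrete, here is a minimal sketch of a pure virtual function that still has a body, defined outside the class. The names Shape, Circle, describe and describe_via_base are made up for illustration, and the sketch assumes C++11 or later for override and = default.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string describe() const = 0;  // pure-specifier, no body here
};

// The definition is allowed, but only outside the class body.
std::string Shape::describe() const { return "shape"; }

struct Circle : Shape {
    std::string describe() const override {
        // The base body is reached through an explicitly qualified call.
        return "circle, which is a " + Shape::describe();
    }
};

// Virtual dispatch through a base reference selects the Circle override.
std::string describe_via_base(const Shape& s) { return s.describe(); }

void demo_out_of_class_definition() {
    Circle c;
    assert(describe_via_base(c) == "circle, which is a shape");
}
```

Shape stays abstract even though Shape::describe has a body; the definition only gives overriders something to call.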

Now for all other kinds and types of functions it is allowed to provide an in-class definition, and this restriction seems at first glance absolutely artificial and inexplicable. Come to think of it, it seems such on second and subsequent glances :) But I believe the restriction wouldn't be there if there weren't a specific reason for that.

My question is: does anybody know those specific reasons? Good guesses are also welcome.

Notes:

  • MSVC does allow PVFs to have inline definitions. So don't be surprised :)
  • the word inline in this question does not refer to the inline keyword. It is supposed to mean lexically in-class

Answer 1:

In the SO thread “Why pure virtual function is initialized by 0?” Jerry Coffin provided this quote from Bjarne Stroustrup’s The Design & Evolution of C++, section §13.2.3, where I've added some emphasis of the part I think is relevant:

The curious =0 syntax was chosen over the obvious alternative of introducing a new keyword pure or abstract because at the time I saw no chance of getting a new keyword accepted. Had I suggested pure, Release 2.0 would have shipped without abstract classes. Given a choice between a nicer syntax and abstract classes, I chose abstract classes. Rather than risking delay and incurring the certain fights over pure, I used the tradition C and C++ convention of using 0 to represent "not there." The =0 syntax fits with my view that a function body is the initializer for a function and also with the (simplistic, but usually adequate) view of the set of virtual functions being implemented as a vector of function pointers. [ … ]

So, when choosing the syntax Bjarne was thinking of a function body as a kind of initializer part of the declarator, and =0 as an alternate form of initializer, one that indicated “no body” (or in his words, “not there”).

It stands to reason that one cannot both indicate “not there” and have a body – in that conceptual picture.

Or, still in that conceptual picture, having two initializers.

Now, that's as far as my telepathic powers, google-fu and soft reasoning go. I surmise that nobody's been Interested Enough™ to formulate a proposal to the committee about having this purely syntactical restriction lifted, and to follow up with all the work that that entails. Thus it's still that way.



Answer 2:

You shouldn't have so much faith in the standardization committee. Not everything has a deep reason to explain it. Some things are so just because at first nobody thought otherwise, and afterwards nobody thought that changing them was important enough (I think that is the case here); for things old enough it could even be an artifact of the first implementation. Some are the result of evolution: there was a deep reason at one time, but the reason was removed and the initial decision was never reconsidered (that could also be the case here, where the initial decision was made because any definition of a pure function was forbidden). Some are the result of negotiation between different points of view, and the result lacks coherence, but this lack was deemed necessary to reach consensus.



Answer 3:

Good guesses... well, considering the situation:

  • it is legal to declare the function inline and provide an explicitly inline body (outside the class), so there's clearly no objection to the only practical implication of being declared inside the class.
  • I see no potential ambiguities or conflicts introduced in the grammar, so no logical reason for the exclusion of function definitions in situ.
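The first point can be sketched as follows: the body of a pure virtual function is declared inline explicitly and placed outside the class, which compilers accept. Widget, Button and weight are hypothetical names, and C++11 or later is assumed for override and = default.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct Widget {
    virtual ~Widget() = default;
    virtual int weight() const = 0;  // pure virtual, declared without a body
};

// Explicitly inline, but lexically outside the class: well-formed.
inline int Widget::weight() const { return 1; }

struct Button : Widget {
    int weight() const override { return Widget::weight() + 1; }
};

void demo_inline_out_of_class() {
    Button b;
    assert(b.weight() == 2);
}
```

So the inline-ness itself is not what the rule objects to; only the position of the body inside the class is rejected.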

My guess: the use for bodies for pure virtual functions was realised after the = 0 | { ... } grammar was formulated, and the grammar simply wasn't revised. It's worth considering that there are a lot of proposals for language changes / enhancements - including those to make things like this more logical and consistent - but the number that are picked up by someone and written up as formal proposals is much smaller, and the number of those the Committee has time to consider, and believes the compiler-vendors will be prepared to implement, is much smaller again. Things like this need a champion, and perhaps you're the first person to see an issue in it. To get a feel for this process, check out http://www2.research.att.com/~bs/evol-issues.html.



Answer 4:

Good guesses are welcome you say?

I think the = 0 at the declaration comes from having the implementation in mind. Most likely this definition means that you get a NULL entry in the RTTI's vtbl of the class information -- the location where, at runtime, the addresses of the member functions of a class are stored.

But actually, when you put a definition of the function in your *.cpp file, you introduce a name into the object file for the linker: an address in the *.o file where a specific function can be found.

The basic linker then does not need to know about C++ anymore. It can just link things together, even though you declared the function as = 0.

I think I read that what you described is possible, although I forget the behaviour :-)...



Answer 5:

Leaving destructors aside, implementations of pure virtual functions are a strange thing, because they never get called in the natural way. That is, if you have a pointer or reference to your Base class, the underlying object will always be some Derived that overrides the function, and that override will always get called.

The only way to actually get the implementation called is to use the Base::func() syntax from one of the derived class's overrides.

This actually, in some ways, makes it a better target for inlining, because at the point where the compiler wants to invoke it, it is always clear which function is being called.

Also, if implementations for pure virtual functions were forbidden, there would be an obvious workaround of some other (probably protected) non-virtual function in the Base class that you could just call in the regular way from your derived function. Of course the scope would be less limited in that you could call it from any function.

(By the way, I am under the assumption that Base::f() can only be called with this syntax from Derived::f() and not from Derived::anyOtherFunc(). Am I right with this assumption?).
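On the assumption in the parenthetical above: as far as I can tell it does not hold, because a qualified call such as Base::f() is ordinary name lookup with virtual dispatch switched off, and it is not tied to the overriding function. A minimal sketch with made-up names (Logger, FileLogger, level), assuming C++11 or later:

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct Logger {
    virtual ~Logger() = default;
    virtual int level() const = 0;
};

int Logger::level() const { return 10; }  // body of the pure virtual function

struct FileLogger : Logger {
    int level() const override { return 20; }

    // A different member function, not the override, calls the base body.
    int base_level_from_other_member() const { return Logger::level(); }
};

// Even code outside the hierarchy can make the qualified call,
// as long as the member is accessible.
int base_level_from_outside(const FileLogger& f) { return f.Logger::level(); }

void demo_qualified_calls() {
    FileLogger f;
    assert(f.level() == 20);
    assert(f.base_level_from_other_member() == 10);
    assert(base_level_from_outside(f) == 10);
}
```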

Pure virtual destructors are a different story, in a sense. Declaring the destructor pure is a technique used simply to make the base class abstract, so that nobody can create an instance of it, when there are no pure virtual functions elsewhere.
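A minimal sketch of that technique, with hypothetical names (Interface, Impl, base_dtor_calls) and assuming C++11 or later for static_assert and type traits. Unlike other pure virtual functions, a pure virtual destructor must have a definition, because every derived destructor calls it.

```cpp
#include <cassert>
#include <type_traits>

int base_dtor_calls = 0;  // illustration only: counts runs of the base destructor

struct Interface {
    virtual ~Interface() = 0;  // the only pure virtual member; makes the class abstract
};

// Mandatory out-of-class definition: derived destructors invoke it.
Interface::~Interface() { ++base_dtor_calls; }

struct Impl : Interface {};  // the implicit destructor overrides the pure one

static_assert(std::is_abstract<Interface>::value, "Interface cannot be instantiated");
static_assert(!std::is_abstract<Impl>::value, "Impl is concrete");

void demo_pure_virtual_destructor() {
    int before = base_dtor_calls;
    { Impl i; }  // destroying i runs Interface::~Interface as well
    assert(base_dtor_calls == before + 1);
}
```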

The answer to the actual question of "why" it is not permitted is really just because the standards committee said so, but my answer sheds some light on what we are trying to achieve anyway.