A followup to this post. Consider the following:
class C;
C foo();
That is a pair of valid declarations. C doesn't need to be fully defined when merely declaring a function. But if we were to add the following function:
class C;
C foo();
inline C bar() { return foo(); }
Then suddenly C needs to be a fully defined type. But with guaranteed copy elision, none of its members are required. There's no copy or even a move; the value is initialized elsewhere, and destroyed only in the context of the caller (of bar).
So why? What in the standard prohibits it?
Despite the number of answers and amount of commentary posted to this thread (which has answered all of my personal questions), I have decided to post an answer 'for the rest of us'. I didn't initially understand what the OP was getting at but now I do, so I thought I'd share. If you know all this and it bores you, dear reader, then please just move on.
@xskxzr and @hvd effectively answered the question, but @hvd's post especially is in standardese and assumes that readers know how return-by-value (and by extension RVO) works, which I imagine not everybody does. I thought I did, but I was missing an important detail (which, when you think it through, is actually pretty obvious, but still, I missed it).
So this post mainly focuses on that, so that we can all see why (a) the OP wondered why there was an issue compiling bar() in the first place, and then (b) subsequently realised the reason.
So, let's look at that code again. Given this (which is legal, even with an incompletely defined type):
class C;
C foo();
Why can't the compiler compile this (I have removed the inline because it is irrelevant):
C bar() { return foo(); }
The error message from gcc being:
Well, first up, the accepted answer quotes the relevant paragraph from the standard that explicitly forbids it, so no mystery there. But the OP (and indeed commenter Walter, who picked up on this straightaway) wanted to know why.
Well, at first that seemed obvious to me: space needs to be allocated for the function result by the caller, and it doesn't know how big the object is, so the compiler is in a quandary. But I was missing a trick, and that lies in the way return-by-value works.
Now for those that don't know, returning class objects by value works by the caller allocating space for the returned object on the stack and passing a pointer to it as a hidden parameter to the function being called, which then constructs the object, manipulates it, whatever.
However, this daisy-chains, so if we have the following code (where we fully define C before calling bar()):
then space for c is allocated before bar() is called, and the address of c is then passed as a hidden parameter to bar() and passed directly on to foo(), which finally constructs the object in the desired location. So, because bar() didn't actually do anything with this pointer (apart from pass it around), all it cares about is the pointer itself, and not what it points to.
Or does it? Well, actually, yes, it does.
When returning a class object by value, small objects are usually returned in a register (or a pair of registers) as an optimisation. The compiler can get away with doing this in the majority of cases where the object is small enough (more on that in a moment).
But now, bar() needs to know whether this is what foo() is going to do, and to do that it needs, for various reasons, to see the full definition of the class.
So, in summary, that's why the compiler needs a fully-defined type in order to call foo(), else it won't know what foo() will be expecting and so it doesn't know what code to generate. Not on most platforms anyway, end of story.
Notes:
1. I looked at gcc and there seem to be two (entirely logical) rules for determining whether a class object is returned in a register or pair of registers:
b. std::is_trivially_copyable<T>::value evaluates to true (maybe someone can find something in the standard about that).
2. In case any readers don't know, RVO relies on constructing the object in its final resting place (i.e. in the location allocated by the caller). This is because there are objects (such as some implementations of std::basic_string, I believe) that are sensitive to being moved around in memory, so you can't just construct them somewhere convenient to you and then memcpy them somewhere else.
3. If constructing the returned object in that final location is not possible (because of the way you coded the function returning the object), then RVO doesn't happen (how can it?); see live demo below (make_no_RVO()).
4. As a specific example of point 1b, if a small object contains data members that (might) point either to itself or to any of its data members, then returning it by value will get you into trouble if you don't declare it properly. Just adding an empty copy constructor will do, since then it is no longer trivially copyable. But then I guess that's true in general: don't hide important information from the compiler.
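The trivially-copyable point can be checked with a small sketch. The types Plain and SelfRef and the functions make_self_ref and pointer_survives_return are hypothetical names made up for illustration; they are not taken from the live demo. Unlike the empty copy constructor mentioned above, the one here re-points the internal pointer, an assumption made so that the result does not depend on whether a copy is elided.

```cpp
#include <type_traits>

// Hypothetical small aggregate: the compiler may copy it bitwise.
struct Plain {
    int a;
    int b;
};

// Hypothetical small type holding a pointer to one of its own members.
struct SelfRef {
    int value = 42;
    const int* p = &value;                  // points into this very object
    SelfRef() = default;
    // User-provided copy constructor: the type is no longer trivially
    // copyable, and p is re-pointed at the new object's own member.
    SelfRef(const SelfRef& other) : value(other.value), p(&value) {}
};

static_assert(std::is_trivially_copyable<Plain>::value,
              "Plain can be copied bitwise");
static_assert(!std::is_trivially_copyable<SelfRef>::value,
              "a user-provided copy constructor makes SelfRef non-trivial");

SelfRef make_self_ref() { return SelfRef(); }

// True if the returned object's pointer still refers to its own member.
bool pointer_survives_return() {
    SelfRef s = make_self_ref();
    return s.p == &s.value;
}
```

Without the user-provided copy constructor, SelfRef would be trivially copyable and a bitwise copy could leave p pointing at the source object, which is the trouble described in the last note.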
Live demo here. All comments on this post welcome, I will answer them to the best of my ability.
This has nothing to do with copy elision. The function foo is supposed to return a C value. As long as you just pass a reference or pointer to foo, it's OK. Once you try to call foo, as is the case in bar, the size of its arguments and return value must be at hand; the only valid way to know that is to present the full definition of the required type. Had the signature used a reference or a pointer, all the required information would be present and you could do without the full type definition. This approach has a name, pimpl (Pointer to IMPLementation), and it is widely used as a means of hiding details in closed-source library distros.
Guaranteed copy elision has exceptions, for compatibility reasons and/or efficiency. Trivially copyable types may be copied even where copy elision would otherwise be guaranteed. You're right that if this doesn't apply, then the compiler would be able to generate correct code without knowing any details of C, not even its size. But the compiler does need to know if this applies, and for that, it still needs the type to be complete.
According to https://timsong-cpp.github.io/cppwp/class.temporary :
The rule lies in [basic.lval]/9: