With guaranteed copy elision, why does the class n

Posted 2019-03-11 03:14

A followup to this post. Consider the following:

class C;
C foo();

That is a pair of valid declarations. C doesn't need to be fully defined when merely declaring a function. But if we were to add the following function:

class C;
C foo();
inline C bar() { return foo(); }

Then suddenly C needs to be a fully defined type. But with guaranteed copy elision, none of its members are required. There's no copy or even a move: the value is initialized elsewhere, and it is destroyed only in the context of the caller (of bar).

So why? What in the standard prohibits it?

4 answers
对你真心纯属浪费
#2 · 2019-03-11 03:37

Despite the number of answers and amount of commentary posted to this thread (which has answered all of my personal questions), I have decided to post an answer 'for the rest of us'. I didn't initially understand what the OP was getting at but now I do, so I thought I'd share. If you know all this and it bores you, dear reader, then please just move on.

@xskxzr and @hvd effectively answered the question, but @hvd's post especially is in standardese and assumes that readers know how return-by-value (and by extension RVO) works, which I imagine not everybody does. I thought I did, but I was missing an important detail (which, when you think it through, is actually pretty obvious, but still, I missed it).

So this post mainly focuses on that, so that we can all see why (a) the OP wondered why there was an issue compiling bar() in the first place, and then (b) subsequently realised the reason.

So, let's look at that code again. Given this (which is legal, even with an incomplete type):

class C;
C foo();

Why can't the compiler compile this (I have removed the inline because it is irrelevant):

C bar() { return foo(); }

The error message from gcc being:

error: return type 'class C' is incomplete

Well, first up, the accepted answer quotes the relevant paragraph from the standard that explicitly forbids it, so no mystery there. But the OP (and indeed commenter Walter, who picked up on this straightaway) wanted to know why.

Well, at first that seemed obvious to me: space needs to be allocated for the function result by the caller, and it doesn't know how big the object is, so the compiler is in a quandary. But I was missing a trick, and that lies in the way return-by-value works.

Now for those that don't know, returning class objects by value works by the caller allocating space for the returned object on the stack and passing a pointer to it as a hidden parameter to the function being called, which then constructs the object, manipulates it, whatever.
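As a rough illustration of that hidden parameter, here is a hand-written sketch of the lowering. Widget, make_widget_lowered and demo_lowered are made-up names for this example, and a real compiler does this at the ABI level rather than in source code, so treat it as a model and not as what any particular compiler emits.

```cpp
#include <new>
#include <string>

// Hypothetical class, assumed to be returned via a hidden pointer.
struct Widget {
    std::string name;
    explicit Widget(const char* n) : name(n) {}
};

// What the programmer writes.
Widget make_widget() { return Widget("w"); }

// Roughly what it is lowered to: the caller supplies raw storage and the
// callee constructs the result directly into it.
Widget* make_widget_lowered(void* result_slot) {
    return ::new (result_slot) Widget("w");
}

// The caller's side: reserve space, pass its address, use the object,
// then destroy it.
std::string demo_lowered() {
    alignas(Widget) unsigned char slot[sizeof(Widget)];
    Widget* w = make_widget_lowered(slot);
    std::string copy = w->name;
    w->~Widget();
    return copy;
}
```

Note that the caller needs sizeof(Widget) to reserve the slot, which is the "obvious" half of the argument.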

However, this daisy-chains, so if we have the following code (where we fully define C before calling bar()):

class C
{
public:
    int x;
};

C c = bar ();
c.x = 4;

then space for c is allocated before bar() is called, the address of c is passed as a hidden parameter to bar(), and bar() passes it directly on to foo(), which finally constructs the object in the desired location. So, because bar() doesn't actually do anything with this pointer (apart from passing it along), all it cares about is the pointer itself, and not what it points to.
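The daisy-chaining can be observed from inside the program. The sketch below assumes C++17 or later (guaranteed copy elision) and uses Tracked, a hypothetical stand-in for C whose user-provided copy constructor keeps it from being returned in registers.

```cpp
// Hypothetical stand-in for C: remembers where it was constructed.
struct Tracked {
    const Tracked* self;   // address seen by the constructor
    static int copies;     // number of copy-constructor calls
    Tracked() : self(this) {}
    // A copy keeps the ORIGINAL address, so any copy would be detectable.
    Tracked(const Tracked& other) : self(other.self) { ++copies; }
};
int Tracked::copies = 0;

Tracked foo() { return Tracked(); }  // constructs into the caller's slot
Tracked bar() { return foo(); }      // just forwards the hidden pointer

// True if the object was built directly in t, with no copies on the way.
bool built_in_place() {
    Tracked t = bar();
    return t.self == &t && Tracked::copies == 0;
}
```

Under those assumptions built_in_place() returns true: the constructor called inside foo() ran on the storage of t itself.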

Or does it? Well, actually, yes, it does.

When returning a class object by value, small objects are usually returned in a register (or a pair of registers) as an optimisation. The compiler can get away with doing this in the majority of cases where the object is small enough (more on that in a moment).

But now, bar() needs to know whether this is what foo() is going to do, and to do that it needs, for various reasons, to see the full definition of the class.

So, in summary, that's why the compiler needs a fully-defined type in order to call foo(), else it won't know what foo() will be expecting and so it doesn't know what code to generate. Not on most platforms anyway, end of story.

Notes:

  1. I looked at gcc and there seem to be two (entirely logical) rules for determining whether a class object is returned in a register or pair of registers:

    • a) The object is 16 bytes or smaller (in a 64 bit build).
    • b) std::is_trivially_copyable<T>::value evaluates to true (maybe someone can find something in the standard about that).
  2. In case any readers don't know, RVO relies on constructing the object in its final resting place (i.e. in the location allocated by the caller). This is because there are objects (such as some implementations of std::basic_string, I believe) that are sensitive to being moved around in memory so you can't just construct them somewhere convenient to you and then memcpy them somewhere else.

  3. If constructing the returned object in that final location is not possible (because of the way you coded the function returning the object), then RVO doesn't happen (how can it?), see live demo below (make_no_RVO()).

  4. As a specific example of point 1b, if a small object contains data members that (might) point either to itself or to any of its data members, then returning it by value will get you into trouble if you don't declare it properly. Just adding an empty copy constructor will do, since then it is no longer trivially copyable. But then I guess that's true in general, don't hide important information from the compiler.
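The two rules from note 1 can be written down as a compile-time check. This is only the heuristic observed above, not the exact rule of any ABI, and might_return_in_registers, Small, SelfRef and Big are names invented for the example.

```cpp
#include <type_traits>

// Heuristic from note 1 (an observation, not an ABI definition):
// small enough AND trivially copyable.
template <class T>
constexpr bool might_return_in_registers() {
    return sizeof(T) <= 16 && std::is_trivially_copyable<T>::value;
}

struct Small { int x; };                 // small and trivial

struct SelfRef {                         // note 4: points at itself
    SelfRef* me;
    SelfRef() : me(this) {}
    SelfRef(const SelfRef&) : me(this) {}  // user-provided: no longer trivial
};

struct Big { char bytes[64]; };          // trivial but too large

static_assert(might_return_in_registers<Small>(), "Small qualifies");
static_assert(!might_return_in_registers<SelfRef>(), "copy ctor is not trivial");
static_assert(!might_return_in_registers<Big>(), "more than 16 bytes");
```

Both sizeof and std::is_trivially_copyable need a complete type, so the check cannot be made for a class that has only been forward-declared.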

Live demo here. All comments on this post welcome, I will answer them to the best of my ability.

可以哭但决不认输i
#3 · 2019-03-11 03:50

This has nothing to do with copy elision. foo is supposed to return a C by value. As long as you only pass around a reference or pointer to foo, that's OK. Once you try to call foo, as bar does, the sizes of its arguments and return value must be known; the only valid way to know them is to provide the full definition of the required type. Had the signature used a reference or a pointer instead, all the required information would be present and you could do without the full type definition. This approach has a name, pimpl (Pointer to IMPLementation), and it is widely used as a means of hiding details in closed-source library distributions.
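A small sketch of the reference/pointer variant, with all function names (get_ref, get_ptr, forward_ref, forward_ptr) invented for the example. The forwarding functions are defined while C is still incomplete; the definition of C is only supplied afterwards so that the snippet links.

```cpp
class C;                       // incomplete at this point

C& get_ref();                  // declarations only
C* get_ptr();

// Both definitions compile although C is incomplete: only a reference
// or a pointer crosses the call boundary.
inline C& forward_ref() { return get_ref(); }
inline C* forward_ptr() { return get_ptr(); }

// Elsewhere (typically in another translation unit) C is finally defined.
class C {
public:
    int x = 4;
};

C& get_ref() {
    static C instance;
    return instance;
}
C* get_ptr() { return &get_ref(); }
```

Changing get_ref to return C by value would make forward_ref ill-formed again at the point where it is defined.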

ら.Afraid
#4 · 2019-03-11 03:54

Guaranteed copy elision has exceptions, for compatibility reasons and/or efficiency. Trivially copyable types may be copied even where copy elision would otherwise be guaranteed. You're right that if this doesn't apply, then the compiler would be able to generate correct code without knowing any details of C, not even its size. But the compiler does need to know if this applies, and for that, it still needs the type to be complete.

According to https://timsong-cpp.github.io/cppwp/class.temporary :

15.2 Temporary objects [class.temporary]

1 Temporary objects are created

[...]

(1.2) -- when needed by the implementation to pass or return an object of trivially-copyable type (see below), and

[...]

3 When an object of class type X is passed to or returned from a function, if each copy constructor, move constructor, and destructor of X is either trivial or deleted, and X has at least one non-deleted copy or move constructor, implementations are permitted to create a temporary object to hold the function parameter or result object. The temporary object is constructed from the function argument or return value, respectively, and the function's parameter or return object is initialized as if by using the non-deleted trivial constructor to copy the temporary (even if that constructor is inaccessible or would not be selected by overload resolution to perform a copy or move of the object). [ Note: This latitude is granted to allow objects of class type to be passed to or returned from functions in registers. -- end note ]
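To see that latitude in action, one can use a type that records its own address but keeps trivial copy and move constructors. Probe and the function names are hypothetical, and the result of probe_built_in_place() is deliberately not stated: the quoted paragraph permits either outcome, and it depends on the platform's calling convention.

```cpp
#include <type_traits>

// Hypothetical probe: remembers the address it was constructed at.
struct Probe {
    const Probe* self;
    Probe() : self(this) {}   // user-provided default ctor; copy/move stay trivial
};

static_assert(std::is_trivially_copyable<Probe>::value,
              "Probe falls under the [class.temporary]/3 exception");

Probe make_probe() { return Probe(); }

// May be true or false: if the implementation returns Probe in a register,
// the object the constructor saw is a temporary, not p.
bool probe_built_in_place() {
    Probe p = make_probe();
    return p.self == &p;
}
```

Giving Probe a user-provided copy constructor removes it from the exception, and with C++17 the comparison is then expected to hold.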

一夜七次
#5 · 2019-03-11 03:56

The rule lies in [basic.lval]/9:

Unless otherwise indicated ([dcl.type.simple]), a prvalue shall always have complete type or the void type; ...
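The "unless otherwise indicated" part refers, to my understanding, to decltype: a function call used directly as a decltype operand may be a prvalue of incomplete type, since no object is ever created. A small sketch (CallResult and kDecltypeNamesC are names invented here):

```cpp
#include <type_traits>

class C;      // incomplete, never defined in this snippet
C foo();      // declared, never defined or called

// OK: the call is an unevaluated decltype operand, so no C object is needed.
using CallResult = decltype(foo());

constexpr bool kDecltypeNamesC = std::is_same<CallResult, C>::value;
static_assert(kDecltypeNamesC, "decltype(foo()) is C");
```

Any other use of foo() as a prvalue, such as the return statement in bar(), falls under the rule quoted above.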
