Consider the following code:
#include <iostream>
#include <type_traits>
#include <utility> // std::forward
struct A
{
    A() {}
    A(const A&) { std::cout << "Copy" << std::endl; }
    A(A&&) { std::cout << "Move" << std::endl; }
};
template <class T>
struct B
{
    T x;
};
#define MAKE_B(x) B<decltype(x)>{ x }
template <class T>
B<T> make_b(T&& x)
{
    return B<T> { std::forward<T>(x) };
}
int main()
{
    std::cout << "Macro make b" << std::endl;
    auto b1 = MAKE_B( A() );
    std::cout << "Non-macro make b" << std::endl;
    auto b2 = make_b( A() );
}
This outputs the following:
Macro make b
Non-macro make b
Move
Note that b1 is constructed without a move, but the construction of b2 requires a move.
I also need type deduction, as A in real-life usage may be a complex type which is difficult to write explicitly. I also need to be able to nest calls (i.e. make_c(make_b(A()))).
Is such a function possible?
Further thoughts:
N3290 Final C++0x draft page 284:
This elision of copy/move operations, called copy elision, is permitted in the following circumstances:
when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move
Unfortunately this seems to mean that we can't elide copies (and moves) of function parameters to function results (including constructors), as those temporaries are either bound to a reference (when passed by reference) or no longer temporaries (when passed by value). It seems the only way to elide all copies when creating a composite object is to create it as an aggregate. However, aggregates have certain restrictions, such as requiring that all members be public and that there be no user-defined constructors.
I don't think it makes sense for C++ to allow optimizations for aggregate construction of POD C structs but not allow the same optimizations for non-POD C++ class construction.
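The restriction can be seen in a small sketch. The names `Counted`, `Agg`, `NonAgg` and the two helper functions are hypothetical, and `Counted` counts constructor calls instead of printing. Before C++17 the elisions it relies on are permitted rather than required, so the counts are what mainstream compilers typically produce, not something the draft quoted above guarantees.

```cpp
#include <utility>

// Hypothetical stand-in for A: counts constructor calls instead of printing.
struct Counted
{
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Aggregate: public member, no user-declared constructors.
struct Agg
{
    Counted x;
};

// Non-aggregate: private member and a user-provided constructor.
class NonAgg
{
public:
    explicit NonAgg(Counted c) : x(std::move(c)) {}
private:
    Counted x;
};

// The temporary initializes the member directly, so no move is expected.
int moves_into_aggregate()
{
    int before = Counted::moves;
    Agg a{ Counted() };
    (void)a;
    return Counted::moves - before;
}

// Temporary -> parameter can be elided; parameter -> member cannot.
int moves_into_non_aggregate()
{
    int before = Counted::moves;
    NonAgg n{ Counted() };
    (void)n;
    return Counted::moves - before;
}
```

Calling both helpers from `main` and printing the results should show the aggregate path with no moves and the constructor path with one move from the by-value parameter into the member.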
Is there any way to allow copy/move elision for non-aggregate construction?
My answer:
This construct allows for copies to be elided for non-POD types. I got this idea from David Rodríguez's answer below. It requires C++11 lambdas. In the example below I've changed make_b to take two arguments to make things less trivial. There are no calls to any move or copy constructors.
#include <iostream>
#include <type_traits>
struct A
{
    A() {}
    A(const A&) { std::cout << "Copy" << std::endl; }
    A(A&&) { std::cout << "Move" << std::endl; }
};
template <class T>
class B
{
public:
    // Each lambda is called directly in the member initializer,
    // so its result can be constructed in place.
    template <class LAMBDA1, class LAMBDA2>
    B(const LAMBDA1& f1, const LAMBDA2& f2) : x1(f1()), x2(f2())
    {
        std::cout
            << "I'm a non-trivial, therefore not a POD.\n"
            << "I also have private data members, so definitely not a POD!\n";
    }
private:
    T x1;
    T x2;
};
// Wraps an expression in a lambda so that it is evaluated later.
#define DELAY(x) [&]{ return x; }
#define MAKE_B(x1, x2) make_b(DELAY(x1), DELAY(x2))
template <class LAMBDA1, class LAMBDA2>
auto make_b(const LAMBDA1& f1, const LAMBDA2& f2) -> B<decltype(f1())>
{
    return B<decltype(f1())>( f1, f2 );
}
int main()
{
    auto b1 = MAKE_B( A(), A() );
}
If anyone knows how to achieve this more neatly I'd be quite interested to see it.
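For comparison, the same idea can be sketched without the macros by writing the lambdas at the call site, which also shows that nested calls compose. All names here (`Counted`, `One`, `Two`, `make_one`, `make_two`, `constructor_calls_when_nested`) are hypothetical, and `Counted` counts constructor calls instead of printing. As with the version above, the zero count depends on the compiler performing elisions that are only permitted, not required, before C++17.

```cpp
// Hypothetical stand-in for A: counts constructor calls instead of printing.
struct Counted
{
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Holds two values, each built by calling a factory in the member initializer.
template <class T>
class Two
{
public:
    template <class F1, class F2>
    Two(const F1& f1, const F2& f2) : x1(f1()), x2(f2()) {}
private:
    T x1;
    T x2;
};

// Holds one value; used here as the outer layer of a nested call.
template <class T>
class One
{
public:
    template <class F>
    explicit One(const F& f) : x(f()) {}
private:
    T x;
};

template <class F1, class F2>
auto make_two(const F1& f1, const F2& f2) -> Two<decltype(f1())>
{
    return Two<decltype(f1())>(f1, f2);
}

template <class F>
auto make_one(const F& f) -> One<decltype(f())>
{
    return One<decltype(f())>(f);
}

// Builds One<Two<Counted>> and reports how many copy/move constructors ran.
int constructor_calls_when_nested()
{
    int before = Counted::copies + Counted::moves;
    auto nested = make_one([] {
        return make_two([] { return Counted(); }, [] { return Counted(); });
    });
    (void)nested;
    return Counted::copies + Counted::moves - before;
}
```

The cost is that every argument has to be wrapped in a lambda by hand, which is exactly the noise the DELAY macro hides.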
Previous discussion:
This somewhat follows on from the answers to the following questions:
Can creation of composite objects from temporaries be optimised away?
Avoiding need for #define with expression templates
Eliminating unnecessary copies when building composite objects
As Anthony has already mentioned, the standard forbids copy elision from the argument of a function to the return of the same function. The rationale that drives that decision is that copy elision (and move elision) is an optimization by which two objects in the program are merged into the same memory location, that is, the copy is elided by having both objects be one. The (partial) standard quote is below, followed by a set of circumstances under which copy elision is allowed, which do not include that particular case.
So what makes that particular case different? The difference is basically that there is a function call between the original and the copied objects, and the function call implies that there are extra constraints to consider, in particular the calling convention.
Given a function T foo( T ), and a user calling T x = foo( T(param) );, in the general case, with separate compilation, the compiler will create an object $tmp1 in the location that the calling convention requires the first argument to be. It will then call the function and initialize x from the return statement. Here is the first opportunity for copy elision: by carefully placing x on the location where the returned temporary is, x and the returned object from foo become a single object, and that copy is elided. So far so good. The problem is that the calling convention in general will not have the returned object and the parameter in the same location, and because of that, $tmp1 and x cannot be a single location in memory.
Without seeing the function definition the compiler cannot possibly know that the only purpose of the argument to the function is to serve as return statement, and as such it cannot elide that extra copy. It can be argued that if the function is inline then the compiler would have the missing extra information to understand that the temporary used to call the function, the returned value and x are a single object. The problem is that that particular copy can only be elided if the code is actually inlined (not only if it is marked as inline, but actually inlined). If a function call is required, then the copy cannot be elided. If the standard allowed that copy to be elided when the code is inlined, it would imply that the behavior of a program would differ due to the compiler and not user code: the inline keyword does not force inlining, it only means that multiple definitions of the same function do not represent a violation of the ODR.
Note that if the variable was created inside the function (as compared to passed into it) as in:
T foo() { T tmp; ...; return tmp; }
T x = foo();
then both copies can be elided: there is no restriction as to where tmp has to be created (it is not an input or output parameter to the function), so the compiler is able to relocate it anywhere, including the location of the returned type, and on the calling side, x can, as in the previous example, be carefully located in the location of that same return statement, which basically means that tmp, the return statement and x can be a single object.
As for your particular problem, if you resort to a macro, the code is inlined, there are no restrictions on the objects and the copy can be elided. But if you add a function, you cannot elide the copy from the argument to the return statement. So just avoid it. Instead of using a template that will move the object, create a template that will construct an object:
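A minimal sketch of such a constructing template, assuming the hypothetical name `create`. The pass-through function `pass_through`, the counting type `Counted` and the two helper functions are also hypothetical and are included only to contrast the two shapes. The counts in the comments are what a compiler that performs the permitted elisions would typically produce.

```cpp
#include <utility>

// Hypothetical stand-in for the user's type: counts constructor calls.
struct Counted
{
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Pass-through: the object arrives as a parameter, so it has to be
// moved (or copied) into the returned object.
template <class T>
T pass_through(T t)
{
    return t;
}

// Constructing template: the object is created in the return statement
// from its constructor arguments, so nothing needs to be moved.
template <class T, class... Args>
T create(Args&&... args)
{
    return T(std::forward<Args>(args)...);
}

// Typically one move: parameter -> return object.
int moves_for_pass_through()
{
    int before = Counted::moves;
    Counted x = pass_through(Counted());
    (void)x;
    return Counted::moves - before;
}

// Typically no moves: the returned temporary and x are one object.
int moves_for_create()
{
    int before = Counted::moves;
    Counted x = create<Counted>();
    (void)x;
    return Counted::moves - before;
}
```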
And that copy can be elided by the compiler.
Note that I have not dealt with move construction, as you seem concerned about the cost of even move construction, even though I believe that you are barking up the wrong tree. Given a motivating real use case, I am quite sure that people here will come up with a couple of efficient ideas.
12.8/31
You cannot optimize out the copy/move of the A object from the parameter of make_b to the member of the created B object.
However, this is the whole point of move semantics: by providing a light-weight move operation for A you can avoid a potentially expensive copy. E.g. if A was actually std::vector<int>, then the copying of the vector's contents can be avoided by use of the move constructor, and instead just the housekeeping pointers will be transferred.
This isn't a big problem. All it needs is changing the structure of the code slightly.
Instead of:
You can always do:
If the constructor of B is like this, then it'll not take any copies:
The composite case will work too:
And it doesn't even need C++0x features to do this.
No, it doesn't. The compiler is allowed to elide the move; whether that happens is implementation-specific, depending on several factors. It is also allowed to move, but it cannot copy (moving must be used instead of copying in this situation).
It is true that you are not guaranteed that the move will be elided. If you must be guaranteed that no move will occur, then either use the macro or investigate your implementation's options to control this behavior, particularly function inlining.