I'm wondering what the best form is for my constructors. Here is some sample code:
class Y { ... }
class X
{
public:
X(const Y& y) : m_y(y) {} // (a)
X(Y y) : m_y(y) {} // (b)
X(Y&& y) : m_y(std::forward<Y>(y)) {} // (c)
Y m_y;
};
Y f() { return ... }
int main()
{
Y y = f();
X x1(y); // (1)
X x2(f()); // (2)
}
From what I understand, this is the best the compiler can do in each situation.
(1a) y is copied into x1.m_y (1 copy)
(1b) y is copied into the argument of the constructor of X, and then copied into x1.m_y (2 copies)
(1c) y is moved into x1.m_y (1 move)
(2a) result of f() is copied into x2.m_y (1 copy)
(2b) f() is constructed into the argument of the constructor, and then copied to x2.m_y (1 copy)
(2c) f() is created on the stack, and then moved into x2.m_y (1 move)
Now a few questions:
On both counts, pass by const reference is no worse than, and sometimes better than, pass by value. This seems to go against the discussion in "Want Speed? Pass by Value.". For C++ (not C++0x), should I stick with pass by const reference for these constructors, or should I go with pass by value? And for C++0x, should I prefer pass by rvalue reference over pass by value?
For (2), I'd prefer if the temporary was constructed directly into x.m_y. Even the rvalue version I think requires a move which, unless the object allocates dynamic memory, is as much work as a copy. Is there any way to code this so the compiler is permitted to avoid these copies and moves?
I've made a lot of assumptions in both what I think the compiler can do best and in my questions themselves. Please correct any of these if they are incorrect.
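One way to check the copy counts listed above is to instrument the type. The following is only a sketch under stated assumptions: `Counted`, `ByConstRef`, `ByValue`, `make` and the `copies_*` helpers are hypothetical names standing in for `Y`, the two variants of `X`, and `f`. The type is copy-only (no move constructor), and the two rvalue cases depend on copy elision, which compilers typically perform and which C++17 requires for a returned prvalue.

```cpp
#include <cassert>

// Hypothetical stand-in for Y: it counts copy constructions.
// Declaring the copy constructor suppresses the implicit move constructor,
// so every transfer below is a copy, as in pre-C++0x code.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Variant (a): take by const reference, copy into the member.
struct ByConstRef {
    explicit ByConstRef(const Counted& y) : m_y(y) {}
    Counted m_y;
};

// Variant (b): take by value, then copy the parameter into the member.
struct ByValue {
    explicit ByValue(Counted y) : m_y(y) {}
    Counted m_y;
};

// Plays the role of f(): returns a fresh object by value.
Counted make() { return Counted(); }

// Each helper resets the counter, performs one construction,
// and returns how many copies that construction needed.
int copies_1a() { Counted y; Counted::copies = 0; ByConstRef x(y); return Counted::copies; }
int copies_1b() { Counted y; Counted::copies = 0; ByValue x(y); return Counted::copies; }
int copies_2a() { Counted::copies = 0; ByConstRef x(make()); return Counted::copies; }
int copies_2b() { Counted::copies = 0; ByValue x(make()); return Counted::copies; }

// Sanity check for the two cases that do not depend on copy elision.
void check_lvalue_cases() {
    assert(copies_1a() == 1);  // (1a): one copy into the member
    assert(copies_1b() == 2);  // (1b): one copy into the parameter, one into the member
}
```

With a compiler that elides the temporary, `copies_2a()` and `copies_2b()` would both be expected to report a single copy, which matches (2a) and (2b) above.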
I've thrown together some examples. I used GCC 4.4.4 in all of this.
Simple case, without -std=c++0x
First, I put together a very simple example with two classes that each accept an std::string.
The output of that program on stdout is as follows:
Conclusion
GCC was able to optimize away each and every temporary A or B. This is consistent with the C++ FAQ. Basically, GCC may (and is willing to) generate code that constructs a, a2, b, b2 in place, even if a function is called that apparently returns by value. Thereby GCC can avoid many of the temporaries whose existence one might have "inferred" by looking at the code.
The next thing we want to see is how often std::string is actually copied in the above example. Let's replace std::string with something we can observe better and see.
Realistic case, without -std=c++0x
And the output, unfortunately, meets the expectation:
Conclusion
GCC was not able to optimize away the temporary S created by B's constructor. Using the default copy constructor of S did not change that. Changing f, g to be
did have the indicated effect. It appears that GCC is willing to construct the argument to B's constructor in place but hesitant to construct B's member in place. Do note that still no temporary A or B is created. That means a, a2, b, b2 are still being constructed in place. Cool.
Let's now investigate how the new move semantics may influence the second example.
Realistic case, with -std=c++0x
Consider adding the following constructor to S
And changing B's constructor to
We get this output
So, we were able to replace four copies with two moves by using pass by rvalue.
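As a rough illustration of the two changes just described, here is a minimal sketch. The names S and B come from the text, but their bodies, the `copies`/`moves` counters, and the two helper functions are assumptions made for this example. The sketch follows the finalized C++11 rules, under which a named object must be passed through `std::move` before it can bind to an rvalue reference parameter.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Assumed shape of S: a string wrapper that counts copies and moves.
struct S {
    static int copies;
    static int moves;
    std::string text;
    explicit S(const char* t) : text(t) {}
    S(const S& other) : text(other.text) { ++copies; }
    S(S&& other) : text(std::move(other.text)) { ++moves; }  // the added move constructor
};
int S::copies = 0;
int S::moves = 0;

// Assumed shape of B: its constructor takes an rvalue reference
// and moves the argument into the member.
struct B {
    explicit B(S&& s) : m_s(std::move(s)) {}
    S m_s;
};

// Builds a B from a temporary S and reports how many moves that took.
int moves_from_temporary() {
    S::copies = 0;
    S::moves = 0;
    B b(S("temporary"));
    assert(S::copies == 0);  // nothing was copied
    return S::moves;
}

// Builds a B from a named S; the caller has to opt in with std::move.
int moves_from_named() {
    S named("named");
    S::copies = 0;
    S::moves = 0;
    B b(std::move(named));   // named is left in a moved-from state
    assert(S::copies == 0);
    return S::moves;
}
```

In both helpers the member is initialized by one move instead of a copy; the second one makes the hand-over of a named object explicit at the call site.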
But we actually constructed a broken program.
Recall g, g2
The marked location shows the problem. A move was done on an object that is not a temporary. That's because, with the GCC 4.4.4 used here, rvalue references behave like lvalue references except that they may also bind to temporaries. (The final C++11 standard no longer lets an rvalue reference bind to an lvalue, so there the same call fails to compile instead of silently moving.) Either way, we must not forget to overload B's constructor with one that takes a constant lvalue reference.
You will then notice that both g, g2 cause "constructor2" to be called, since the symbol s in either case is a better fit for a const reference than for an rvalue reference. We can persuade the compiler to do a move in g in either of two ways:
Conclusions
Do return-by-value
The code will be more readable than "fill a reference I give you" code, faster, and maybe even more exception safe.
Consider changing f to
That will meet the strong guarantee only if A's assignment provides it. The copy into result cannot be skipped, nor can tmp be constructed in place of result, since result is not being constructed. Thus, it is slower than before, where no copying was necessary. C++0x compilers and move assignment operators would reduce the overhead, but it's still slower than return-by-value.
Return-by-value provides the strong guarantee more easily. The object is constructed in place. If one part of that fails and other parts have already been constructed, normal unwinding will clean up and, as long as S's constructor fulfills the basic guarantee with regard to its own members and the strong guarantee with regard to global items, the whole return-by-value process actually provides the strong guarantee.
Always pass by value if you're going to copy (onto the stack) anyway
As discussed in "Want Speed? Pass by Value.", the compiler may generate code that constructs, if possible, the caller's argument in place, eliminating the copy, which it cannot do when you take by reference and then copy manually. Principal example: do NOT write this (taken from the cited article)
but always prefer this
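The two listings referred to here are not reproduced. As a sketch of the contrast in the spirit of the cited article (the function names `sorted_by_ref`, `sorted_by_value` and `check_sorted` are hypothetical and chosen for this illustration), the first function below is the form to avoid and the second is the preferred one:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Form to avoid: take by const reference, then copy by hand.
// The copy is always performed, even when the caller passed a temporary.
std::vector<std::string> sorted_by_ref(const std::vector<std::string>& names) {
    std::vector<std::string> copy(names);
    std::sort(copy.begin(), copy.end());
    return copy;
}

// Preferred form: take by value, so the parameter itself is the working copy.
// When the caller passes a temporary, the compiler may construct it in place.
std::vector<std::string> sorted_by_value(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// Both forms produce the same result and leave the caller's vector untouched.
void check_sorted() {
    std::vector<std::string> v;
    v.push_back("b");
    v.push_back("a");
    assert(sorted_by_ref(v) == sorted_by_value(v));
    assert(v[0] == "b");
}
```

The observable behaviour is identical; the difference is only in how many copies the compiler is forced to make for rvalue arguments.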
If you want to copy to a non-stack frame location, pass by const reference pre C++0x and additionally pass by rvalue reference post C++0x
We already saw this. When in-place construction is impossible, pass by reference causes fewer copies than pass by value. And C++0x's move semantics may replace many copies with fewer and cheaper moves. But do keep in mind that moving will make a zombie out of the object that has been moved from. Moving is not copying. Just providing a constructor that accepts rvalue references may break things, as shown above.
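The overload pair and the "zombie" effect can be sketched as follows. `Holder` and the two helper functions are hypothetical names introduced for this example; `std::vector<int>` is used because its move constructor is specified to leave the source empty, which makes the moved-from state easy to observe.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class that stores its argument in a member.
class Holder {
public:
    // Lvalues land here: the data is copied and the caller keeps its own.
    explicit Holder(const std::vector<int>& v) : m_data(v) {}
    // Rvalues land here: the data is moved and the source is left empty.
    explicit Holder(std::vector<int>&& v) : m_data(std::move(v)) {}
    std::size_t size() const { return m_data.size(); }
private:
    std::vector<int> m_data;
};

// Size of the caller's vector after passing it as an lvalue (copied).
std::size_t source_size_after_copy() {
    std::vector<int> src(3, 7);
    Holder h(src);
    assert(h.size() == 3);
    return src.size();
}

// Size of the caller's vector after explicitly moving it in.
std::size_t source_size_after_move() {
    std::vector<int> src(3, 7);
    Holder h(std::move(src));
    assert(h.size() == 3);
    return src.size();
}
```

After the copy the source still holds its three elements; after the move it is empty, which is why a moved-from object should only be reassigned or destroyed.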
If you want to copy to a non-stack frame location and have swap, consider passing by value anyway (pre C++0x)
If you have cheap default construction, that combined with a swap may be more efficient than copying stuff around. Consider S's constructor to be