Optimal way to return local value in C++11

Posted 2019-05-26 17:59

In the old days, if I wanted a string representation of an object A, I would write something with the signature void to_string(const A& a, string& out) to avoid extra copies. Is this still the best practice in C++11, with move semantics and all?

I have read several comments in other contexts that suggest relying on RVO and instead writing string to_string(const A& a). But RVO is not guaranteed to happen! So how can I, as the programmer of to_string, guarantee that the string is not copied around unnecessarily (independently of the compiler)?

3 answers
We Are One
Reply #2 · 2019-05-26 18:28

Back in the old days (1970-1980) you could pretty much predict the performance of an algorithm by counting floating point divides.

That is no longer true today. However there is a similar rule you can use to estimate performance today:

Count the number of trips to the heap: both new/malloc and delete/free.

Given:

std::string
to_string(const A& a)
{
    std::string s;
    // fill it up
    return s;
}

std::string s = to_string(a);

I count 1 new, assuming you don't reallocate s internal to to_string(). That one allocation is done as you put data into s. I know that std::string has a fast (allocation-free) move constructor, so whether or not RVO happens is not relevant to estimating the performance of to_string(). Either way, creating the s outside of to_string() costs 1 allocation.
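One way to check this kind of count rather than reason about it is to give the string a counting allocator. The following is only a sketch: CountingAllocator, CountedString, g_allocations and build_long are hypothetical names introduced here, and build_long merely stands in for to_string().

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical counter: bumped once per allocate() call.
static std::size_t g_allocations = 0;

// Minimal C++11 allocator that counts trips to the heap.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Stand-in for to_string(): builds a string too long for a small-string buffer.
CountedString build_long() {
    CountedString s;
    s.assign(100, 'x');
    return s;  // elided or moved, never deep-copied
}

// Allocations spent on building the string and returning it by value.
std::size_t allocations_to_build() {
    const std::size_t before = g_allocations;
    CountedString s = build_long();
    assert(s.size() == 100);
    return g_allocations - before;
}

// Allocations spent on move-constructing from an already built string.
std::size_t allocations_to_move() {
    CountedString s = build_long();
    const std::size_t before = g_allocations;
    CountedString moved = std::move(s);  // takes over the existing buffer
    assert(moved.size() == 100);
    return g_allocations - before;
}
```

On the mainstream standard libraries I would expect allocations_to_move() to report zero, which is what makes the RVO question irrelevant to the allocation count. Debug-mode containers on some toolchains may allocate extra bookkeeping, so treat the numbers as something to measure, not a guarantee.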

Now consider:

void
to_string(const A& a, std::string& out)
{
    out = ...
}

std::string s;
to_string(a, s);

As I've written it, it still consumes 1 memory allocation. So this is about the same speed as the return-by-value version.

Now consider a new use-case:

while (i_need_to)
{
    std::string s = to_string(get_A());
    process(s);
    update(i_need_to);
}

According to our previous analysis the above is going to do 1 allocation per iteration. Now consider this:

std::string s;
while (i_need_to)
{
    to_string(get_A(), s);
    process(s);
    update(i_need_to);
}

I know that string has capacity(), and that capacity can be recycled over many uses in the above loop. The worst case scenario is that I still have 1 allocation per iteration. The best case scenario is that the first iteration will create capacity that is large enough to handle all other iterations, and that the entire loop will only do 1 allocation.

The truth will likely lie somewhere in between the worst and best case scenarios.
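To make the capacity argument concrete, here is a small sketch that counts how often the reused string's capacity changes over a run of calls. The struct A, its fields and the helper capacity_changes are assumptions made up for this illustration; the sketch also relies on clear() keeping the existing capacity, which the mainstream implementations do but the standard does not strictly promise.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical object type; the real A is not shown in the question.
struct A {
    int id;
    std::string label;
};

// Out-parameter version: overwrites `out`, reusing whatever capacity it has.
void to_string(const A& a, std::string& out) {
    out.clear();  // size becomes 0; capacity is kept in practice
    out += "A#";
    out += std::to_string(a.id);
    out += ':';
    out += a.label;
}

// Counts the iterations after which the buffer's capacity differs from before,
// i.e. the iterations in which the string had to grow.
std::size_t capacity_changes(const std::vector<A>& items) {
    std::string s;
    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (const A& a : items) {
        to_string(a, s);
        assert(s.capacity() >= s.size());
        if (s.capacity() != last) {
            ++changes;
            last = s.capacity();
        }
    }
    return changes;
}
```

If the first object happens to produce the longest string, only the first iteration should grow the buffer; if the strings keep getting longer, every iteration may grow it, which is the worst case described above.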

The best API will depend upon the use cases you think your function will most likely be in.

Count allocations to estimate performance. And then measure what you've coded up. In the case of std::string, there will likely be a short string buffer that may impact your decision. In the case of libc++, on a 64-bit platform, std::string will store up to 22 chars (plus the terminating null) before it makes a trip to the heap.
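If you want to see where the short string buffer ends on your own implementation, one rough probe is to check whether the character data lives inside the string object itself. This is a sketch only; stored_inline and long_string_is_on_heap are hypothetical helpers, and the threshold you observe depends on the standard library in use.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if s.data() points into the std::string object itself,
// which is how a short string buffer shows up in practice.
bool stored_inline(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    std::less<const char*> before;  // total order even for unrelated pointers
    return !before(s.data(), begin) && before(s.data(), end);
}

// A string longer than the string object itself cannot be stored inline.
bool long_string_is_on_heap() {
    const std::string big(1000, 'x');
    assert(big.size() > sizeof(std::string));
    return !stored_inline(big);
}
```

Calling stored_inline on strings of increasing length shows the point at which your library starts allocating.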

太酷不给撩
Reply #3 · 2019-05-26 18:34

Assuming that the code in your function is of the form:

std::string data = ...;
//do some processing.
return data;

Then this is required to call std::string's move constructor if elision is not available. So worst-case, you get to move from your internal string.

If you can't afford the cost of a move operation, then you'll have to pass it as a reference.
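The "worst case is a move" claim can be checked with a type that counts its own copies and moves. This is a sketch under assumptions: Tracked, Counts, g_counts and make_tracked are names invented here and do not come from the question.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical bookkeeping for the special member functions of Tracked.
struct Counts {
    int copies;
    int moves;
};
static Counts g_counts = {0, 0};

struct Tracked {
    std::string payload;

    explicit Tracked(std::string p) : payload(std::move(p)) {}
    Tracked(const Tracked& o) : payload(o.payload) { ++g_counts.copies; }
    Tracked(Tracked&& o) noexcept : payload(std::move(o.payload)) { ++g_counts.moves; }
};

// Same shape as the function in the answer: build a local, return it by name.
Tracked make_tracked() {
    Tracked data("some processed value");
    // do some processing
    return data;  // elided if the compiler can; otherwise move-constructed
}

// Resets the counters, performs one return-by-value, and reports the counts.
Counts counts_for_return_by_value() {
    g_counts = Counts{0, 0};
    Tracked t = make_tracked();
    assert(!t.payload.empty());
    return g_counts;
}
```

With elision the move count is typically 0; with elision disabled (for example GCC's and Clang's -fno-elide-constructors in C++11 mode) it rises, but the copy count should stay at 0 either way.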

That being said... do you worry about compilers not being able to inline short functions? Are you concerned about whether small wrappers will be properly optimized away? Does the possibility of the compiler not optimizing for loops and the like bother you? Do you think about whether if(x < y) is faster than if(x - y < 0)?

If not... then why do you care about copy/move elision (the technical term for "return value optimization", as it's used in more places than that)? If you are using a compiler that can't support copy elision, then you're using a horrible compiler that probably can't support a ton of other optimizations. For performance's sake, you'd be better off spending your time upgrading your compiler than turning return values into references.

In reply to the comment: "Preventing the improbable case of a copy actually happening is not worth the ... hassle? Less readable code? What exactly? What is the extra thing that weighs on the side of simple return?"

The "extra thing" is that this:

std::string aString = to_string(a);

Is more readable than this:

std::string aString;
to_string(a, aString);

In the first case, it is immediately apparent that to_string is initializing a string. In the second, it isn't; you have to look up to_string's signature to see that it's taking a non-const reference.

The first case is not even "idiomatic"; that's how everyone would normally write it. You would never see a to_int(a, someInt) call for integers; that's ridiculous. Why should integer creation and object creation be so different? You shouldn't have to care as a programmer whether too many copies are happening for a return value or something. You just do things the simple, obvious, and well-understood way.

\"骚年 ilove
Reply #4 · 2019-05-26 18:34

Here is the answer I gathered from feedback and other resources:

The straightforward return by value is the idiom because:

  • in practice copy/move elision will take place most of the time;
  • the move ctor will be used on fallback;
  • preventing the improbable case of a copy actually happening is not worth the less readable code;
  • passing the reference in requires the object to have already been created, which
    • is not always feasible (for instance, there might be no default ctor), and
    • adds one extra initialization, which also has to be taken into account if the issue is performance.
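The last point is easiest to see with a type that has no default constructor. The following is a minimal sketch; Label, make_label and label_via_out_param are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type with no default constructor.
class Label {
public:
    explicit Label(std::string text) : text_(std::move(text)) {}
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// Return by value: the caller needs no pre-existing object.
Label make_label(int id) {
    return Label("item-" + std::to_string(id));
}

// Out-parameter version: the caller must already hold a Label.
void make_label(int id, Label& out) {
    out = Label("item-" + std::to_string(id));
}

// Shows what the out-parameter forces on the caller.
std::string label_via_out_param(int id) {
    Label placeholder("");  // throwaway value, initialized only to exist
    assert(placeholder.text().empty());
    make_label(id, placeholder);
    return placeholder.text();
}
```

With the out-parameter form the caller has to invent a placeholder value and pay for constructing it, which is the "one extra initialization" from the list above.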

However, if the typical usage is anticipated to be something like

std::string s;
while (i_need_to)
{
    to_string(get_A(), s);
    process(s);
    update(i_need_to);
}

and if the type in question has a default constructor*, then it may still make sense to pass the object that should hold the return by reference.

*considering string here only as an example, but the question and answers could be generalized
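If both usage patterns matter, one possible arrangement (an assumption of mine, not something the answers above prescribe) is to offer both signatures and implement the by-value overload on top of the out-parameter one. The struct A and its output format are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical object type for illustration.
struct A {
    int value;
};

// Out-parameter overload: lets a caller recycle one buffer across a loop.
void to_string(const A& a, std::string& out) {
    out.clear();
    out += "A(";
    out += std::to_string(a.value);
    out += ')';
}

// By-value overload: the readable default, built on the overload above.
std::string to_string(const A& a) {
    std::string s;
    to_string(a, s);
    return s;  // elided or moved
}

// Example of the loop-style usage with a single reused buffer.
std::string reuse_buffer_twice() {
    std::string buf;
    to_string(A{1}, buf);
    assert(buf == "A(1)");
    to_string(A{22}, buf);
    return buf;
}
```

Callers who just want a string write std::string s = to_string(a); and only hot loops need to reach for the two-argument form.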
