Performance of pIter != cont.end() in for loop

2019-02-16 05:22发布

I was getting through "Exceptional C++" by Herb Sutter lately, and I have serious doubts about a particular recommendation he gives in Item 6 - Temporary Objects.

He offers to find unnecessary temporary objects in the following code:

string FindAddr(list<Employee> emps, string name) 
{
  for (list<Employee>::iterator i = emps.begin(); i != emps.end(); i++)
  {
    if( *i == name )
    {
      return i->addr;
    }
  }
  return "";
}

As one of the example, he recommends to precompute the value of emps.end() before the loop, since there is a temporary object created on every iteration:

For most containers (including list), calling end() returns a temporary object that must be constructed and destroyed. Because the value will not change, recomputing (and reconstructing and redestroying) it on every loop iteration is both needlessly inefficient and unaesthetic. The value should be computed only once, stored in a local object, and reused.

And he suggests replacing by the following:

list<Employee>::const_iterator end(emps.end());
for (list<Employee>::const_iterator i = emps.begin(); i != end; ++i)

For me, this is unnecessary complication. Even if one replaces ugly type declarations with compact auto, he still gets two lines of code instead of one. Even more, he has this end variable in the outer scope.

I was sure modern compilers will optimize this piece of code anyway, because I'm actually using const_iterator here and it is easy to check whether the loop content is accessing the container somehow. Compilers got smarter within the last 13 years, right?

Anyway, I will prefer the first version with i != emps.end() in most cases, where I'm not so much worried about performance. But I want to know for sure, whether this is a kind of construction I could rely on a compiler to optimize?

Update

Thanks for your suggestions on how to make this useless code better. Please note, my question is about compiler, not programming techniques. The only relevant answers for now are from NPE and Ellioh.

7条回答
干净又极端
2楼-- · 2019-02-16 05:47

Use std algorithms

He's right of course; calling end can instantiate and destroy a temporary object, which is generally bad.

Of course, the compiler can optimise this away in a lot of cases.

There is a better and more robust solution: encapsulate your loops.

The example you gave is in fact std::find, give or take the return value. Many other loops also have std algorithms, or at least something similar enough that you can adapt - my utility library has a transform_if implementation, for example.

So, hide loops in a function and take a const& to end. Same fix as your example, but much much cleaner.

查看更多
戒情不戒烟
3楼-- · 2019-02-16 05:53

A couple of things... the first is that in general the cost of building an iterator (in Release mode, unchecked allocators) is minimal. They are usually wrappers around a pointer. With checked allocators (default in VS) you might have some cost, but if you really need the performance, after testing rebuild with unchecked allocators.

The code need not be as ugly as what you posted:

for (list<Employee>::const_iterator it=emps.begin(), end=emps.end(); 
                                    it != end; ++it )

The main decision on whether you want to use one or the other approaches should be in terms of what operations are being applied to the container. If the container might be changing it's size then you might want to recompute the end iterator in each iteration. If not, you can just precompute once and reuse as in the code above.

查看更多
叼着烟拽天下
4楼-- · 2019-02-16 05:55

Containers like vector returns variable, which stores pointer to the end, on end() call, that optimized. If you've written container which does some lookups, etc on end() call consider writing

for (list<Employee>::const_iterator i = emps.begin(), end = emps.end(); i != end; ++i)
{
...
}

for speed

查看更多
来,给爷笑一个
5楼-- · 2019-02-16 06:01

Contrary to popular belief, I don't see any difference between VC++ and gcc in this respect. I did a quick check with both g++ 4.7.2 and MS C++ 17 (aka VC++ 2012).

In both cases I compared the code generated with the code as in the question (with headers and such added to let it compile), to the following code:

string FindAddr(list<Employee> emps, string name) 
{
    auto end = emps.end();
    for (list<Employee>::iterator i = emps.begin(); i != end; i++)
    {
        if( *i == name )
        {
            return i->addr;
        }
    }
    return "";
}

In both cases the result was essentially identical for the two pieces of code. VC++ includes line-number comments in the code, which changed because of the extra line, but that was the only difference. With g++ the output files were identical.

Doing the same with std::vector instead of std::list, gave pretty much the same result -- no significant difference. For some reason, g++ did switch the order of operands for one instruction, from cmp esi, DWORD PTR [eax+4] to cmp DWORD PTR [eax+4], esi, but (again) this is utterly irrelevant.

Bottom line: no, you're not likely to gain anything from manually hoisting the code out of the loop with a modern compiler (at least with optimization enabled -- I was using /O2b2 with VC++ and /O3 with g++; comparing optimization with optimization turned off seems pretty pointless to me).

查看更多
爱情/是我丢掉的垃圾
6楼-- · 2019-02-16 06:06

I've compiled the following slightly hacky code using g++ 4.7.2 with -O3 -std=c++11, and got identical assembly for both functions:

#include <list>
#include <string>

using namespace std;

struct Employee: public string { string addr; };

string FindAddr1(list<Employee> emps, string name)
{
  for (list<Employee>::const_iterator i = emps.begin(); i != emps.end(); i++)
  {
    if( *i == name )
    {
      return i->addr;
    }
  }
  return "";
}

string FindAddr2(list<Employee> emps, string name)
{
  list<Employee>::const_iterator end(emps.end());
  for (list<Employee>::const_iterator i = emps.begin(); i != end; i++)
  {
    if( *i == name )
    {
      return i->addr;
    }
  }
  return "";
}

In any event, I think the choice between the two versions should be primarily based on grounds of readability. Without profiling data, micro-optimizations like this to me look premature.

查看更多
甜甜的少女心
7楼-- · 2019-02-16 06:07

If you really need the performance, you let your shiny new C++11 compiler write it for you:

for (const auto &i : emps) {
    /* ... */
}

Yes, this is tongue-in-cheek (sort of). Herb's example here is now out of date. But since your compiler doesn't support it yet, let's get to the real question:

Is this a kind of construction I could rely on a compiler to optimize?

My rule of thumb is that the compiler writers are way smarter than I am. I can't rely on a compiler to optimize any one piece of code, because it might choose to optimize something else that has a bigger impact. The only way to know for sure is to try out both approaches on your compiler on your system and see what happens. Check your profiler results. If the call to .end() sticks out, save it in a separate variable. Otherwise, don't worry about it.

查看更多
登录 后发表回答