C++ STL: Which method of iteration over a STL cont

2019-01-17 22:51发布

问题:

This may seem frivolous to some of you, but which of the following 2 methods of iteration over a STL container is better? Why?

class Elem;
typedef vector<Elem> ElemVec;
ElemVec elemVec;

// Method 0
for (ElemVec::iterator i = elemVec.begin(); i != elemVec.end(); ++i)
{
    Elem& e = *i;
    // Do something
}

// Method 1
for (int i = 0; i < elemVec.size(); ++i)
{
    Elem& e = elemVec.at(i);
    // Do something
}

Method 0 seems like cleaner STL, but Method 1 achieves the same with lesser code. Simple iteration over a container is what appears all over the place in any source code. So, I'm inclined to pick Method 1 which seems to reduce visual clutter and code size.

PS: I know iterators can do much more than a simple index. But, please keep the reply/discussion focused on simple iteration over a container like shown above.

回答1:

The first version works with any container and so is more useful in template functions that take any container a s a parameter. It is also conceivably slightly more efficient, even for vectors.

The second version only works for vectors and other integer-indexed containers. It'd somewhat more idiomatic for those containers, will be easily understood by newcomers to C++, and is useful if you need to do something else with the index, which is not uncommon.

As usual, there is no simple "this one is better" answer, I'm afraid.



回答2:

If you don't mind a (very?) small loss of efficiency, i'd recommend using Boost.Foreach

BOOST_FOREACH( Elem& e, elemVec )
{
    // Your code
}


回答3:

Method 0 is faster and therefore recommended.

Method 1 uses size() which is allowed to be O(1), depending on the container and the stl implementation.



回答4:

Some more advantages of method 0:

  • if you move from vector to another container the loop remains the same,
  • easy to move from iterator to const_iterator if you need,
  • when c++0x will arive, auto typing will reduce some of the code clutter.

The main disadvantage is that in many cases you scan two containers, in which case an index is cleaner than keeping two iterators.



回答5:

The following method of iteration over a standard library container is best.

Use c++11 (and beyond)'s range-based for-loop with the auto specifier:

// Method 2
for (auto& e: elemVec)
{
    // Do something with e...
}

This is similar to your Method 0 but requires less typing, less maintenence and works with any container compatible with std::begin() and std::end(), including plain-old arrays.



回答6:

Method 0, for several reasons.

  • It better expresses your intent, which aids compiler optimizations as well as readability
  • Off-by-one errors are less likely
  • It works even if you replace the vector with a list, which doesn't have operator[]

Of course, the best solution will often be solution 2: One of the std algorithms. std::for_each, std::transform, std::copy or whatever else you need. This has some further advantages:

  • It expresses your intent even better, and allows some significant compiler optimizations (MS's secure SCL performs bounds checking on your methods 0 and 1, but will skip it on std algorithms)
  • It's less code (at the call site, at least. Of course you have to write a functor or something to replace the loop body, but at the use site, the code gets cleaned up quite a bit, which is probably where it matters most.

In general, avoid overspecifying your code. Specify exactly what you want done, and nothing else. The std algorithms are usually the way to go there, but even without them, if you don't need the index i, why have it? Use iterators instead, in that case.



回答7:

Coincidentally I made a speed test recently (filling 10 * 1024 * 1024 ints with rand() ).
These are 3 runs, time in nano-seconds

vect[i] time      : 373611869  
vec.at(i) time    : 473297793  
*it = time        : 446818590  
arr[i] time       : 390357294  
*ptr time         : 356895778  

UPDATE : added stl-algorithm std::generate, which seems to run the fastest, because of special iterator-optimizing (VC++2008). time in micro-seconds.

vect[i] time      : 393951
vec.at(i) time    : 551387
*it = time        : 596080
generate = time   : 346591
arr[i] time       : 375432
*ptr time         : 334612

Conclusion : Use standard-algorithms, they might be faster than a explicit loop ! (and also good practice)

Update : the above times were in a I/O-bound situation, I did the same tests with a CPU-bound (iterate over a relatively short vector, which should fit in cache repeatedly, multiply each element by 2 and write back to vector)

//Visual Studio 2008 Express Edition
vect[i] time      : 1356811
vec.at(i) time    : 7760148
*it = time        : 4913112
for_each = time   : 455713
arr[i] time       : 446280
*ptr time         : 429595

//GCC
vect[i] time      : 431039
vec.at(i) time    : 2421283
*it = time        : 381400
for_each = time   : 380972
arr[i] time       : 363563
*ptr time         : 365971  

Interestingly iterators and operator[] is considerably slower in VC++ compared to for_each (which seems to degrade the iterators to pointers through some template-magic for performance).
In GCC access times are only worse for at(), which is normal, because it's the only range-checked function of the tests.



回答8:

It depends on which type of container. For a vector, it probably doesn't matter which you use. Method 0 has become more idiomatic, but their isn't a big difference, as everyone says.

If you decided to use a list, instead, method 1 would, in principle, be O(N), but in fact there is no list at() method, so you can't even do it that way. (So at some level your question stacks the deck.)

But that in itself is another argument for method 0: it uses the same syntax for different containers.



回答9:

A possibility not considered above: depending on the details of "Do something", one can have method 0 and method 1 simultaneously, you don't have to choose:

for (auto i = elemVec.begin(), ii = 0; ii < elemVec.size(); ++i, ++ii)
{
    // Do something with either the iterator i or the index ii
}

This way, finding the index or accessing the corresponding member are both obtained with trivial complexity.