I can see why the auto type in C++11 improves correctness and maintainability. I've read that it can also improve performance (Almost Always Auto by Herb Sutter), but I haven't found a good explanation.
- How can auto improve performance?
- Can anyone give an example?

auto can aid performance by avoiding silent implicit conversions. An example I find compelling is a range-for loop over a std::map<Key, Val> whose loop variable is declared as std::pair<Key, Val> const& item.

See the bug? Here we are, thinking we're elegantly taking every item in the map by const reference and using the new range-for expression to make our intent clear, but actually we're copying every element. This is because std::map<Key, Val>::value_type is std::pair<const Key, Val>, not std::pair<Key, Val>. Thus, when the loop (implicitly) initializes std::pair<Key, Val> const& item from the dereferenced iterator, instead of taking a reference to an existing object and leaving it at that, we have to do a type conversion. You are allowed to take a const reference to an object (or temporary) of a different type as long as there is an implicit conversion available, e.g. int const& i = 2.0 is perfectly valid and binds i to a temporary int.
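
A minimal sketch of that binding rule, using double and int instead of the pair types. The function names are hypothetical and chosen only for this illustration:

```cpp
// Same type on both sides: the reference is an alias for d itself.
double seen_through_same_type_ref() {
    double d = 2.5;
    const double& r = d;
    d = 7.0;
    return r;   // 7.0, because r refers to d
}

// Different type: the double is converted to a temporary int, and the
// reference binds to that temporary instead of to d.
int seen_through_converted_ref() {
    double d = 2.5;
    const int& r = d;   // temporary int holding 2, lifetime extended to match r
    d = 7.0;
    return r;   // still 2, because r never referred to d
}
```

The second function compiles without error, which is what makes this kind of conversion easy to miss.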

The type conversion is an allowed implicit conversion for the same reason you can convert a const Key to a Key, but we have to construct a temporary of the new type in order to allow for that. Thus, effectively our loop first constructs a temporary std::pair<Key, Val> (call it __tmp) from each element and then binds item to it. (Of course, there isn't actually a __tmp object; it's just there for illustration. In reality the unnamed temporary is just bound to item for its lifetime.)

Just changing the loop variable to auto const& item saves us a ton of copies: now the referenced type matches the initializer type, so no temporary or conversion is necessary, and we can just bind the reference directly.
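
The difference can be made visible by counting copies. This is a sketch under assumptions: the Tracked type, the make_sample_map helper, and both function names are hypothetical and not part of the original answer.

```cpp
#include <map>
#include <utility>

// Hypothetical mapped type that counts its copy constructions.
struct Tracked {
    int value;
    static int copies;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// Hypothetical helper that builds a three-element map.
std::map<int, Tracked> make_sample_map() {
    std::map<int, Tracked> m;
    for (int i = 0; i < 3; ++i) m.emplace(i, Tracked(i));
    return m;
}

// Hand-written loop variable type: the key's const is missing, so every
// element is converted into a temporary std::pair<int, Tracked>.
int copies_with_handwritten_type() {
    const std::map<int, Tracked> m = make_sample_map();
    Tracked::copies = 0;   // ignore copies made while building the map
    int sum = 0;
    for (const std::pair<int, Tracked>& item : m) sum += item.second.value;
    (void)sum;
    return Tracked::copies;   // one copy per element
}

// auto deduces std::pair<const int, Tracked>, so the reference binds
// directly to the element stored in the map.
int copies_with_auto() {
    const std::map<int, Tracked> m = make_sample_map();
    Tracked::copies = 0;
    int sum = 0;
    for (const auto& item : m) sum += item.second.value;
    (void)sum;
    return Tracked::copies;   // no copies
}
```

For the three-element map, the hand-written version reports 3 copies and the auto version reports 0.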

Because auto deduces the type of the initializing expression, there is no type conversion involved. Combined with templated algorithms, this means that you can get a more direct computation than if you were to make up a type yourself, especially when you are dealing with expressions whose type you cannot name!

A typical example comes from (ab)using std::function to hold a comparator that is passed to an algorithm. With comparators declared using auto (call them cmp2 and cmp3), the entire algorithm can inline the comparison call, whereas if you construct a std::function object, not only can the call generally not be inlined, but you also have to go through the polymorphic lookup in the type-erased interior of the function wrapper.

Another variant on this theme is that you can write auto && f = MakeAThing().
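
A minimal sketch of the auto && form. The name MakeAThing comes from the text; its body, the non-movable type Pinned, and the helper GetExisting are assumptions made for illustration:

```cpp
#include <type_traits>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int id;
    Pinned(int i) : id(i) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

// Returns a prvalue; list-initialization builds the result in place,
// so no copy or move constructor is required.
Pinned MakeAThing() { return {42}; }

// Returns an lvalue referring to an object that already exists.
Pinned& GetExisting() {
    static Pinned existing(7);
    return existing;
}

int id_from_prvalue() {
    auto&& f = MakeAThing();   // binds to the temporary, extending its lifetime
    static_assert(std::is_same<decltype(f), Pinned&&>::value,
                  "a prvalue initializer deduces an rvalue reference");
    return f.id;
}

int id_from_lvalue() {
    auto&& g = GetExisting();  // binds to the existing object
    static_assert(std::is_same<decltype(g), Pinned&>::value,
                  "an lvalue initializer deduces an lvalue reference");
    return g.id;
}
```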
This is always a reference, bound to the value of the function call expression, and never constructs any additional objects. If you didn't know the returned value's type, you might be forced to construct a new object (perhaps as a temporary) via something like T && f = MakeAThing(). (Moreover, auto && even works when the return type is not movable and the return value is a prvalue.)

The existing three answers give examples where using auto helps "make it less likely to unintentionally pessimize", effectively making it "improve performance".

There is a flip side to the coin. Using auto with objects that have operators that don't return the basic object can result in incorrect (yet still compilable and runnable) code. For example, this question asks how using auto gave different (incorrect) results with the Eigen library, i.e. an auto declaration and an explicitly typed declaration of the same expression resulted in different output. Admittedly, this is mostly due to Eigen's lazy evaluation, but that code is/should be transparent to the (library) user.
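
Eigen is not needed to see this kind of effect. A minimal sketch using std::vector<bool>, whose operator[] returns a proxy object rather than a bool; this is an analogy chosen for illustration, not the code from the linked question, and the function names are hypothetical:

```cpp
#include <vector>

// Explicit type: the proxy is converted to bool at the declaration,
// so the variable holds a snapshot of the element.
bool read_with_explicit_type() {
    std::vector<bool> flags(1, false);
    bool first = flags[0];
    flags[0] = true;
    return first;   // false, the snapshot taken before the write
}

// auto deduces std::vector<bool>::reference, a proxy that still refers
// to the element, so the later write is visible through it.
bool read_with_auto() {
    std::vector<bool> flags(1, false);
    auto first = flags[0];
    flags[0] = true;
    return first;   // true, read through the proxy after the write
}
```

The two functions differ only in the declared type of first, yet they return different values.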

While performance hasn't been greatly affected here, using auto to avoid unintentional pessimization might be classified as premature optimization, or at least wrong ;).

There are two categories.

First, auto can avoid type erasure. There are unnamable types (like lambdas), and almost unnamable types (like the result of std::bind or other expression-template-like things). Without auto, you end up having to type erase the data down to something like std::function. Type erasure has costs. A lambda stored in a std::function (call it task1) has type erasure overhead: a possible heap allocation, difficulty inlining it, and virtual function table invocation overhead. The same lambda stored in an auto variable (task2) has none. Lambdas need auto or other forms of type deduction to be stored without type erasure; other types can be so complex that they only need it in practice.

Second, you can get types wrong. In some cases, the wrong type will work seemingly perfectly, but will cause a copy.
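
A minimal sketch of such a silently converting declaration. Foo, Bar, f and expression() are the names used in the text; their definitions here, the conversion counter, and the two function names are assumptions made for illustration:

```cpp
// Hypothetical types: Foo is implicitly constructible from Bar.
struct Bar { int value; };

struct Foo {
    int value;
    static int conversions;
    Foo(const Bar& b) : value(b.value) { ++conversions; }
};
int Foo::conversions = 0;

// Hypothetical function returning a reference to an existing Bar.
const Bar& expression() {
    static const Bar bar = {5};
    return bar;
}

int conversions_with_wrong_type() {
    Foo::conversions = 0;
    const Foo& f = expression();    // builds a temporary Foo, then binds f to it
    (void)f;
    return Foo::conversions;        // 1
}

int conversions_with_auto() {
    Foo::conversions = 0;
    const auto& f = expression();   // f is const Bar&, bound directly
    (void)f;
    return Foo::conversions;        // 0
}
```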

A declaration such as Foo const& f = expression() will compile if expression() returns Bar const& or Bar or even Bar&, where Foo can be constructed from Bar. A temporary Foo will be created, then bound to f, and its lifetime will be extended until f goes away.

The programmer may have meant Bar const& f and not intended to make a copy there, but a copy is made regardless.

The most common example is the type of *std::map<A,B>::const_iterator, which is std::pair<A const, B> const& not std::pair<A,B> const&, but the error is a category of errors that silently cost performance. You can construct a std::pair<A, B> from a std::pair<const A, B>. (The key on a map is const, because editing it is a bad idea.)

Both @Barry and @KerrekSB first illustrated these two principles in their answers. This is simply an attempt to highlight the two issues in one answer, with wording that aims at the problem rather than being example-centric.
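
The first category, type erasure, can be sketched as follows. The names task1 and task2 come from the text above; the lambda body, the helper run_twice, and the two function names are assumptions made for illustration:

```cpp
#include <functional>
#include <type_traits>

// Hypothetical templated caller: when F is a concrete closure type, the
// compiler sees the exact call target and is free to inline it.
template <typename F>
int run_twice(F&& task) {
    return task() + task();
}

int total_with_type_erasure() {
    // The lambda is wrapped, so each call goes through the wrapper's
    // indirect dispatch.
    std::function<int()> task1 = [] { return 21; };
    return run_twice(task1);
}

int total_with_auto() {
    // auto keeps the closure type itself; no wrapper is involved.
    auto task2 = [] { return 21; };
    static_assert(!std::is_same<decltype(task2), std::function<int()>>::value,
                  "the closure type is distinct from std::function");
    return run_twice(task2);
}
```

Both functions return 42. The difference lies in how the call is dispatched, which shows up in a benchmark or in the generated assembly rather than in the result.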