How are C++11 lambdas represented and passed?

Posted 2019-01-23 02:11

Question:

Passing a lambda is really easy in C++11:

func( []( int arg ) {
  // code
} ) ;

But I'm wondering, what is the cost of passing a lambda to a function like this? What if func passes the lambda to other functions?

void func( function< void (int arg) > f ) {
  doSomethingElse( f ) ;
}

Is the passing of the lambda expensive? Since a function object can be assigned 0,

function< void (int arg) > f = 0 ; // 0 means "not init" 

it leads me to think that function objects act somewhat like pointers. But since new is not used, they might instead be like value-typed structs or classes, which default to stack allocation and member-wise copy.

How is a C++11 "code body" and group of captured variables passed when you pass a function object "by value"? Is there a lot of excess copying of the code body? Do I have to mark each function object passed with const& so that a copy is not made:

void func( const function< void (int arg) >& f ) {
}

Or do function objects somehow pass differently than regular C++ structs?

Answer 1:

Disclaimer: my answer is somewhat simplified compared to the reality (I put some details aside) but the big picture is here. Also, the Standard does not fully specify how lambdas or std::function must be implemented internally (the implementation has some freedom) so, like any discussion on implementation details, your compiler may or may not do it exactly this way.

But again, this is a subject quite similar to VTables: the Standard doesn't mandate much but any sensible compiler is still quite likely to do it this way, so I believe it is worth digging into it a little. :)


Lambdas

The most straightforward way to implement a lambda is as a kind of anonymous struct:

auto lambda = [](Args...) -> Return { /*...*/ };

// roughly equivalent to:
struct {
    Return operator ()(Args...) const { /*...*/ } // const unless the lambda is mutable
}
lambda; // instance of the anonymous struct

Just like any other class, when you pass its instances around you never have to copy the code, just the actual data (here, none at all).
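To make the "no data to copy" point concrete, here is a minimal sketch. The names call_with_21, Doubler, captureless_closure_is_empty and run_lambda are made up for the illustration. The emptiness check reflects what mainstream compilers do in practice; the Standard does not strictly guarantee it.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: takes any callable by value and calls it once.
template <class F>
int call_with_21(F f) { return f(21); }

// Hand-written functor doing the same job as the lambda below.
struct Doubler {
    int operator()(int x) const { return 2 * x; }
};

// Reports whether the closure type of a captureless lambda has no data members.
bool captureless_closure_is_empty() {
    auto lambda = [](int x) { return 2 * x; };
    return std::is_empty<decltype(lambda)>::value;
}

int run_lambda() {
    auto lambda = [](int x) { return 2 * x; };
    // Passing by value copies the (data-free) closure object, never the code.
    int result = call_with_21(lambda);
    // The lambda and the hand-written struct behave identically.
    assert(result == call_with_21(Doubler{}));
    return result;
}
```

Since call_with_21 is a template, the compiler knows the exact closure type and can usually inline the call, so passing such a lambda by value typically costs nothing at all.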


Objects captured by value are copied into the struct:

Value v;
auto lambda = [=](Args...) -> Return { /*... use v, captured by value...*/ };

// roughly equivalent to:
struct Temporary { // note: we can't make it an anonymous struct any more since we need
                   // a constructor, but that's just a syntax quirk

    Value v; // note: operator() is const unless the lambda is mutable, so v cannot be modified by default
    Temporary(Value v_) : v(v_) {}
    Return operator ()(Args...) const { /*... use v, captured by value...*/ }
}
lambda(v); // instance of the struct

Again, passing it around only means that you pass the data (v) not the code itself.
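As a rough way to see this, the sketch below counts the copies of a captured object. Value here is a hypothetical instrumented type (not the Value from the snippet above), and pass_by_value, pass_by_const_ref, CopyReport and measure_closure_copies are names invented for the example.

```cpp
#include <cassert>

// Hypothetical value type that counts how many times it has been copied.
struct Value {
    static int copies;
    int payload;
    explicit Value(int p) : payload(p) {}
    Value(const Value& other) : payload(other.payload) { ++copies; }
};
int Value::copies = 0;

template <class F> int pass_by_value(F f) { return f(); }
template <class F> int pass_by_const_ref(const F& f) { return f(); }

struct CopyReport { int by_value; int by_const_ref; };

CopyReport measure_closure_copies() {
    Value v(7);
    auto lambda = [v]() { return v.payload; };  // v is copied into the closure here
    assert(sizeof(lambda) >= sizeof(Value));    // the closure holds a Value member

    int start = Value::copies;
    pass_by_value(lambda);       // copies the closure, hence its Value member
    int middle = Value::copies;
    pass_by_const_ref(lambda);   // no copy at all
    int end = Value::copies;
    return { middle - start, end - middle };
}
```

Copying the closure costs exactly what copying its captured members costs, just as with a hand-written struct.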


Likewise, objects captured by reference are stored as references in the struct:

Value v;
auto lambda = [&](Args...) -> Return { /*... use v, captured by reference...*/ };

// roughly equivalent to:
struct Temporary {
    Value& v; // note: capture by reference is non-const
    Temporary(Value& v_) : v(v_) {}
    Return operator ()(Args...) const { /*... use v, captured by reference...*/ }
}
lambda(v); // instance of the struct
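A small sketch of the consequence: copying a closure that captures by reference only copies the reference, so every copy acts on the same original object. The names call_twice and increment_through_copy are made up for the illustration.

```cpp
#include <cassert>

// Hypothetical helper: receives its own copy of the closure and calls it twice.
template <class F>
void call_twice(F f) { f(); f(); }

int increment_through_copy() {
    int counter = 0;
    auto bump = [&counter]() { ++counter; };  // the closure stores a reference to counter
    call_twice(bump);                         // the copy still refers to the same counter
    assert(counter == 2);
    return counter;
}
```

The flip side is lifetime: such a closure must not outlive the variables it refers to, otherwise the stored reference dangles.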

That's pretty much all there is to lambdas themselves (apart from the few implementation details I omitted, which are not relevant to understanding how they work).


std::function

std::function is a generic wrapper around any kind of functor (lambdas, standalone/static/member functions, functor classes like the ones I showed, ...).

The internals of std::function are pretty complicated because they must support all those cases. Depending on the exact type of functor this requires at least the following data (give or take implementation details):

  • A pointer to a standalone/static function.

Or,

  • A pointer to a copy [see note below] of the functor (dynamically allocated to allow any type of functor, as you rightly noted).
  • A pointer to the member function to be called.
  • A pointer to an allocator that is able to both copy the functor and itself (since any type of functor can be used, the pointer-to-functor should be void* and thus there has to be such a mechanism -- probably using polymorphism aka. base class + virtual methods, the derived class being generated locally in the template<class Functor> function(Functor) constructors).

Since it doesn't know beforehand which kind of functor it will have to store (and this is made obvious by the fact that std::function can be reassigned) then it has to cope with all possible cases and make the decision at runtime.
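The "base class + virtual methods" mechanism can be sketched roughly as follows. MiniFunction is a hypothetical, heavily simplified stand-in for std::function<int(int)>; it is not how any particular standard library implements it (real implementations add, among other things, a small-buffer optimization that avoids the heap allocation for small functors).

```cpp
#include <cassert>
#include <memory>
#include <utility>

class MiniFunction {
    // Type-erased interface: knows how to call and how to copy the functor.
    struct Concept {
        virtual ~Concept() {}
        virtual int invoke(int arg) = 0;
        virtual std::unique_ptr<Concept> clone() const = 0;
    };
    // One derived class is generated per functor type F.
    template <class F>
    struct Model : Concept {
        F functor;
        explicit Model(F f) : functor(std::move(f)) {}
        int invoke(int arg) override { return functor(arg); }
        std::unique_ptr<Concept> clone() const override {
            return std::unique_ptr<Concept>(new Model(functor));
        }
    };
    std::unique_ptr<Concept> impl_;  // dynamically allocated copy of the functor

public:
    MiniFunction() {}
    template <class F>
    MiniFunction(F f) : impl_(new Model<F>(std::move(f))) {}
    MiniFunction(const MiniFunction& other) {
        if (other.impl_) impl_ = other.impl_->clone();  // deep copy, nothing shared
    }
    MiniFunction(MiniFunction&&) = default;
    MiniFunction& operator=(MiniFunction other) {
        impl_ = std::move(other.impl_);
        return *this;
    }
    int operator()(int arg) const {
        assert(impl_ && "calling an empty MiniFunction");
        return impl_->invoke(arg);
    }
};

// Copies of the wrapper own independent copies of the captured state.
std::pair<int, int> demo_independent_copies() {
    int calls = 0;
    MiniFunction f = [calls](int x) mutable { return x + calls++; };
    MiniFunction g = f;      // clones the stored closure
    f(10);                   // advances f's private counter only
    int from_f = f(10);
    int from_g = g(10);
    return std::make_pair(from_f, from_g);
}
```

Which functor type is stored is only known at runtime, through the virtual calls, which is why the wrapper can be reassigned to a completely different kind of callable.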

Note: I don't know where the Standard mandates it but this is definitely a new copy, the underlying functor is not shared:

int v = 0;
std::function<void()> f = [=]() mutable { std::cout << v++ << std::endl; };
std::function<void()> g = f;

f(); // 0
f(); // 1
g(); // 0
g(); // 1

So, when you pass a std::function around, it involves at least those four pointers (and indeed, on 64-bit GCC 4.7, sizeof(std::function<void()>) is 32, which is four 64-bit pointers) and optionally a dynamically allocated copy of the functor (which, as I already said, only contains the captured objects; you don't copy the code).
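One way to observe the fixed-size wrapper plus out-of-line storage is to wrap a closure with a large capture, as in this sketch. Sizes and compare_sizes are names invented for the example, and the exact value of sizeof(std::function<...>) depends on the standard library, so no specific number is assumed here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>

struct Sizes {
    std::size_t closure;  // size of the lambda's closure object
    std::size_t wrapper;  // size of the std::function object itself
    int result;           // value returned through the wrapper
};

Sizes compare_sizes() {
    std::array<int, 256> big = {};  // zero-initialised, at least 1 KiB of data
    big[3] = 5;
    auto lambda = [big]() { return big[3]; };  // the whole array lives in the closure
    std::function<int()> f = lambda;           // stores its own copy of the closure
    assert(f && "the wrapper holds a target");
    return { sizeof(lambda), sizeof(f), f() };
}
```

The wrapper stays a handful of pointers wide however much the lambda captures; the captured data has to live somewhere else, typically on the heap.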


Answer to the question

what is the cost of passing a lambda to a function like this?[context of the question: by value]

Well, as you can see it depends mainly on your functor (either a hand-made struct functor or a lambda) and the variables it contains. The overhead compared to directly passing a struct functor by value is quite negligible, but it is of course much higher than passing a struct functor by reference.

Should I have to mark each function object passed with const& so that a copy is not made?

I'm afraid this is very hard to answer in a generic way. Sometimes you'll want to pass by const reference, sometimes by value, sometimes by rvalue reference so that you can move it. It really depends on the semantics of your code.

The rules concerning which one you should choose are a totally different topic IMO, just remember that they are the same as for any other object.

Anyway, you now have all the keys to make an informed decision (again, depending on your code and its semantics).
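If you want to measure rather than guess, a sketch along these lines counts the copies of a captured object for each way of passing a std::function. Tracked, take_by_value, take_by_const_ref, PassCosts and measure_pass_costs are hypothetical names for the illustration.

```cpp
#include <cassert>
#include <functional>

// Hypothetical type that counts its own copies.
struct Tracked {
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

void take_by_value(std::function<void()> f) { f(); }
void take_by_const_ref(const std::function<void()>& f) { f(); }

struct PassCosts { int by_value; int by_const_ref; };

PassCosts measure_pass_costs() {
    Tracked t;
    std::function<void()> f = [t]() { (void)t; };  // t is copied into the closure
    assert(f && "the wrapper holds a target");

    int start = Tracked::copies;
    take_by_value(f);      // copies the wrapper and the closure stored inside it
    int middle = Tracked::copies;
    take_by_const_ref(f);  // passes a reference, nothing is copied
    int end = Tracked::copies;
    return { middle - start, end - middle };
}
```

Note that const& only avoids the copy when the caller already has a std::function. If the caller passes a lambda directly, a temporary std::function still has to be constructed from it, which copies or moves the closure.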



Answer 2:

See also C++11 lambda implementation and memory model

A lambda-expression is just that: an expression. Once compiled, it results in a closure object at runtime.

5.1.2 Lambda expressions [expr.prim.lambda]

The evaluation of a lambda-expression results in a prvalue temporary (12.2). This temporary is called the closure object.

The layout of the closure object itself is left to the implementation and may vary from compiler to compiler.
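One observable consequence is that every lambda-expression gets its own unnamed closure type, even when two lambdas are textually identical. A minimal sketch (the function name identical_lambdas_share_a_type is made up for the example):

```cpp
#include <cassert>
#include <type_traits>

bool identical_lambdas_share_a_type() {
    auto a = [](int x) { return x; };
    auto b = [](int x) { return x; };
    assert(a(1) == b(1));  // same behaviour...
    return std::is_same<decltype(a), decltype(b)>::value;  // ...but distinct types
}
```

This is why a lambda is normally held through auto, a template parameter, or a wrapper such as std::function: its type cannot be spelled out by name.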

Here is the original implementation of lambdas in clang https://github.com/faisalv/clang-glambda



Answer 3:

If the lambda can be made as a simple function (i.e. it does not capture anything), then it is made exactly the same way, especially as the Standard requires it to be convertible to an old-style pointer-to-function with the same signature. [EDIT: this is not accurate, see the discussion in the comments]
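What the Standard does provide for a captureless lambda is an implicit conversion from the closure object to a plain function pointer. A minimal sketch, where apply stands for a hypothetical C-style API that only accepts function pointers:

```cpp
#include <cassert>

// Hypothetical C-style API: accepts only a plain function pointer.
int apply(int (*fn)(int), int value) { return fn(value); }

int use_captureless_lambda() {
    auto add_one = [](int x) { return x + 1; };
    int (*fp)(int) = add_one;  // implicit conversion, available only without captures
    assert(fp != nullptr);
    return apply(fp, 41);
}
```

A lambda with a non-empty capture list has no such conversion, because a bare function pointer has nowhere to carry the captured data.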

For the rest, it is up to the implementation, but I would not worry in advance. The most straightforward implementation does nothing but carry the information around, exactly as much as you asked for in the capture. So the effect would be the same as if you had written the class by hand or used some std::bind variant.