Anonymous function C++

Posted 2019-04-04 06:23

Question:

I am trying to use the function signal(int,void(*)(int)) from <csignal> to handle the floating point exception SIGFPE. I'd like to be able to print some useful diagnostics besides just a message saying "Floating point exception" or something to that effect. This means the function I pass as the handler to signal needs access to some of the data in my code. Therein lies the rub.

The function must return void and accept only 1 parameter of type int. I cannot make the handler a member function of my data storage class since then the type would be void(Foo::*)(int) due to the hidden this pointer.
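The type difference described above can be checked at compile time. This is only a minimal sketch: the two members of Foo below are hypothetical names made up for illustration, not part of the original code.

```cpp
#include <type_traits>

// Hypothetical stand-in for the data storage class from the question.
struct Foo {
    void on_signal(int) {}                // non-static: has a hidden `this`
    static void on_signal_static(int) {}  // static: no hidden `this`
};

// A non-static member function yields a pointer-to-member type...
static_assert(std::is_same<decltype(&Foo::on_signal), void (Foo::*)(int)>::value,
              "non-static member has type void (Foo::*)(int)");
// ...whereas a static member function is an ordinary function pointer.
static_assert(std::is_same<decltype(&Foo::on_signal_static), void (*)(int)>::value,
              "static member has type void (*)(int)");

// True when only the static member is usable where void(*)(int) is expected.
bool only_static_member_fits() {
    return std::is_convertible<decltype(&Foo::on_signal_static), void (*)(int)>::value &&
           !std::is_convertible<decltype(&Foo::on_signal), void (*)(int)>::value;
}
```

A static member function would therefore be accepted by signal, but it has no object to work on, so the question of where the data comes from remains.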

I thought about using lambdas to try to make an anonymous function like this:

void handler(int nSig, Foo data)
{
    // do something
}
// snip
Foo data;
signal(SIGFPE, [&](int nSig)->void{handler(nSig,data);});

However, because the lambda captures the variable data from the enclosing scope, the compiler will not let it be converted to a void(*)(int) pointer (which is a shame, as this seems like an ideal use for lambdas).

I could simply make data a global variable which could then be seen in handler but I am loath to do this for obvious reasons.

So my question is this: what is the best way of mimicking anonymous functions in C++?

Note: I would prefer a native C++ solution and not to have to use boost or equivalent.

Answer 1:

There is no such thing as an anonymous function in C (C++ is irrelevant here, as the function must abide by the C calling convention).

The only thing you can do is (shiver) access globals from the handler, most likely global variables (and not constants, which would be fine).

I advise making those globals thread local to avoid multithreading issues, but it is still bad in the sense that global variables make for more brittle applications.


How to?

Note: as Luc Danton patiently explained to me, a signal may interrupt any non-atomic activity, and thus reading from a global is safe only if it is a lock-free atomic (or a few other things). Unfortunately, std::function may not be so, depending on your implementation. I will still leave this code to show how it could be done, provided that std::function accesses are atomic.
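For comparison, the kind of global access that is safe inside a handler can be sketched as follows. This is a minimal illustration, not part of the original answer: g_last_signal, record_signal and raise_and_read are hypothetical names, and the sketch assumes a platform where std::atomic<int> is always lock-free (checked by the static_assert).

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Refuse to compile unless std::atomic<int> is always lock-free here.
static_assert(ATOMIC_INT_LOCK_FREE == 2, "std::atomic<int> must be lock-free");

// Hypothetical global that the handler writes and normal code reads.
static std::atomic<int> g_last_signal{0};

// The handler does nothing except store into the lock-free atomic.
extern "C" void record_signal(int sig) {
    g_last_signal.store(sig, std::memory_order_relaxed);
}

// Installs the handler, raises `sig` synchronously, restores the previous
// handler, and returns the value the handler recorded.
int raise_and_read(int sig) {
    g_last_signal.store(0);
    auto previous = std::signal(sig, record_signal);
    assert(previous != SIG_ERR);
    std::raise(sig);  // the handler runs before raise() returns
    std::signal(sig, previous);
    return g_last_signal.load();
}
```

The diagnostics themselves are then printed from normal code after the flag has been observed, rather than from inside the handler.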

It is possible to create a trampoline that calls stateful code, isolating threads and allowing re-entrant calls.

typedef std::function<void(int)> SignalHandlerType;

extern thread_local SignalHandlerType SignalHandler;

And we create the following accessor (passed to signal):

void handle_signal(int const i) {
    if (SignalHandler) { SignalHandler(i); }
}

as well as the following RAII setter:

class SignalSetter: boost::noncopyable {
public:
    SignalSetter(int signal, SignalHandlerType&& sh):
        signal(signal), chandler(0), handler(sh)
    {
        chandler = std::signal(signal, &handle_signal);
        swap(SignalHandler, handler);
    }

    ~SignalSetter() {
        std::signal(signal, chandler);
        swap(SignalHandler, handler);
    }

private:
    typedef void(*CHandlerType)(int);

    int signal;
    CHandlerType chandler;
    SignalHandlerType handler;
};

Note: both the global variable and the handle_signal could be private to the SignalSetter class... but since std::signal is not...

Expected usage:

int main(int argc, char* argv[]) {
    SignalSetter setter(SIGFPE, [argc, argv](int /* nSig */) {
        std::cout << argc << ": " << argc << std::endl;
    });

    // do what you want.
}


Answer 2:

This is indeed a good question. Let's figure out what is going on before blaming C++, though. Just think about how lambdas are implemented.

The simplest lambda is one that captures no data. In that case, it can be implicitly converted to a plain function pointer. For example, a lambda like this:

[] (int p0) {}

is equivalent to a simple function:

void foo(int p0)
{
}

That works perfectly when you want the lambda to become a function pointer. For example:

#include <string>
#include <csignal>
#include <iostream>

int main()
{
    int ret;
    signal(SIGINT, [](int signal) {
            std::cout << "Got signal " << signal << std::endl;
        });
    std::cin >> ret;
    return ret;
}

So far so good. But now you want to associate some data with your signal handler (by the way, the code above is undefined behavior as you can only execute signal-safe code inside a signal handler). So you want a lambda like:

#include <string>
#include <csignal>
#include <iostream>

struct handler_context {
    std::string code;
    std::string desc;
};

int main()
{
    int ret;
    handler_context ctx({ "SIGINT", "Interrupt" });
    signal(SIGINT, [&](int signal) {
            std::cout << "Got signal " << signal
                      << " (" << ctx.code << ": " << ctx.desc
                      << ")\n" << std::flush;
        });
    std::cin >> ret;
    return ret;
}
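The capturing version above is rejected by the compiler, because only a captureless lambda has a conversion to a function pointer. A minimal sketch that checks both cases with type traits (the alias c_handler and the two function names are made up for this illustration):

```cpp
#include <type_traits>

// The handler type that signal() expects.
using c_handler = void (*)(int);

// A captureless lambda converts implicitly to a plain function pointer.
bool captureless_converts() {
    auto plain = [](int) {};
    (void)plain;
    return std::is_convertible<decltype(plain), c_handler>::value;
}

// A capturing lambda carries state in its closure object, so no such
// conversion exists.
bool capturing_converts() {
    int counter = 0;
    auto stateful = [&counter](int) { ++counter; };
    (void)stateful;
    return std::is_convertible<decltype(stateful), c_handler>::value;
}
```

The first function returns true and the second returns false, which is exactly the difference between the two signal calls shown so far.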

Let's forget for a moment about the syntactic sugar of C++ lambdas. It is no secret that you can "mimic" a lambda even in C or assembler. So how would that look, actually? A "lambda" in C style could look like this (this is still C++):

#include <string>
#include <cstdlib>
#include <iostream>

/*
 * This is a context associated with our lambda function.
 * Some dummy variables, for the sake of example.
 */
struct lambda_captures {
    int v0;
    int v1;
};

static int lambda_func(int p0, void *ctx) // <-- This is our lambda "function".
{
    lambda_captures *captures = (lambda_captures *)ctx;
    std::cout << "Got " << p0 << " (ctx: "
              << captures->v0 << ", " << captures->v1
              << ")\n" << std::flush;
    return 0;
}

// Below is an example of API function provided to the user that can
// invoke a callback supplied by the user.
static void some_api_function(int (*callback)(int p, void *data), void *data)
{
    callback(12345, data);
    callback(98765, data);
}

int main()
{
    lambda_captures captures;
    captures.v0 = 1986;
    captures.v1 = 2012;

    some_api_function(lambda_func, (void *)&captures);

    return EXIT_SUCCESS;
}

The above is C style. C++ tends to pass the "context" as "this", which is always an implicit first argument. If our API supported passing "data" as the first argument, we could cast a pointer to member function (PMF) to a plain function pointer and write something like this (note that this cast is not standard C++; GCC accepts it as an extension):

#include <string>
#include <cstdlib>
#include <iostream>

struct some_class {
    int v0;
    int v1;

    int func(int p0)
    {
        std::cout << "Got " << p0 << " (ctx: "
                  << v0 << ", " << v1
                  << ")\n" << std::flush;
        return p0;
    }
};

static void some_api_function(int (*callback)(void *data, int p), void *data)
{
    callback(data, 12345);
    callback(data, 98765);
}

int main()
{
    typedef int (*mpf_type)(void *, int);

    some_class clazz({ 1986, 2012 }); // <- Note a bit of a Java style :-)
    some_api_function((mpf_type)&some_class::func, (void *)&clazz);

    return EXIT_SUCCESS;
}
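Since the cast from a pointer to member function is not portable, a standard-conforming alternative is a small forwarding function (often called a thunk) that recovers the object from the "data" pointer. This is only a sketch under that assumption: some_class_thunk and run_thunk_demo are hypothetical names, and the API is changed to return the callback results so they can be inspected.

```cpp
#include <vector>

struct some_class {
    int v0;
    int v1;

    // The member we actually want the API to call.
    int func(int p0) { return p0 + v0 + v1; }
};

// Hypothetical thunk: an ordinary function with the signature the API
// expects. It casts the opaque pointer back and forwards the call.
static int some_class_thunk(void *data, int p) {
    return static_cast<some_class *>(data)->func(p);
}

// Same shape as the API above, but returning the callback results.
static std::vector<int> some_api_function(int (*callback)(void *data, int p), void *data) {
    return { callback(data, 12345), callback(data, 98765) };
}

std::vector<int> run_thunk_demo() {
    some_class obj{1986, 2012};
    return some_api_function(&some_class_thunk, &obj);
}
```

The thunk costs one extra call, but it does not depend on how a particular compiler represents member function pointers.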

In the above two examples, note that "data" is always passed around. This is very important. If the API that is supposed to invoke your callback does not accept a "void *" pointer that is passed back to your callback somehow, there is no way you can associate any context with the callback. The only exception is global data. For example, this API is bad:

#include <string>
#include <cstdlib>
#include <iostream>

struct lambda_captures {
    int v0;
    int v1;
};

static int lambda_func(int p0)
{
/*
    // WHERE DO WE GET OUR "lambda_captures" OBJECT FROM????
    lambda_captures *captures = (lambda_captures *)ctx;
    std::cout << "Got " << p0 << " (ctx: "
              << captures->v0 << ", " << captures->v1
              << ")\n" << std::flush;
*/
    return 0;
}

// Below is an example of API function provided to the user that can
// invoke a callback supplied by the user.
static void some_api_function(int (*callback)(int p))
{
    callback(12345);
    callback(98765);
}

int main()
{
    lambda_captures captures;
    captures.v0 = 1986;
    captures.v1 = 2012;

    some_api_function(lambda_func /* How do we pass a context??? */);

    return EXIT_SUCCESS;
}

That being said, the old signal API is exactly like that. The only way to work around the problem is to put your "context" into global scope. Then the signal handler function can access it because the address is well known, for example:

#include <string>
#include <cstdlib>
#include <iostream>

struct lambda_captures {
    int v0;
    int v1;
};

lambda_captures captures({ 1986, 2012 }); // Voila!

static int lambda_func(int p0)
{
    std::cout << "Got " << p0 << " (ctx: "
              << captures.v0 << ", " << captures.v1
              << ")\n" << std::flush;
    return 0;
}

// Below is an example of API function provided to the user that can
// invoke a callback supplied by the user.
static void some_api_function(int (*callback)(int p))
{
    callback(12345);
    callback(98765);
}

int main()
{
    some_api_function(lambda_func);

    return EXIT_SUCCESS;
}

This is what people have to deal with, and not only with the signals API. It applies to other things as well, such as interrupt handler processing, but that is low-level programming where you have to deal with hardware. Of course, providing this sort of API in user space was not the best idea. And I will mention it again: there is only a small set of things you can do in a signal handler. You can only call async-signal-safe functions.

Of course, the old API is not going away anytime soon because it is part of the POSIX standard. However, developers recognize the problem and there are better ways to handle signals. On Linux, for example, you can use signalfd to receive signals through a file descriptor, associate that descriptor with arbitrary context in your event loop, and do whatever you want when it becomes readable, outside of signal-handler context.
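On Linux, the file-descriptor approach is provided by signalfd. The following is a minimal, Linux-only sketch: the function name read_signal_via_fd is made up, and the signal is raised by the function itself purely for demonstration. A multithreaded program would use pthread_sigmask instead of sigprocmask.

```cpp
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

// Blocks `sig`, raises it, and reads it back through a signalfd.
// Returns the signal number that was read, or -1 on failure.
int read_signal_via_fd(int sig) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, sig);

    // The signal must be blocked, otherwise it is delivered the usual way.
    sigset_t old_mask;
    if (sigprocmask(SIG_BLOCK, &mask, &old_mask) == -1) return -1;

    int fd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (fd == -1) {
        sigprocmask(SIG_SETMASK, &old_mask, nullptr);
        return -1;
    }

    raise(sig);  // stays pending because the signal is blocked

    // Reading consumes the pending signal. This is ordinary code, so any
    // function may be called here, not only async-signal-safe ones.
    signalfd_siginfo info;
    ssize_t n = read(fd, &info, sizeof info);

    close(fd);
    sigprocmask(SIG_SETMASK, &old_mask, nullptr);
    return n == static_cast<ssize_t>(sizeof info) ? static_cast<int>(info.ssi_signo) : -1;
}
```

In a real program the descriptor would typically be registered with poll or epoll next to the program's other descriptors, and the context would be whatever the event loop associates with that descriptor.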

At any rate, let's get back to the lambda you were playing with. The problem is not with C++, but with the signals API, which leaves no way for you to pass a context except through a global variable. That being said, it works with lambdas too:

#include <string>
#include <cstdlib>
#include <csignal>
#include <iostream>

struct some_data {
    std::string code;
    std::string desc;
};

static some_data data({ "SIGINT", "Interrupt" });

int main()
{
    signal(SIGINT, [](int signal) {
            std::cout << "Got " << signal << " (" << data.code << ", "
                      << data.desc << ")\n" << std::flush;
        });
    return EXIT_SUCCESS;
}

Therefore, there is no shame in what C++ is doing here, as it does the right thing.



Answer 3:

You cannot easily create a new static function at run time, although some JIT compiler libraries are able to do this. If you need only a reasonable number of pointers, you can create a pool of static functions by specializing a template.
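The template pool idea can be sketched roughly like this. It is an illustration only: kPoolSize, g_slots, trampoline and bind_to_pool are hypothetical names, the pool is fixed at four entries, and slots are never released. Calling a std::function from a real signal handler is also subject to the safety caveats from Answer 1.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <utility>

constexpr std::size_t kPoolSize = 4;  // hypothetical pool size
using Callback = std::function<void(int)>;
using CFunction = void (*)(int);

// One stateful callback per pool slot.
static std::array<Callback, kPoolSize> g_slots;

// Each specialization is a distinct plain function bound to one slot.
template <std::size_t Slot>
void trampoline(int arg) {
    if (g_slots[Slot]) g_slots[Slot](arg);
}

static const std::array<CFunction, kPoolSize> g_trampolines = {{
    &trampoline<0>, &trampoline<1>, &trampoline<2>, &trampoline<3>
}};

// Stores `cb` in the first free slot and returns the matching plain
// function pointer, or nullptr when the pool is exhausted.
CFunction bind_to_pool(Callback cb) {
    for (std::size_t i = 0; i < kPoolSize; ++i) {
        if (!g_slots[i]) {
            g_slots[i] = std::move(cb);
            return g_trampolines[i];
        }
    }
    return nullptr;
}
```

Each pointer returned by bind_to_pool has the plain void(*)(int) type, so it can be handed to any C API that lacks a user-data parameter.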

So the easiest way is to wrap C++ functors in a static function. The problem here is that there is no user-data parameter. There is only one parameter, the signal number. Since there are only a limited number of signals (64 on Linux), you can create a static array of std::function< void(int) > and call the entry matching the signal number. A simple example:

typedef std::function< void(int) > SignalFunc;

static std::array< SignalFunc, 64 > signalsFunc;

static void cHandler(int nSig)
{
    signalsFunc.at(nSig)(nSig);
}

SignalFunc RegisterSystemSignal( int sig, SignalFunc func )
{
    if( signal( sig, func ? &cHandler : (sighandler_t)SIG_DFL ) != SIG_ERR )
    { 
        func.swap( signalsFunc.at( sig ) );
        return func;
    }
    throw some_error();
}

So now you can do that:

RegisterSystemSignal(SIGFPE, [&](int nSig)->void{handler(nSig,data);});

There is also sigaction, which has more features.
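One of those features is the SA_SIGINFO flag, which passes a siginfo_t with extra details to the handler (for SIGFPE, si_code distinguishes causes such as integer division by zero). A minimal POSIX sketch, with hypothetical names and a signal raised by the function itself for demonstration:

```cpp
#include <signal.h>

// Values recorded by the handler; volatile sig_atomic_t is safe to write
// from a signal handler.
static volatile sig_atomic_t g_seen_signo = 0;
static volatile sig_atomic_t g_seen_code = 0;

// Three-argument handler form enabled by SA_SIGINFO.
extern "C" void on_signal_info(int sig, siginfo_t *info, void * /* ucontext */) {
    g_seen_signo = sig;
    g_seen_code = info ? info->si_code : 0;
}

// Installs the handler with sigaction, raises `sig`, restores the previous
// action, and returns the signal number the handler saw (-1 on failure).
int raise_with_sigaction(int sig) {
    struct sigaction action = {};
    struct sigaction previous = {};
    action.sa_sigaction = on_signal_info;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);

    if (sigaction(sig, &action, &previous) == -1) return -1;
    g_seen_signo = 0;
    raise(sig);  // the handler runs before raise() returns
    sigaction(sig, &previous, nullptr);
    return g_seen_signo;
}
```

Note that sigaction still takes no user-data pointer, so the context problem discussed in this thread remains; it only provides more information about the signal itself.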