Automatic memoisation with support for recursive f

2019-06-12 10:19发布

问题:

The blog post Automatic Memoization in c++0x provides an function for producing a memoized version of an existing function. The blog post and the associated code have been discussed previously on stackoverflow (e.g. What does this C++11 code do?), however, none of these solutions is able to provide a fully universal memoizer which is able to correctly memoize recursive functions as well.

Sure, there is the trick of changing the recursive call by using something like this (assuming we have a memoizer such as the one presented in the blog post called memoize already in place):

std::function<int (int)> f;
int fib(int n) {
  if  (n < 2) return n;
  return f(n-1) + f(n-2);
}

int main(void) {
  f = memoize(std::function<int (int)>(fib));
}

But this feels more like a workaround than a proper solution because we still need access to the function which we want to memoize. A proper solution should be able to fully memoize any function including the ones which are defined in some library. However, producing such a solution, so it seems, is beyond my reach (assuming it is possible), therefore I'm asking:

Is a truly universal memoise function possible?
How can one achieve such a feat?

And if this is not possible, is there at least a way to generalise the above approach. Something along the lines of (does not compile and is not valid C++):

int fib(int n){
  if  (n < 2) return n;
  return this_func(n-1) + this_func(n-2);
}

Where this_func is something which is similar to the this pointer of a class but for a function. [Edit: This would probably still suffer from the problem that the this_func pointer would point to fib instead of the memoized fib]

回答1:

As the cache needs to be shared across function calls, you'd either have to pass it as an argument, or share it otherwise. One way to share it is using a function object:

struct fib
{
    std::map<std::tuple<int>, int> cache;

    int operator()(int n)
    {
        if(n < 2) return n;

        auto memoize = [this](int p)
        {
            auto i = cache.find(p);
            if(i == cache.end()) i = cache.insert({p, (*this)(p)}).first;
            return i->second;
        };

        return memoize(n-1) + memoize(n-2);
    }
};

Where you can factor out the memoize part.

There's also a trick with temporary lifetime to pass the memoized function as an argument; something like this:

struct recurse // possibly a class template
{
    std::function<int(int, recurse const&)> f; // possibly `mutable`

    template<class T>
    recurse(T&& p) : f( memoize(decltype(f){p}) )
    {}

    int operator()(int x) const
    {
        return f(x, *this);
    }
};

int fib(int n, recurse const& f);

int fib(int n, recurse const& f = {fib})
{
    if(n < 2) return n;
    return f(n-1) + f(n-2); // or `fib(n-1, f) + fib(n-2, f)`
}

Yet, this requires a change of memoize, as the recurse const& cannot (and shouldn't) be part of the internal map.

N.B. those const& could also be && for lifetime extension, yet, that might be confusing due to move semantics



回答2:

Following may help, but it is still a hack, and it is ugly... :

int fib(void* f, int n)
{
  if (n < 2) return n;
  auto this_func = *reinterpret_cast<std::function<int (void*, int)>*>(f);
  return this_func(f, n - 1) + this_func(f, n - 2);
}


int main(int argc, char *argv[])
{
    std::function<int (void*, int n)> fib1 = &fib;
    std::cout << fib1(reinterpret_cast<void*>(&fib1), 5) << std::endl;

    auto f = memoize(fib1);
    std::cout << f(reinterpret_cast<void*>(&f), 5) << std::endl;

    return 0;
}

I use void* since the correct signature is recursive :-/



回答3:

My guess is that this is not possible while staying within the bounds of defined behavior, because there's no way to abstract out the recursive call. In fact, I can't think of anything significantly better than the global variable version that you provided.

One obvious idea to provide an abstraction point more robust than a global variable would be to add the function to recurse on as a first argument:

int fib(std::function<int (int)> rec, int n)
{
    if (n < 2) return n;
    return rec(n - 1) + rec(n - 2);
}

Then, you could modify your memoize function to pass the memoized version into the first argument, and things should just work. This trick is often used to accomplish the same goal in (unityped/untyped) functional languages like scheme.

However, this sort of trick relies on something called the "Y combinator" which I think can't exist in C++ as it has an infinite type.