Any pointers on how to efficiently compute the following function in Haskell, for large numbers (n > 10^8)?
f(n) = max(n, f(n/2) + f(n/3) + f(n/4))
I've seen examples of memoization in Haskell to solve fibonacci numbers, which involved computing (lazily) all the fibonacci numbers up to the required n. But in this case, for a given n, we only need to compute very few intermediate results.
Thanks
We can do this very efficiently by making a structure that we can index in sub-linear time.
But first, some preliminaries.

Let's define `f`, but make it use 'open recursion' rather than call itself directly. You can get an unmemoized `f` by using `fix f`. This will let you test that `f` does what you mean for small values of `n` by calling, for example: `fix f 123 = 144`.
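The original code block was lost in extraction; a plausible reconstruction of the open-recursion definition:

```haskell
import Data.Function (fix)

-- 'f' in open-recursion style: recursive calls go through the extra
-- argument 'mf' instead of directly to 'f' itself, so a caller can
-- later substitute a memoizing version of the function.
f :: (Int -> Int) -> Int -> Int
f mf 0 = 0
f mf n = max n $ mf (n `div` 2) + mf (n `div` 3) + mf (n `div` 4)
```

`fix f` ties the recursion back to itself without any caching, which is slow but handy for testing.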
We could memoize this by defining:
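One plausible reconstruction of that definition (the open-recursion `f` is repeated so the block stands alone):

```haskell
-- Open-recursion f, repeated so this block stands alone.
f :: (Int -> Int) -> Int -> Int
f mf 0 = 0
f mf n = max n $ mf (n `div` 2) + mf (n `div` 3) + mf (n `div` 4)

-- Memoize by storing every result in a lazy list and looking
-- answers up by position; each cell is computed at most once.
f_list :: [Int]
f_list = map (f faster_f) [0..]

faster_f :: Int -> Int
faster_f n = f_list !! n
```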
That performs passably well, and replaces what was going to take O(n^3) time with something that memoizes the intermediate results.
But it still takes linear time just to index to find the memoized answer for `mf`. This means that results for moderately large arguments are tolerable, but the approach doesn't scale much better than that. We can do better!
First, let's define an infinite tree:
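A sketch of such a tree, with a `Functor` instance so we can map a function over every node:

```haskell
-- An infinite binary tree with a value at every node.
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
  fmap g (Tree l m r) = Tree (fmap g l) (g m) (fmap g r)
```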
And then we'll define a way to index into it, so we can find a node with index `n` in O(log n) time instead. And we may find a tree full of natural numbers to be convenient, so we don't have to fiddle around with those indices:
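Plausible definitions of both (the `Tree` type from above is repeated so the block stands alone):

```haskell
-- Repeated from above so this block stands alone.
data Tree a = Tree (Tree a) a (Tree a)

-- Find the node with index n in O(log n) steps: index 0 is the root,
-- and each recursive step halves the remaining index.
index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, 1) -> index r q

-- A tree whose node at index n contains n itself.
nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree (go (n + s) s') n (go (n + s + s) s')
      where
        s' = 2 * s
```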
Since we can index, you can just convert a tree into a list:
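For example (the earlier definitions are repeated so the block is runnable on its own):

```haskell
data Tree a = Tree (Tree a) a (Tree a)

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, 1) -> index r q

nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree (go (n + s) s') n (go (n + s + s) s')
      where
        s' = 2 * s

-- Convert a tree to a list by indexing every natural number in turn.
toList :: Tree a -> [a]
toList as = map (index as) [0..]
```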
You can check the work so far by verifying that `toList nats` gives you `[0..]`.
Now we can memoize `f` through the tree instead of the list. This works just like the list version above, but instead of taking linear time to find each node, it can chase it down in logarithmic time.
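A sketch of that tree-backed memoization (all earlier definitions repeated so the block stands alone; the names `f_tree` and `fastest_f` mirror the `f_list`/`faster_f` pattern from before):

```haskell
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
  fmap g (Tree l m r) = Tree (fmap g l) (g m) (fmap g r)

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, 1) -> index r q

nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree (go (n + s) s') n (go (n + s + s) s')
      where
        s' = 2 * s

f :: (Int -> Int) -> Int -> Int
f mf 0 = 0
f mf n = max n $ mf (n `div` 2) + mf (n `div` 3) + mf (n `div` 4)

-- Memoize f over the tree: the node at index n caches the answer for n.
f_tree :: Tree Int
f_tree = fmap (f fastest_f) nats

fastest_f :: Int -> Int
fastest_f = index f_tree
```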
The result is considerably faster. In fact, it is so much faster that you can go through and replace `Int` with `Integer` above and get ridiculously large answers almost instantaneously.

As stated in Edward Kmett's answer, to speed things up, you need to cache costly computations and be able to access them quickly.
To keep the function non-monadic, the solution of building an infinite lazy tree, with an appropriate way to index it (as shown in previous posts), fulfills that goal. If you give up the non-monadic nature of the function, you can use the standard associative containers available in Haskell in combination with “state-like” monads (like State or ST). While the main drawback is that you get a monadic function, you do not have to index the structure yourself anymore, and can just use standard implementations of associative containers.
To do so, you first need to re-write your function to accept any kind of monad:
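A monadic open-recursion version might look like this (reconstructed; the exact code was lost in extraction):

```haskell
-- Open recursion again, but recursive calls run in an arbitrary monad,
-- so the caller decides how (and whether) to cache results.
f :: Monad m => (Int -> m Int) -> Int -> m Int
f _  0 = return 0
f mf n = do
  a <- mf (n `div` 2)
  b <- mf (n `div` 3)
  c <- mf (n `div` 4)
  return $ max n (a + b + c)
```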
For your tests, you can still define a function that does no memoization using Data.Function.fix, although it is a bit more verbose:
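For example, using the `Identity` monad to recover a pure function (the monadic `f` is repeated so the block stands alone; the name `fNoMemo` is an assumption):

```haskell
import Data.Function (fix)
import Data.Functor.Identity (runIdentity)

f :: Monad m => (Int -> m Int) -> Int -> m Int
f _  0 = return 0
f mf n = do
  a <- mf (n `div` 2)
  b <- mf (n `div` 3)
  c <- mf (n `div` 4)
  return $ max n (a + b + c)

-- No memoization: tie the knot with fix and unwrap the Identity monad.
fNoMemo :: Int -> Int
fNoMemo = runIdentity . fix f
```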
You can then use the State monad in combination with Data.Map to speed things up:
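A sketch of the State-plus-Map approach (assuming the mtl and containers packages; the monadic `f` is repeated so the block stands alone, and `fMap` is an assumed name):

```haskell
import           Control.Monad.State
import qualified Data.Map as M

f :: Monad m => (Int -> m Int) -> Int -> m Int
f _  0 = return 0
f mf n = do
  a <- mf (n `div` 2)
  b <- mf (n `div` 3)
  c <- mf (n `div` 4)
  return $ max n (a + b + c)

-- Cache results in a Data.Map threaded through the State monad.
fMap :: Int -> Int
fMap n = evalState (mf n) M.empty
  where
    mf k = do
      cache <- get
      case M.lookup k cache of
        Just v  -> return v
        Nothing -> do
          v <- f mf k
          modify (M.insert k v)
          return v
```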
With minor changes, you can adapt the code to work with Data.HashMap instead:
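For instance (assuming the unordered-containers package; only the container changes relative to the Data.Map version, and `fHashMap` is an assumed name):

```haskell
import           Control.Monad.State
import qualified Data.HashMap.Strict as H

f :: Monad m => (Int -> m Int) -> Int -> m Int
f _  0 = return 0
f mf n = do
  a <- mf (n `div` 2)
  b <- mf (n `div` 3)
  c <- mf (n `div` 4)
  return $ max n (a + b + c)

-- Same shape as the Data.Map version, with a HashMap as the cache.
fHashMap :: Int -> Int
fHashMap n = evalState (mf n) H.empty
  where
    mf k = do
      cache <- get
      case H.lookup k cache of
        Just v  -> return v
        Nothing -> do
          v <- f mf k
          modify (H.insert k v)
          return v
```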
Instead of persistent data structures, you may also try mutable data structures (like Data.HashTable) in combination with the ST monad:
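A sketch using the hashtables package (the monadic `f` is repeated so the block stands alone; `fHashTable` is an assumed name):

```haskell
import           Control.Monad.ST
import qualified Data.HashTable.ST.Basic as H

f :: Monad m => (Int -> m Int) -> Int -> m Int
f _  0 = return 0
f mf n = do
  a <- mf (n `div` 2)
  b <- mf (n `div` 3)
  c <- mf (n `div` 4)
  return $ max n (a + b + c)

-- A mutable hash table in the ST monad: same interface as before,
-- but the cache is updated in place.
fHashTable :: Int -> Int
fHashTable n = runST $ do
  ht <- H.new
  let mf k = do
        cached <- H.lookup ht k
        case cached of
          Just v  -> return v
          Nothing -> do
            v <- f mf k
            H.insert ht k v
            return v
  mf n
```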
Compared to the implementation without any memoization, any of these implementations allows you, for huge inputs, to get results in microseconds instead of having to wait several seconds.
Using Criterion to benchmark, I could observe that the Data.HashMap implementation actually performed slightly better (around 20%) than Data.Map and Data.HashTable, for which the timings were very similar.
I found the results of the benchmark a bit surprising. My initial feeling was that the HashTable would outperform the HashMap implementation because it is mutable. There might be some performance defect hidden in this last implementation.
Not the most efficient way, but does memoize:
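The definition referred to here is presumably along these lines (`f` is the list of results, `g` does the real work; the walkthrough below refers to both names):

```haskell
-- f is an infinite list whose cell at position n holds the answer for n.
f :: [Int]
f = map g [0..]

-- g consults the list for its recursive calls, so every result is
-- computed at most once.
g :: Int -> Int
g 0 = 0
g n = max n $ f !! (n `div` 2) + f !! (n `div` 3) + f !! (n `div` 4)
```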
When requesting `f !! 144`, it is checked that `f !! 143` exists, but its exact value is not calculated. It's still set as some unknown result of a calculation. The only exact values calculated are the ones needed.

So initially, as far as how much has been calculated, the program knows nothing.
When we make the request `f !! 12`, it starts doing some pattern matching to walk the list out to position 12, and then starts calculating `g 12`. This recursively makes further demands on `f`, down to the smallest cases, whose values are immediate. Then exact values trickle back up: the program learns `f !! 0` and `f !! 1`, then `f !! 3` and `f !! 2`, then finishes the calculation of `f !! 6`, then `f !! 4`, and finally `f !! 12` itself. After each step, the program knows exactly the values computed so far, and nothing more.
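The individual steps, reconstructed by hand from the definitions of `f` and `g` (the original post's intermediate listings were lost in extraction), look roughly like this:

```haskell
-- The list definition again, so the hand-worked trace can be checked.
f :: [Int]
f = map g [0..]

g :: Int -> Int
g 0 = 0
g n = max n $ f !! (n `div` 2) + f !! (n `div` 3) + f !! (n `div` 4)

-- Demands made while computing f !! 12:
--   f !! 12 = g 12 = max 12 (f!!6 + f!!4 + f!!3)
--     f !! 6 = g 6 = max 6 (f!!3 + f!!2 + f!!1)
--       f !! 3 = g 3 = max 3 (f!!1 + f!!1 + f!!0)
--         f !! 1 = g 1 = max 1 (f!!0 + f!!0 + f!!0) = 1
--         f !! 0 = g 0 = 0
--       so f !! 3 = max 3 (1 + 1 + 0) = 3
--       f !! 2 = g 2 = max 2 (f!!1 + f!!0 + f!!0) = 2
--     so f !! 6 = max 6 (3 + 2 + 1) = 6
--     f !! 4 = g 4 = max 4 (f!!2 + f!!1 + f!!1) = 4
--   so f !! 12 = max 12 (6 + 4 + 3) = 13
-- Cells 5, 7, 8, 9, 10 and 11 are forced to exist by the (!!) walk,
-- but remain unevaluated thunks (g 5, g 7, ...).
```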
So the calculation is done fairly lazily. The program knows that some value for `f !! 8` exists, and that it's equal to `g 8`, but it has no idea what `g 8` is.

Yet another addendum to Edward Kmett's answer: a self-contained example:
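The self-contained example was lost in extraction; a plausible reconstruction using the same infinite-tree technique (the name `memoNat` is an assumption):

```haskell
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
  fmap g (Tree l m r) = Tree (fmap g l) (g m) (fmap g r)

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, 1) -> index r q

nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree (go (n + s) s') n (go (n + s + s) s')
      where
        s' = 2 * s

-- Memoize a function over the non-negative integers.
memoNat :: (Int -> a) -> (Int -> a)
memoNat f = index (fmap f nats)
```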
Use it as follows to memoize a function with a single integer arg (e.g. fibonacci):
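For example (the `memoNat` machinery is repeated so the block stands alone; note that `fib'` recurses through the memoized `fib`, which is what makes the caching effective):

```haskell
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
  fmap g (Tree l m r) = Tree (fmap g l) (g m) (fmap g r)

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, 1) -> index r q

nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree (go (n + s) s') n (go (n + s + s) s')
      where
        s' = 2 * s

memoNat :: (Int -> a) -> (Int -> a)
memoNat f = index (fmap f nats)

-- Memoized fibonacci: the worker fib' calls back into the memoized fib.
fib :: Int -> Integer
fib = memoNat fib'
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n - 1) + fib (n - 2)
```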
Only values for non-negative arguments will be cached.
To also cache values for negative arguments, use `memoInt`, defined as follows:
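The original definition was lost; one plausible sketch maps negative arguments onto odd naturals and non-negative ones onto evens, then reuses the non-negative memoizer (machinery repeated so the block stands alone):

```haskell
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
  fmap g (Tree l m r) = Tree (fmap g l) (g m) (fmap g r)

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, 1) -> index r q

nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree (go (n + s) s') n (go (n + s + s) s')
      where
        s' = 2 * s

memoNat :: (Int -> a) -> (Int -> a)
memoNat f = index (fmap f nats)

-- Interleave the integers onto the naturals:
--   0, -1, 1, -2, 2, ...  ->  0, 1, 2, 3, 4, ...
memoInt :: (Int -> a) -> (Int -> a)
memoInt f = memoNat (f . fromNat) . toNat
  where
    toNat n
      | n >= 0    = 2 * n
      | otherwise = -2 * n - 1
    fromNat m = case m `divMod` 2 of
      (q, 0) -> q
      (q, _) -> -q - 1
```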
To cache values for functions with two integer arguments, use `memoIntInt`, defined as follows:
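Again the original definition was lost; a plausible sketch simply memoizes one argument at a time (the `memoInt` machinery is repeated so the block stands alone):

```haskell
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
  fmap g (Tree l m r) = Tree (fmap g l) (g m) (fmap g r)

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, 1) -> index r q

nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree (go (n + s) s') n (go (n + s + s) s')
      where
        s' = 2 * s

memoNat :: (Int -> a) -> (Int -> a)
memoNat f = index (fmap f nats)

memoInt :: (Int -> a) -> (Int -> a)
memoInt f = memoNat (f . fromNat) . toNat
  where
    toNat n
      | n >= 0    = 2 * n
      | otherwise = -2 * n - 1
    fromNat m = case m `divMod` 2 of
      (q, 0) -> q
      (q, _) -> -q - 1

-- Memoize each argument in turn: the outer tree caches, per first
-- argument, an inner memoized function of the second argument.
memoIntInt :: (Int -> Int -> a) -> (Int -> Int -> a)
memoIntInt f = memoInt (\x -> memoInt (f x))
```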
A solution without indexing, and not based on Edward Kmett's.
I factor out common subtrees to a common parent (`f(n/4)` is shared between `f(n/2)` and `f(n)` itself, and `f(n/6)` is shared between `f(n/2)` and `f(n/3)`). By saving them as a single variable in the parent, the calculation of the subtree is done once.

The code doesn't easily extend to a general memoization function (at least, I wouldn't know how to do it), and you really have to think out how subproblems overlap, but the strategy should work for general multiple non-integer parameters. (I thought it up for two string parameters.)
The memo is discarded after each calculation. (Again, I was thinking about two string parameters.)
I don't know if this is more efficient than the other answers. Each lookup is technically only one or two steps ("Look at your child or your child's child"), but there might be a lot of extra memory use.
Edit: This solution isn't correct yet. The sharing is incomplete.

Edit: It should be sharing subchildren properly now, but I realized that this problem has a lot of nontrivial sharing: `n/2/2/2` and `n/3/3` might be the same. The problem is not a good fit for my strategy.

This is an addendum to Edward Kmett's excellent answer.
When I tried his code, the definitions of `nats` and `index` seemed pretty mysterious, so I wrote an alternative version that I found easier to understand.

I define `index` and `nats` in terms of `index'` and `nats'`. `index' t n` is defined over the range `[1..]`. (Recall that `index t` is defined over the range `[0..]`.) It searches the tree by treating `n` as a string of bits, and reading through the bits in reverse. If the bit is `1`, it takes the right-hand branch; if the bit is `0`, it takes the left-hand branch. It stops when it reaches the last bit (which must be a `1`).

Just as `nats` is defined for `index` so that `index nats n == n` is always true, `nats'` is defined for `index'`.

Now, `nats` and `index` are simply `nats'` and `index'` but with the values shifted by 1:
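A reconstruction consistent with the description above (bit-reversed indexing; the `Tree` type and the signatures of Kmett's `index`/`nats` are assumed):

```haskell
data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
  fmap g (Tree l m r) = Tree (fmap g l) (g m) (fmap g r)

-- index' is defined over [1..]: read n's bits from the least
-- significant upward; 0 goes left, 1 goes right, and we stop at the
-- leading 1 bit.
index' :: Tree a -> Int -> a
index' (Tree _ m _) 1 = m
index' (Tree l _ r) n = case n `divMod` 2 of
  (q, 0) -> index' l q
  (q, 1) -> index' r q

-- nats' is built so that index' nats' n == n for every n >= 1:
-- going left doubles the value, going right doubles and adds one.
nats' :: Tree Int
nats' = Tree (fmap (* 2) nats') 1 (fmap (\x -> 2 * x + 1) nats')

-- Shift by one to recover the [0..]-based versions.
index :: Tree a -> Int -> a
index t n = index' t (n + 1)

nats :: Tree Int
nats = fmap (subtract 1) nats'
```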