I used to ask a similar question once. Now I'll be more specific. The purpose is to learn a Haskell idiom to write iterative algorithms with monadic results. In particular, this might be useful for implementing all kinds of randomized algorithms, such as genetic algorithms and a like.
I wrote an example program that manifests my problem with such algorithms in Haskell. Its complete source is on hpaste.
The key point is to update an element randomly (thus the result is in State StdGen
or some other monad):
type RMonad = State StdGen
-- An example of random iteration step: one-dimensional random walk.
randStep :: (Num a) => a -> RMonad a
randStep x = do
rnd <- get
let (goRight,rnd') = random rnd :: (Bool, StdGen)
put rnd'
if goRight
then return (x+1)
else return (x-1)
And then one needs to update many elements, and repeat the process many, many times. And here is a problem. As every step is a monad action (:: a -> m a
), repeated many times, it's important to compose such actions effectively (forgetting the previous step quickly). From what I learned from my previous quesion (Composing monad actions with folds), seq
and deepseq
help a lot to compose monadic actions. So I do:
-- Strict (?) iteration.
iterateM' :: (NFData a, Monad m) => Int -> (a -> m a) -> a -> m a
iterateM' 0 _ x = return $!! x
iterateM' n f x = (f $!! x) >>= iterateM' (n-1) f
-- Deeply stict function application.
($!!) :: (NFData a) => (a -> b) -> a -> b
f $!! x = x `deepseq` f x
It is certainly better than lazy composition. Unfortunately, it is not enough.
-- main seems to run in O(size*iters^2) time...
main :: IO ()
main = do
(size:iters:_) <- liftM (map read) getArgs
let start = take size $ repeat 0
rnd <- getStdGen
let end = flip evalState rnd $ iterateM' iters (mapM randStep) start
putStr . unlines $ histogram "%.2g" end 13
When I measured time required to finish this program, it appears, that it is similar to O(N^2) with respect to the number of iterations (memory allocation seems to be acceptable). This profile should be flat and constant for linear asymptotics:
quadratic time per update http://i29.tinypic.com/i59blv.png
And this is how a heap profile looks:
heap profile with -hc http://i30.tinypic.com/124a8fc.png
I assume that such a program should run with very modest memory requirements, and it should take time proportional to the number of iterations. How can I achieve that in Haskell?
The complete runnable source of the example is here.
Some things to consider:
For raw all-out performance, write a custom State monad, like so:
Like so: http://hpaste.org/fastcgi/hpaste.fcgi/view?id=27414#a27414
Which runs in constant space (modulo the
[Double]
you build up), and is some 8x faster than your original.The use of a specialized state monad with local defintion outperforms the Control.Monad.Strict significantly as well.
Here's what the heap looks like, with the same paramters as you:
Note that it is about 10x faster, and uses 1/5th the space. The big red thing is your list of doubles being allocated.
Inspired by your question, I captured the PureMT pattern in a new package: monad-mersenne-random, and now your program becomes this:
The other change I made was to worker/wrapper transform iterateM, enabling it to be inlined:
Overall, this brings your code from, with K=500, N=30k
So that is, 220x faster.
The heap is a bit better too, now that iterateM unboxes.
This is probably a small point compared to the other answers, but is your ($!!) function correct?
You define
This will fully evaluate the argument, however the function result won't necessarily be evaluated at all. If you want the
$!!
operator to apply the function and fully evaluate the result, I think it should be:Importing Control.Monad.State.Strict instead of Control.Monad.State yields a significant performance improvement. Not sure what you're looking for in terms of asymptotics, but this might get you there.
Additionally, you get a performance increase by swapping the iterateM and the mapM so that you don't keep traversing the list, you don't have to hold on to the head of the list, and you don't need to deepseq through the list, but just force the individual results. I.e.:
If you do so, then you can change iterateM to be much more idiomatic as well:
This of course requires the bang patterns language extension.