As I understand it, Haskell has green threads. But how lightweight are they? Is it possible to create 1 million threads?
Or how long would it take to create 100,000 threads?
From here:
    import Control.Concurrent
    import Control.Monad

    -- Number of threads to spawn.
    n = 100000

    main = do
        left  <- newEmptyMVar
        -- Chain n threads together: each one gets a fresh MVar to wait on.
        right <- foldM make left [0..n-1]
        putMVar right 0    -- bang!
        x <- takeMVar left -- wait for completion
        print x
      where
        make l n = do
            r <- newEmptyMVar
            forkIO (thread n l r)
            return r

    -- Take a value from the right, add one, and pass it to the left.
    thread :: Int -> MVar Int -> MVar Int -> IO ()
    thread _ l r = do
        v <- takeMVar r
        putMVar l $! v+1
On my not-quite-2.5 GHz laptop this takes less than a second.
Set n to 1000000 and it becomes hard to write the rest of this post because the OS is paging like crazy. It was definitely using more than a gigabyte of RAM (I didn't let it finish). If you have enough RAM, it should work in roughly 10x the time of the 100000 version.
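If you want a number for your own machine rather than my laptop, here is a minimal sketch that times the same chain. It assumes a reasonably recent GHC, since `GHC.Clock.getMonotonicTime` needs base 4.11 or later; the names `runChain` and `link` are mine and not from the original program.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (foldM)
import GHC.Clock (getMonotonicTime) -- assumption: base >= 4.11

-- Build a chain of n threads. Each thread waits on its right-hand MVar,
-- adds one, and passes the result to its left-hand MVar.
-- (runChain and link are hypothetical names used for this sketch.)
runChain :: Int -> IO Int
runChain n = do
    left  <- newEmptyMVar
    right <- foldM link left [1 .. n]
    putMVar right 0   -- start the chain
    takeMVar left     -- wait for the value to travel through every thread
  where
    link l _ = do
        r <- newEmptyMVar
        _ <- forkIO (takeMVar r >>= \v -> putMVar l $! v + 1)
        return r

main :: IO ()
main = do
    start  <- getMonotonicTime
    result <- runChain 100000
    end    <- getMonotonicTime
    putStrLn ("threads: " ++ show result)
    putStrLn ("elapsed seconds: " ++ show (end - start))
```

This measures wall-clock time for creating the threads and passing one value through all of them, so it includes scheduling and not only the cost of `forkIO`.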
Well, according to here, the default stack size is 1k, so I suppose in theory it would be possible to create 1,000,000 threads: the stacks would take up around 1 GB of memory.
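The arithmetic behind that estimate, as a small sketch. It only counts one initial stack per thread, so treat it as a lower bound: per-thread bookkeeping, the MVars, and any stack growth are ignored. `stackBytes` is a hypothetical helper, not a library function.

```haskell
-- Rough lower bound on stack memory: one initial stack per thread.
-- (stackBytes is a hypothetical helper for this estimate.)
stackBytes :: Integer -> Integer -> Integer
stackBytes threads bytesPerStack = threads * bytesPerStack

main :: IO ()
main = do
    print (stackBytes 1000000 1024) -- 1k stacks:       1024000000 bytes
    print (stackBytes 1000000 512)  -- 512-byte stacks:  512000000 bytes
```

So a million 1k stacks come to just over 10^9 bytes, which matches the "around 1 GB" figure.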
Using the benchmark here: http://www.reddit.com/r/programming/comments/a4n7s/stackless_python_outperforms_googles_go/c0ftumi
You can improve the performance on a per-benchmark basis by shrinking the thread stack size to one that fits the benchmark. E.g. 1M threads, with a 512-byte stack per thread, takes 2.7s:
$ time ./A +RTS -s -k0.5k
For this synthetic test case, bringing in hardware threads (compiling with -threaded) results in significant overhead. Working with green threads only looks like the preferred option. Note that spawning green threads in Haskell is indeed cheap. I re-ran the above program with n = 1,000,000 on a MacBook Pro (i7, 8 GB of RAM), using:
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.6.3
Compiled with -threaded and -rtsopts:
$ time ./thr
1000000
real 0m5.974s
user 0m3.748s
sys 0m2.406s
Reducing the stack helps a bit:
$ time ./thr +RTS -k0.5k
1000000
real 0m4.804s
user 0m3.090s
sys 0m1.923s
Then, compiled without -threaded:
$ time ./thr
1000000
real 0m2.861s
user 0m2.283s
sys 0m0.572s
And finally, without -threaded and with reduced stack:
$ time ./thr +RTS -k0.5k
1000000
real 0m2.606s
user 0m2.198s
sys 0m0.404s
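When comparing runs like these, it is easy to lose track of which binary was built with -threaded. A minimal sketch of a check you could add to the program: `rtsSupportsBoundThreads` and `getNumCapabilities` come from `Control.Concurrent`, while `describeRts` is a hypothetical helper of mine.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)

-- Describe the runtime the program was linked against.
-- (describeRts is a hypothetical helper for this sketch.)
describeRts :: Bool -> Int -> String
describeRts threaded caps
    | threaded  = "threaded RTS, " ++ show caps ++ " capability(ies)"
    | otherwise = "non-threaded RTS, " ++ show caps ++ " capability(ies)"

main :: IO ()
main = do
    -- Number of capabilities, i.e. how many Haskell threads can run at once.
    caps <- getNumCapabilities
    -- rtsSupportsBoundThreads is True when the program is linked with -threaded.
    putStrLn (describeRts rtsSupportsBoundThreads caps)
```

Printing this once at startup makes it clear which of the four configurations above a given timing belongs to.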