I'm looking to map a moderately expensive function onto a large lazy seq in parallel. pmap is great, but I'm losing too much to context switching. I think I need to increase the size of the chunk of work that's passed to each thread.
I wrote a function to break the seq into chunks, pmap the function onto each chunk, and recombine them. This 'works', but the results have not been spectacular. The original code essentially looks like this:
(pmap eval-polynomial (range x) coificients)
How can I really squeeze this while keeping it lazy?
I'd look at the ppmap function from: http://www.braveclojure.com/zombie-metaphysics/. It lets you pmap while specifying the chunk size.
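To make the idea concrete, here is a minimal sketch of a chunked pmap along those lines. It is not necessarily the exact code from that page: the name `chunked-pmap`, the Horner-style `eval-polynomial` and the sample `coefficients` are stand-ins I made up for illustration. Each worker gets a whole chunk of `grain-size` items and realizes it eagerly, so the per-task overhead is paid once per chunk rather than once per element.

```clojure
(defn chunked-pmap
  "Like pmap, but each parallel task processes a chunk of grain-size
  items. The result is still a lazy seq of individual results, in order."
  [grain-size f & colls]
  (apply concat
         (apply pmap
                ;; doall forces the chunk inside the worker thread;
                ;; otherwise the lazy map would be realized by the consumer
                (fn [& chunks] (doall (apply map f chunks)))
                (map (partial partition-all grain-size) colls))))

;; Hypothetical stand-in for the asker's function (Horner's rule).
(defn eval-polynomial [x coefficients]
  (reduce (fn [acc c] (+ (* acc x) c)) 0 coefficients))

(def coefficients [1 2 3]) ; x^2 + 2x + 3

(take 5 (chunked-pmap 512
                      #(eval-polynomial % coefficients)
                      (range 100000)))
;; => (3 6 11 18 27)
```

The grain size of 512 is a guess; it is worth timing a few values, since the best one depends on how expensive each call is. Like pmap itself, this only stays a few chunks ahead of the consumer, so it remains (semi-)lazy.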
I would look at the Fork/Join library, set to be integrated into JDK 7. It's a lightweight threading model optimized for nonblocking, divide-and-conquer computations over a dataset, using a thread pool, a work-stealing scheduler and lightweight tasks (rather than one OS thread per unit of work).
Some work has been done to wrap the Fork/Join API in the par branch, but it hasn't been merged into main (yet).
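If you'd rather not depend on that branch, one way to get Fork/Join from Clojure is `clojure.core.reducers`, assuming Clojure 1.5 or later is available to you. The sketch below uses `r/fold`, which splits a vector into partitions (512 elements by default) and processes them on a Fork/Join pool. The `eval-polynomial` function and `coefficients` here are made-up stand-ins for the asker's code. Note the trade-off: `fold` is eager and only runs in parallel over foldable collections such as vectors, so this gives up the laziness the question asks for.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical stand-in for the asker's function (Horner's rule).
(defn eval-polynomial [x coefficients]
  (reduce (fn [acc c] (+ (* acc x) c)) 0 coefficients))

(def coefficients [1 2 3]) ; x^2 + 2x + 3

;; fold needs a foldable collection, so realize the input into a vector.
(def xs (vec (range 100000)))

;; Parallel reduction: sum of the polynomial over all xs.
(def total
  (r/fold + (r/map #(eval-polynomial % coefficients) xs)))

;; Parallel map that keeps every result, in the original order.
(def results
  (into [] (r/foldcat (r/map #(eval-polynomial % coefficients) xs))))
```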
If you don't mind something slightly exotic (in exchange for some really noticeable speedup), you might also want to look into the work done by the author of the Penumbra library, which provides easy access to the GPU.
How about using the partition function to break up your range sequence? There was an interesting post on a similar problem at http://www.fatvat.co.uk/2009/05/jvisualvm-and-clojure.html
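One thing to watch for with that approach: `partition` silently drops a trailing chunk that is shorter than the requested size, so `partition-all` is usually the safer choice here. A rough sketch for the single-sequence case follows; `pmap-chunks` and `slow-square` are names I made up for illustration, not anything from the linked post.

```clojure
;; partition drops the incomplete final chunk; partition-all keeps it.
(partition 4 (range 10))     ;; => ((0 1 2 3) (4 5 6 7))
(partition-all 4 (range 10)) ;; => ((0 1 2 3) (4 5 6 7) (8 9))

;; Hypothetical stand-in for a moderately expensive function.
(defn slow-square [x]
  (Thread/sleep 1)
  (* x x))

(defn pmap-chunks
  "Applies f to every element of coll, running one parallel task per
  chunk of n elements. Returns a lazy seq of results in order."
  [n f coll]
  (mapcat identity
          ;; mapv is eager, so the work happens inside the pmap worker
          (pmap #(mapv f %) (partition-all n coll))))

(pmap-chunks 4 slow-square (range 10))
;; => (0 1 4 9 16 25 36 49 64 81)
```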