One one hand, in Haskell Vector a
seems to be the preferred type to use as an array of numbers. There is even an (incomplete) Vector Tutorial.
On the other hand, Control.Parallel.Strategies
are defined mostly in terms of Traversable
. Vector library doesn't provide these instances.
The minimal complete definition of Traversable t
should also define Foldable
and
traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
sequenceA :: Applicative f => t (f a) -> f (t a)
I don't see how sequenceA
can be defined for Data.Vector.Unboxed.Vector
. So, what is the best approach to writing parallel code with unboxed vectors? Defining some new ad hoc strategies like evalVector
or using par
and pseq
explicitly or using plain Data.Array
instead of vectors?
P.S. Plain Array
s are parallelizable without problems: https://gist.github.com/701888
It's a hack job for parVector
but this worked for me:
import qualified Data.Vector as V
import Control.Parallel.Strategies
import Control.Parallel
import Control.DeepSeq
ack :: Int -> Int -> Int
ack 0 n = n+1
ack m 0 = ack (m-1) 1
ack m n = ack (m-1) (ack m (n-1))
main = do
let vec = V.enumFromN 1 1000
let res = (V.map (ack 2) vec) `using` parVector
print res
parVector :: NFData a => Strategy (V.Vector a)
parVector vec = eval vec `seq` Done vec
where
chunkSize = 1
eval v
| vLen == 0 = ()
| vLen <= chunkSize = rnf (v V.! 0) -- FIX this to handle chunks > 1
| otherwise = eval (V.take half v) `par` eval (V.drop half v)
where vLen = V.length v
half = vLen `div` 2
And running this code:
[tommd@Mavlo Test]$ ghc --make -O2 -threaded t.hs
... dumb warning ...
[tommd@Mavlo Test]$ time ./t +RTS -N1 >/dev/null
real 0m1.962s user 0m1.951s sys 0m0.009s
[tommd@Mavlo Test]$ time ./t +RTS -N2 >/dev/null
real 0m1.119s user 0m2.221s sys 0m0.005s
When I run the code with Integer
instead of Int
in the type signature:
[tommd@Mavlo Test]$ time ./t +RTS -N2 >/dev/null
real 0m4.754s
user 0m9.435s
sys 0m0.028s
[tommd@Mavlo Test]$ time ./t +RTS -N1 >/dev/null
real 0m9.008s
user 0m8.952s
sys 0m0.029s
Rock!
EDIT: And a solution that is a bit closer to your earlier attempt is cleaner (it doesn't use functions from three separate modules) and works great:
parVector :: NFData a => Strategy (V.Vector a)
parVector vec =
let vLen = V.length vec
half = vLen `div` 2
minChunk = 10
in if vLen > minChunk
then do
let v1 = V.unsafeSlice 0 half vec
v2 = V.unsafeSlice half (vLen - half) vec
parVector v1
parVector v2
return vec
else
evalChunk (vLen-1) >>
return vec
where
evalChunk 0 = rpar (rdeepseq (vec V.! 0)) >> return vec
evalChunk i = rpar (rdeepseq (vec V.! i)) >> evalChunk (i-1)
Things to learn from this solution:
- It uses the
Eval
monad, which is strict so we're sure to spark everything (compared to wrapping things in let
and remembering to use bang patterns).
- Contrary to your proposed implementation it (a) doesn't construct a new vector, which is costly (b)
evalChunk
forces evaluation of each element using rpar
and rdeepseq
(I don't believe rpar vec
forces any of the vector's elements).
- Contrary to my belief,
slice
takes a start index and length, not a start and end index. Oops!
- We still need to import
Control.DeepSeq (NFData)
, but I've e-mailed the libraries list to try and fix that issue.
Performance seems similar to the first parVector
solution in this answer, so I won't post numbers.
1) As you probably know, vector
is a product of the DPH work that has proven harder than the researchers initially expected.
2) Unboxed vectors can't divide up the work for individual elements across multiple CPUs.
3) I'd be a lot more hopeful for boxed vectors. Something like:
using (map (rnf . (vec !)) [0..V.length vec - 1]) (parList rdeepseq)
Or maybe you can avoid constructing the list and using parlist. I think just assigning parts of the array is sufficient. The below code is likely broken, but the concept of making your own parVector
using rnf
and dividing the vector in half until it is a single element (or some tunable chunk size of elements) should work.
parVector :: Strategy (Vector a)
parVector = let !_ = eval vec in Done vec
where
chunkSize = 1
eval v
| vLen == 0 = ()
| vLen <= chunkSize = rnf (v ! 0) -- FIX this to handle chunks > 1
| otherwise = eval (V.take half v) `par` eval (V.drop half v)
where vLen = V.length v
half = vLen `div` 2