Is it possible to get GHC to produce SIMD code for the various SSE generations?
For example, take a program like this:
import Data.Array.Vector
main = print . sumU $ (enumFromToFracU 1 10000000 :: UArr Double)
Compiling for 64-bit x86, I can see that the generated code uses SSE instructions in scalar mode with both the C and asm backends: addsd rather than addpd. For the kinds of programs I work on, using vector instructions is important for performance. Is there an easy way for a newbie such as myself to get GHC to SIMDize the code using SSE?
Yes, it is possible via the C backend, but it takes trial and error. The flags I use:
Then hope that GCC spots the tight loop GHC generates from the uvector code and realises there is SIMD potential.
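To make "the tight loop" concrete, here is a hand-written sketch of roughly what the fused pipeline from the question should reduce to. It is an illustration under assumptions, not output taken from GHC or uvector: the name sumFromTo is hypothetical, and the exact loop GHC produces will differ in detail.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical stand-in for the loop that fusion is expected to leave
-- behind for `sumU (enumFromToFracU lo hi)`: one strict Double
-- accumulator, one strict Double counter, and no intermediate array.
sumFromTo :: Double -> Double -> Double
sumFromTo lo hi = go 0 lo
  where
    -- Bang patterns keep both arguments evaluated (and unboxable),
    -- so the loop can run entirely in floating-point registers.
    go !acc !x
      | x > hi    = acc
      | otherwise = go (acc + x) (x + 1)

main :: IO ()
main = print (sumFromTo 1 10000000)
```

Each iteration depends on the previous value of the accumulator, so this is a floating-point reduction. Compilers generally vectorise such a loop only when they are allowed to reassociate the additions, which is part of why getting addpd out of it takes experimentation.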