I wrote a small C module to improve performance, but GHC doesn't inline foreign functions, and the cost of the calls cancels out the speedup.
For example, test.h:
int inc (int x);
test.c:
#include "test.h"
int inc(int x) {return x + 1;}
Test.hs:
{-# LANGUAGE ForeignFunctionInterface #-}
module Test (inc) where
import Foreign
import Foreign.C
foreign import ccall unsafe "test.h inc" c_inc :: CInt -> CInt
inc = fromIntegral . c_inc . fromIntegral
{-# INLINE c_inc #-}
{-# INLINE inc #-}
Main.hs:
import System.Environment
import Test
main = do {args <- getArgs; putStrLn . show . inc . read . head $ args }
Making:
$ gcc -O2 -c test.c
$ ghc -O3 test.o Test.hs
$ ghc --make -O3 test.o Main
$ objdump -d Main > Main.as
Finally, in Main.as I have callq <inc> instructions instead of the desired inc instructions.
GHC won't inline C code, whether through its asm backend or its LLVM backend. Typically you only call into C for performance reasons if the thing you are calling really costs a lot. Incrementing an int isn't such a thing, as we already have primops for that.
Now, if you call via C you may get GCC to inline things (check the generated assembly).
However, there are some things you can already do to minimize the call cost:
foreign import ccall unsafe "test.h inc" c_inc :: CInt -> CInt
inc = fromIntegral . c_inc . fromIntegral
Provide a type signature for inc, such as inc :: Int -> Int. Without one, the type defaults to Integer, and you're paying precious cycles converting to and from Integer here.
Mark the call as "unsafe", as you do, so that the runtime is not bookmarked prior to the call.
Measure the FFI call overhead - it should be in the nanoseconds. However, if you find it is still too expensive, you can write a new primop and jump to it directly. But you'd better have your criterion numbers first.
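Another way to cut the per-call cost follows from the point that a call into C should do enough work to be worth it. This is only a sketch, and it assumes the real workload applies inc to many values: the loop moves into C, where GCC can inline inc, and Haskell pays for one foreign call per batch instead of one per element. The names inc_all, xs and n are hypothetical and not part of the original module.

```c
#include <stddef.h>

/* Same operation as inc in test.c. Declaring it static inline lets GCC
   fold it into callers inside this translation unit at -O2. */
static inline int inc(int x) { return x + 1; }

/* Hypothetical batch entry point (not in the original code):
   increments n ints in place, so the per-element work never
   crosses the FFI boundary. */
void inc_all(int *xs, size_t n) {
    for (size_t i = 0; i < n; i++) {
        xs[i] = inc(xs[i]);
    }
}
```

On the Haskell side this would presumably be imported once, along the lines of foreign import ccall unsafe "inc_all" c_inc_all :: Ptr CInt -> CSize -> IO (), and called on a whole buffer. Whether that beats a pure Haskell loop still needs to be measured.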