How can I use and call Haskell functions with higher-order type signatures from C# (DLLImport), like...
double :: (Int -> Int) -> Int -> Int -- higher order function
typeClassFunc :: ... -> Maybe Int -- type classes
data MyData = Foo | Bar -- user data type
dataFunc :: ... -> MyData
What are the corresponding type signatures in C#?
[DllImport ("libHSDLLTest")]
private static extern ??? foo( ??? );
Additionally (because it may be easier): How can I use "unknown" Haskell types within C#, so I can at least pass them around, without C# knowing any specific type? The most important functionality I need right now is to pass around a type class (like Monad or Arrow).
I already know how to compile a Haskell library to DLL and use within C#, but only for first-order functions. I'm also aware of Stackoverflow - Call a Haskell function in .NET, Why isn't GHC available for .NET and hs-dotnet, where I didn't find ANY documentation and samples (for the C# to Haskell direction).
I'll elaborate here on my comment on FUZxxl's post.
The examples you posted are all possible using FFI. Once you export your functions using FFI you can, as you've already figured out, compile the program into a DLL.
.NET was designed with the intention of being able to interface easily with C, C++, COM, etc. This means that once you're able to compile your functions to a DLL, you can call them (relatively) easily from .NET. As I've mentioned before in my other post that you've linked to, keep in mind which calling convention you specify when exporting your functions. The standard in .NET is stdcall, while (most) examples of Haskell FFI export using ccall.
So far the only limitation I've found on what can be exported by FFI is polymorphic types, or types that are not fully applied, e.g. anything other than kind * (you can't export Maybe but you can export Maybe Int, for instance).
I've written a tool, Hs2lib, that would cover and automatically export any of the functions you have in your example. It also has the option of generating unsafe C# code, which makes it pretty much "plug and play". The reason I've chosen unsafe code is that it makes pointers easier to handle, which in turn makes it easier to do the marshalling for data structures.
To be complete I'll detail how the tool handles your examples and how I plan on handling polymorphic types.
When exporting higher-order functions, the function needs to be slightly changed. The higher-order arguments need to become elements of FunPtr. Basically, they're treated as explicit function pointers (or delegates in C#), which is how higher-order functions are typically handled in imperative languages.
Assuming we convert Int into CInt, the type of double is transformed from
(Int -> Int) -> Int -> Int
into
FunPtr (CInt -> CInt) -> CInt -> IO CInt
These types are generated for a wrapper function (doubleA in this case), which is exported instead of double itself. The wrapper function maps between the exported values and the input values the original function expects. The IO is needed because constructing a FunPtr is not a pure operation.
One thing to remember is that the only way to construct or dereference a FunPtr is by statically creating imports which instruct GHC to create stubs for this.
foreign import stdcall "wrapper" mkFunPtr :: (CInt -> CInt) -> IO (FunPtr (CInt -> CInt))
foreign import stdcall "dynamic" dynFunPtr :: FunPtr (CInt -> CInt) -> CInt -> CInt
The "wrapper" import allows us to create a FunPtr and the "dynamic" import allows us to dereference one.
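To make the wrapper idea concrete, here is a minimal sketch of what doubleA could look like. It is not the code Hs2lib generates. The body of double is an assumption (the question only gives its type), demo is a hypothetical helper for trying the round trip, and ccall is used instead of stdcall so the sketch also builds outside Windows. The export line is left as a comment because it only matters when building the DLL.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Assumed definition: the question only states the type of double.
double :: (Int -> Int) -> Int -> Int
double f = f . f

-- Statically declared stubs for creating and calling function pointers.
foreign import ccall "wrapper"
  mkFunPtr :: (CInt -> CInt) -> IO (FunPtr (CInt -> CInt))

foreign import ccall "dynamic"
  dynFunPtr :: FunPtr (CInt -> CInt) -> CInt -> CInt

-- Wrapper that would be exported in place of double.
-- When building the library, add:
--   foreign export ccall doubleA :: FunPtr (CInt -> CInt) -> CInt -> IO CInt
doubleA :: FunPtr (CInt -> CInt) -> CInt -> IO CInt
doubleA fp x = do
  let f :: Int -> Int
      f = fromIntegral . dynFunPtr fp . fromIntegral
  -- Force the result before returning, so the pointer is not
  -- called after the caller has already released it.
  return $! fromIntegral (double f (fromIntegral x))

-- Hypothetical round trip from the Haskell side: build a pointer to (+ 1),
-- pass it through the wrapper, then release it.
demo :: IO CInt
demo = do
  fp <- mkFunPtr (+ 1)
  r  <- doubleA fp 5
  freeHaskellFunPtr fp
  return r
```

Loading this in GHCi and running demo applies (+ 1) twice to 5 through the function pointer. On the C# side the FunPtr argument would be the IntPtr obtained from a delegate.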
In C# we declare the input as an IntPtr and then use the Marshal helper function Marshal.GetDelegateForFunctionPointer to create a delegate that we can call, or the inverse function, Marshal.GetFunctionPointerForDelegate, to create an IntPtr from a delegate.
Also remember that the calling convention of the function being passed as a FunPtr argument must match the calling convention of the function it is being passed to. In other words, passing &foo to bar requires foo and bar to have the same calling convention.
Exporting a user datatype is actually quite straightforward. For every datatype that needs to be exported, a Storable instance has to be created. This instance specifies the marshalling information that GHC needs in order to be able to export/import the type. Among other things you need to define the size and alignment of the type, along with how to read and write its values through a pointer. I partially use Hsc2hs for this task (hence the C macros in the file).
newtypes or datatypes with just one constructor are easy. These become a flat struct, since there's only one possible alternative when constructing/destructing these types. Types with multiple constructors become a union (a struct with the StructLayout attribute set to LayoutKind.Explicit in C#). However, we also need to include an enum to identify which constructor is being used.
In general, the datatype Single defined as
data Single = Single { sint  :: Int
                     , schar :: Char
                     }
creates the following Storable instance
instance Storable Single where
    sizeOf    _ = 8
    alignment _ = #alignment Single_t
    poke ptr (Single a1 a2) = do
        a1x <- toNative a1 :: IO CInt
        (#poke Single_t, sint) ptr a1x
        a2x <- toNative a2 :: IO CWchar
        (#poke Single_t, schar) ptr a2x
    peek ptr = do
        a1' <- (#peek Single_t, sint) ptr :: IO CInt
        a2' <- (#peek Single_t, schar) ptr :: IO CWchar
        x1 <- fromNative a1' :: IO Int
        x2 <- fromNative a2' :: IO Char
        return $ Single x1 x2
and the C struct
typedef struct Single Single_t;
struct Single {
    int sint;
    wchar_t schar;
};
The function foo :: Int -> Single would be exported as foo :: CInt -> Ptr Single.
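For readers who want to try this without Hsc2hs, the same instance can be sketched with hard-coded offsets. This is an illustration under assumptions, not the generated code: it assumes a 4-byte int at offset 0 and a wchar_t of at most 4 bytes at offset 4 (matching the size of 8 above), converts Char through its code point instead of toNative/fromNative, and roundTrip is a hypothetical helper.

```haskell
import Data.Char (chr, ord)
import Foreign.C.Types (CInt, CWchar)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (Storable (..))

data Single = Single { sint :: Int, schar :: Char }
  deriving (Eq, Show)

-- Hand-written layout: assumes sizeof(int) == 4 and sizeof(wchar_t) <= 4.
instance Storable Single where
  sizeOf _    = 8
  alignment _ = 4
  poke ptr (Single a b) = do
    pokeByteOff ptr 0 (fromIntegral a :: CInt)
    -- Characters outside the range of wchar_t would be truncated here.
    pokeByteOff ptr 4 (fromIntegral (ord b) :: CWchar)
  peek ptr = do
    a <- peekByteOff ptr 0 :: IO CInt
    b <- peekByteOff ptr 4 :: IO CWchar
    return (Single (fromIntegral a) (chr (fromIntegral b)))

-- Hypothetical helper: write a value to temporary memory and read it back.
roundTrip :: Single -> IO Single
roundTrip s = alloca $ \p -> poke p s >> peek p
```

Calling roundTrip (Single 42 'x') should give back an equal value. The Hsc2hs macros in the instance above exist precisely so that these offsets do not have to be written by hand.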
A datatype with multiple constructors, such as
data Multi = Demi { mints   :: [Int]
                  , mstring :: String
                  }
           | Semi { semi :: [Single]
                  }
generates the following C code:
enum ListMulti {cMultiDemi, cMultiSemi};
typedef struct Multi Multi_t;
typedef struct Demi Demi_t;
typedef struct Semi Semi_t;
struct Multi {
    enum ListMulti tag;
    union MultiUnion* elt;
};
struct Demi {
    int* mints;
    int mints_Size;
    wchar_t* mstring;
};
struct Semi {
    Single_t** semi;
    int semi_Size;
};
union MultiUnion {
    struct Demi var_Demi;
    struct Semi var_Semi;
};
The Storable instance is relatively straightforward and follows readily from the C struct definition.
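The tag-plus-union idea can be sketched on a smaller type. Shape below is a hypothetical two-constructor type, not Multi, and the layout is a simplifying assumption: the payload is stored inline at offset 8 rather than behind a pointer as in struct Multi, and the offsets assume a 4-byte int and an 8-byte double.

```haskell
import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (Storable (..))

-- Hypothetical type used only for this illustration.
data Shape = Circle Double | Square Int
  deriving (Eq, Show)

-- Assumed layout: int tag at offset 0, union payload at offset 8.
instance Storable Shape where
  sizeOf _    = 16
  alignment _ = 8
  poke ptr (Circle r) = do
    pokeByteOff ptr 0 (0 :: CInt)                 -- tag for Circle
    pokeByteOff ptr 8 (realToFrac r :: CDouble)
  poke ptr (Square n) = do
    pokeByteOff ptr 0 (1 :: CInt)                 -- tag for Square
    pokeByteOff ptr 8 (fromIntegral n :: CInt)
  peek ptr = do
    tag <- peekByteOff ptr 0 :: IO CInt
    -- The tag decides how the payload bytes are interpreted.
    case tag of
      0 -> do
        r <- peekByteOff ptr 8 :: IO CDouble
        return (Circle (realToFrac r))
      1 -> do
        n <- peekByteOff ptr 8 :: IO CInt
        return (Square (fromIntegral n))
      _ -> ioError (userError ("unknown Shape tag: " ++ show tag))

-- Hypothetical helper: write a value to temporary memory and read it back.
roundTripShape :: Shape -> IO Shape
roundTripShape s = alloca $ \p -> poke p s >> peek p
```

A C# struct reading this would use explicit field offsets for the tag and the two overlapping payload fields.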
For the type Maybe Int, my dependency tracer would emit a dependency on both the type Int and Maybe. This means that when generating the Storable instance for Maybe Int the head looks like
instance Storable Int => Storable (Maybe Int) where
That is, as long as there's a Storable instance for the arguments of the application, the type itself can also be exported.
Since Maybe a is defined as having a polymorphic argument, Just a, some type information is lost when creating the structs. The structs would contain a void* argument, which you have to manually convert to the right type. The alternative, creating specialized structs as well (e.g. struct MaybeInt), was too cumbersome in my opinion: the number of specialized structures that could be generated from a normal module can quickly explode this way. (I might add this as a flag later on.)
To ease this loss of information my tool will export any Haddock documentation found for the function as comments in the generated includes. It will also place the original Haskell type signature in the comment. An IDE would then present these as part of its IntelliSense (code completion).
As with all of these examples, I've omitted the code for the .NET side of things. If you're interested in that, you can just view the output of Hs2lib.
There are a few other types that need special treatment, in particular Lists and Tuples.
- Lists need to be passed together with the size of the array to marshal from, since we're interfacing with unmanaged languages where the size of an array is not implicitly known. Conversely, when we return a list, we also need to return its size.
- Tuples are special built-in types. In order to export them, we first have to map them to a "normal" datatype and export that. In the tool this is done for up to 8-tuples.
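The list convention can be sketched as follows. The names sumListA, evensA and demoList are hypothetical and the signatures are only one plausible shape, not what Hs2lib emits: an incoming list arrives as a pointer plus an element count, and an outgoing list is returned as a freshly allocated array with its length written to an out-parameter.

```haskell
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca, free)
import Foreign.Marshal.Array (newArray, peekArray, withArrayLen)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Incoming list: pointer to the first element plus the element count.
sumListA :: Ptr CInt -> CInt -> IO CInt
sumListA arr n = do
  xs <- peekArray (fromIntegral n) arr
  return $! sum xs

-- Outgoing list: the result array is allocated with malloc (via newArray)
-- and its length is written to sizeOut. The caller must free the array.
evensA :: Ptr CInt -> CInt -> Ptr CInt -> IO (Ptr CInt)
evensA arr n sizeOut = do
  xs <- peekArray (fromIntegral n) arr
  let ys = filter even xs
  poke sizeOut (fromIntegral (length ys))
  newArray ys

-- Hypothetical driver that plays the role of the foreign caller.
demoList :: IO (CInt, [CInt])
demoList = withArrayLen [1, 2, 3, 4, 5] $ \len p -> do
  s <- sumListA p (fromIntegral len)
  alloca $ \sizeOut -> do
    q  <- evensA p (fromIntegral len) sizeOut
    m  <- peek sizeOut
    ys <- peekArray (fromIntegral m) q
    free q
    return (s, ys)
```

Who frees the returned array, and with which allocator, has to be agreed on by both sides of the call.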
The problem with polymorphic types, e.g. map :: (a -> b) -> [a] -> [b], is that the sizes of a and b are not known. That is, there's no way to reserve space for the arguments and return value since we don't know what they are. I plan to support this by allowing you to specify possible values for a and b and creating specialized wrapper functions for these types. On the other side, in the imperative language, I would use overloading to present the types you've chosen to the user.
As for classes, Haskell's open world assumption is usually a problem (e.g. an instance can be added at any time). However, at the time of compilation only a statically known list of instances is available. I intend to offer an option that would automatically export as many specialized instances as possible using this list. E.g. export (+) exports a specialized function for all Num instances known at compile time (e.g. Int, Double, etc).
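Since this feature is only planned, the following is just a sketch of what such specialization could amount to if written by hand. The names plusInt and plusDouble are hypothetical. Each function pins (+) to one concrete instance, which gives it a monomorphic type that FFI can export.

```haskell
import Foreign.C.Types (CDouble, CInt)

-- (+) specialized to the Num Int instance.
-- When building the library, add:
--   foreign export ccall plusInt :: CInt -> CInt -> CInt
plusInt :: CInt -> CInt -> CInt
plusInt a b = fromIntegral (fromIntegral a + fromIntegral b :: Int)

-- (+) specialized to the Num Double instance.
-- When building the library, add:
--   foreign export ccall plusDouble :: CDouble -> CDouble -> CDouble
plusDouble :: CDouble -> CDouble -> CDouble
plusDouble a b = realToFrac (realToFrac a + realToFrac b :: Double)
```

On the C# side, the two exports could then be presented as overloads of a single Plus method.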
The tool is also rather trusting. Since I can't really inspect the code for purity, I always trust that the programmer is honest, e.g. that you don't pass a function that has side effects to a function that expects a pure function. Be honest and mark the higher-order argument as being impure to avoid problems.
I hope this helps, and I hope this wasn't too long.
Update: There's a somewhat big gotcha that I've recently discovered. We have to remember that the String type in .NET is immutable. So when the marshaller sends it to our Haskell code, the CWString we get there is a copy of the original. We have to free this. When GC is performed in C# it won't affect the CWString, which is a copy.
The problem, however, is that when we free it in the Haskell code we can't use freeCWString. The pointer was not allocated with the C runtime's (msvcrt.dll) malloc. There are three ways (that I know of) to solve this.
- use char* in your C# code instead of String when calling a Haskell function. You then have the pointer to free when the call returns, or you can initialize the pointer using fixed.
- import CoTaskMemFree in Haskell and free the pointer in Haskell
- use StringBuilder instead of String. I'm not entirely sure about this one, but the idea is that since StringBuilder is implemented as a native pointer, the Marshaller just passes this pointer to your Haskell code (which can also update it btw). When GC is performed after the call returns, the StringBuilder should be freed.