What does it mean when Core Haskell applies types to functions?

Posted 2019-02-18 16:56

Question:

I wrote a custom pretty printer for Core Haskell in order to better study Core's structure. The gist of this pretty printer is that it takes a CoreModule and includes data constructors in the output, which the default Outputable implementation does not seem to do.

Here is the code of the module that I am running the pretty printer on:

module Bar2 where

add :: Int -> Int -> Int
add a b = a + b

add2 a b = a + b

Here is the pretty printer output:

------------------------------- Module Metadata --------------------------------
Module { "main" :: modulePackageId, "Bar2" :: moduleName }
-------------------------------- Type Bindings ---------------------------------
[r0 :-> Identifier ‘add’, rjH :-> Identifier ‘add2’]
-------------------------------- Core Bindings ---------------------------------
NonRec (Id "add2")
       (Lam (TyVar "a")
            (Lam (Id "$dNum")
                 (Lam (Id "a1")
                      (Lam (Id "b")
                           (App (App (App (App (Var (Id "+"))
                                               (Type (TyVar (TyVar "a"))))
                                          (Var (Id "$dNum")))
                                     (Var (Id "a1")))
                                (Var (Id "b")))))))

NonRec (Id "add")
       (Lam (Id "a")
            (Lam (Id "b")
                 (App (App (App (App (Var (Id "+"))
                                     (Type (TyConApp (Int) [])))
                                (Var (Id "$fNumInt")))
                           (Var (Id "a")))
                      (Var (Id "b")))))
--------------------------------- Safe Haskell ---------------------------------
Safe
------------------------------------- End --------------------------------------

What is confusing to me is that in both instances, Core appears to be applying a type variable, or a type constructor to the + function, as well as some $dNum or $fNumInt before taking in the arguments.

For add, the type is given explicitly, while the type of add2 is left to compiler inference. This also seems to affect the number of arguments that the chain of lambdas requires, with add needing 2 and add2 needing 4.

What does this all mean?

Answer 1:

Core is pretty much System F (technically System FC). In System F, type variables are also passed as arguments to functions. In your example, Haskell infers that

add2 :: Num a => a -> a -> a 
add2 a b = a + b

That explains the TyVar "a" argument to add2.

Haskell also has to find a way to dispatch to the right set of Num functions depending on the type of the arguments a and b. It does that by taking a dictionary argument for each type class constraint. That is the Id "$dNum" argument. In the case of add, Haskell already knows which dictionary holds the appropriate (+) function, since it knows the operation is on Int, so nothing needs to be passed in: it is just $fNumInt.

Essentially, what happens under the hood is that for each type class Haskell makes a record type (written schematically as data $d<Class> = ...) whose fields are the functions of the class. Then, for each instance, it makes a value $f<Class><Type> :: $d<Class>. This is explained in more detail here
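The dictionary-passing idea can be imitated by hand in ordinary Haskell. The following is only a minimal sketch: NumDict, numDictInt, numDictDouble, addPoly and addInt are hypothetical names chosen for illustration, not what GHC generates, and only two of the Num methods are modelled.

```haskell
module Main where

-- Hypothetical stand-in for the record GHC builds for the Num class.
-- Only two of the class methods are modelled here.
data NumDict a = NumDict
  { plus  :: a -> a -> a
  , times :: a -> a -> a
  }

-- Plays the role of $fNumInt: the dictionary for the Int instance.
numDictInt :: NumDict Int
numDictInt = NumDict { plus = (+), times = (*) }

-- A second instance dictionary, this time for Double.
numDictDouble :: NumDict Double
numDictDouble = NumDict { plus = (+), times = (*) }

-- Like add2 in Core: the dictionary is an ordinary extra argument
-- supplied by the caller (the $dNum lambda in the question).
addPoly :: NumDict a -> a -> a -> a
addPoly dNum a b = plus dNum a b

-- Like add in Core: the dictionary is known statically, so the
-- function only takes the two Int arguments.
addInt :: Int -> Int -> Int
addInt a b = plus numDictInt a b

main :: IO ()
main = do
  print (addPoly numDictInt 2 3)          -- prints 5
  print (addPoly numDictDouble 1.5 2.25)  -- prints 3.75
  print (addInt 40 2)                     -- prints 42
```

The type argument has no counterpart in this sketch because source Haskell infers it; in Core it appears as the extra Lam (TyVar "a") binder.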

Here is another excellent answer describing Core related things.



Answer 2:

In GHC 8.x you can also play with type arguments in source Haskell, much as in Core, by enabling the TypeApplications extension. Here's an example with some more annotations, based on the posted code.

add :: Int -> Int -> Int
add a b = (+) @Int a b

The (+) @Int specializes the polymorphic (+) operator so that it works on type Int.

In Core, you also see the type class dictionary $fNumInt being passed along.

-- ScopedTypeVariables is needed so that n is in scope in the body.
add2 :: forall n. Num n => n -> n -> n
add2 a b = (+) @n a b

This is basically the same, except that n is not known.

In Core, add2 takes a hidden "type-valued" argument n (confusingly called a in the posted example, i.e. (Lam (TyVar "a") ...)), which is then forwarded to (+) as a type argument. Since the dictionary is now unknown, Core has another hidden argument: the dictionary has to be passed by the caller of add2, which then forwards it to (+). This additional argument is called $dNum (see (Lam (Id "$dNum") ...)).
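The caller's side can be made visible as well. This is a small sketch, assuming a GHC recent enough to support TypeApplications and ScopedTypeVariables: each call to add2 names the type argument explicitly, and GHC then picks the matching Num dictionary on its own.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
module Main where

-- The explicit forall brings n into scope in the body, so it can be
-- forwarded to (+) as a type argument.
add2 :: forall n. Num n => n -> n -> n
add2 a b = (+) @n a b

main :: IO ()
main = do
  -- The caller chooses the type; the dictionary for that type is
  -- passed implicitly by the compiler.
  print (add2 @Int 1 2)       -- prints 3
  print (add2 @Double 1.5 2)  -- prints 3.5
```

In Core, each of these calls would apply add2 first to a type and then to a dictionary before the two ordinary arguments, mirroring the four lambdas in the question's output.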