I wrote a custom pretty printer for Core Haskell in order to better study Core's structure. The gist of this pretty printer is that it takes a CoreModule and includes data constructors in the output, which the default Outputable implementation does not seem to do.
Here is the code of the module that I am running the pretty printer on:
module Bar2 where
add :: Int -> Int -> Int
add a b = a + b
add2 a b = a + b
Here is the pretty printer output:
------------------------------- Module Metadata --------------------------------
Module { "main" :: modulePackageId, "Bar2" :: moduleName }
-------------------------------- Type Bindings ---------------------------------
[r0 :-> Identifier ‘add’, rjH :-> Identifier ‘add2’]
-------------------------------- Core Bindings ---------------------------------
NonRec (Id "add2")
(Lam (TyVar "a")
(Lam (Id "$dNum")
(Lam (Id "a1")
(Lam (Id "b")
(App (App (App (App (Var (Id "+"))
(Type (TyVar (TyVar "a"))))
(Var (Id "$dNum")))
(Var (Id "a1")))
(Var (Id "b")))))))
NonRec (Id "add")
(Lam (Id "a")
(Lam (Id "b")
(App (App (App (App (Var (Id "+"))
(Type (TyConApp (Int) [])))
(Var (Id "$fNumInt")))
(Var (Id "a")))
(Var (Id "b")))))
--------------------------------- Safe Haskell ---------------------------------
Safe
------------------------------------- End --------------------------------------
What is confusing to me is that in both instances, Core appears to be applying a type variable, or a type constructor, to the + function, as well as some $dNum or $fNumInt, before taking in the arguments.
For the add function, the type is also explicitly given, while add2 is left up to compiler inference. This also seems to affect the number of arguments that the chain of lambda functions requires for evaluation, with add needing 2 while add2 requires 4.
What does this all mean?
In GHC 8.x you can play with type arguments in Haskell as well, similarly to Core. Here's an example with some more annotations, based on the posted code.
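The annotated example itself is not reproduced here, so the following is only a minimal sketch of what such annotations could look like. It assumes the TypeApplications and ScopedTypeVariables extensions (GHC 8.0 or later); the type variable name n is taken from the discussion below, and the signature given to add2 is the one GHC would infer, written out with an explicit forall.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications    #-}

-- Monomorphic version: the type argument of (+) is fixed to Int,
-- so GHC can select the Num Int dictionary ($fNumInt in Core) itself.
add :: Int -> Int -> Int
add a b = (+) @Int a b

-- Polymorphic version: the explicit forall brings n into scope
-- (via ScopedTypeVariables) so it can be forwarded to (+) as a
-- type argument. The Num n constraint corresponds to the hidden
-- dictionary argument ($dNum in Core).
add2 :: forall n. Num n => n -> n -> n
add2 a b = (+) @n a b
```

With these definitions, a caller can also choose the type argument explicitly, for instance add2 @Int 1 2 or add2 @Double 1.5 2, which mirrors the extra type argument that Core passes around.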
The (+) @Int specializes the polymorphic (+) operator so that it works on type Int. In Core, you also see the typeclass dictionary $fNumInt being passed around.
add2 is basically the same, except that the type (called n here) is not known. In Core, add2 takes a hidden "type-valued" argument n (confusingly called a in the posted example, i.e. (Lam (TyVar "a") ...), which is then forwarded to (+) as a type argument. Since the dictionary is now unknown, in Core there is another hidden argument: the dictionary has to be passed by the caller of add2, which then forwards it to (+). This additional argument is called $dNum (see (Lam (Id "$dNum") ...).
Core is pretty much System F (technically System FC). In System F, type variables also need to be arguments to the function. In your example, Haskell infers that add2 :: Num a => a -> a -> a, which is short for forall a. Num a => a -> a -> a.
That explains the TyVar "a" argument to add2.
Also, Haskell has to find a way to dispatch to the 'right' set of Num functions depending on the type of the arguments a and b. It does that by having a dictionary argument for each type class constraint. That's the Id "$dNum" argument. In the case of add, Haskell already knows in which dictionary the appropriate (+) function can be found, since it knows the operation is on Int (so it doesn't need to be passed in: it's just $fNumInt).
Essentially, what happens under the hood is that for each typeclass Haskell makes a record data $d<Class> = ... with fields that are the functions inside the typeclass. Then, for each instance, it makes another value $f<Class><Type> :: $d<Class>. This is explained in more detail here.
Here is another excellent answer describing Core related things.
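The dictionary-passing idea described above can be imitated in ordinary Haskell. This is only a sketch: NumDict, numDictInt, addPoly and addInt are hypothetical names invented for illustration, the record has far fewer fields than the real Num class, and the names and representation that GHC actually generates differ.

```haskell
-- Hypothetical stand-in for the dictionary of a Num-like class:
-- one field per class method (only two methods here, for brevity).
data NumDict a = NumDict
  { plus  :: a -> a -> a
  , times :: a -> a -> a
  }

-- Hypothetical stand-in for an instance dictionary such as $fNumInt.
numDictInt :: NumDict Int
numDictInt = NumDict { plus = (+), times = (*) }

-- Like add2 in Core: the caller passes the dictionary explicitly,
-- and the function uses it to look up the right "plus".
addPoly :: NumDict a -> a -> a -> a
addPoly dict x y = plus dict x y

-- Like add in Core: the type is known to be Int, so the dictionary
-- is a known constant and does not need to be a parameter.
addInt :: Int -> Int -> Int
addInt x y = plus numDictInt x y
```

Here addPoly numDictInt 2 3 and addInt 2 3 compute the same sum. The only difference is who supplies the dictionary, which is the difference between add2 and add in the Core output. The type argument stays implicit in this sketch, since source Haskell only exposes it through type applications.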