Has anyone an idea how the type inference problem
E > hd (cons 1 nil) : α0
with the typing environment
E={
hd : list(α1 ) → α1 ,
cons : α2 → list(α2 ) → list(α2 ),
nil : list(α3 ),
1 : int
}
can be transferred in an unification problem?
Any help would really be appreciated!
First, rename type variables so that none of the variables in your expression collide with variables in the typing environment. (In your example, this is already done since the expression references { a0 }, and the typing environment references { a1, a2, a3 }.
Second, using new type variables, make a type variable for every subexpression within your expression, producing something like:
nil : a4
1 : a5
cons : a6
(cons 1 nil) : a7
hd : a8
hd (cons 1 nil) : a0 // already had a type variable
Third, generate a set of equations among type variables which must hold. You can do this from the bottom up, from the top down, or in other ways. See Heeren, Bastiaan. Top Quality Type Error Messages. 2005. for extensive details on why you might want to choose one way or another. This will result in something like:
a4 = list(a3) // = E(nil)
a5 = int // = E(1)
a6 = a2 -> list(a2) -> list(a2) // = E(cons)
// now it gets tricky, since we need to deal with application for the first time
a5 = a2 // first actual param of cons matches first formal param of cons
a4 = list(a2) // second actual param of cons matches type of second formal param of cons
a7 = list(a2) // result of (cons 1 nil) matches result type of cons with 2 params
a8 = list(a1) -> a1 // = E(hd)
// now the application of hd
a7 = list(a1) // first actual param of hd matches type of first formal param of hd
a0 = a1 // result of hd (cons 1 nil) matches result type of hd with 1 param
Note carefully that if the same function was used from the type environment twice, we would need more new type variables (in the second step, above) to unify with, so that we wouldn't accidentally make all calls to cons use the same type. I'm not sure how to explain this part more clearly, sorry. Here we are in the easy case since hd and cons are each only used once.
Fourth, unify these equations, resulting in (if I haven't made a mistake) something like:
a4 = list(int)
a5 = int
a6 = int -> list(int) -> list(int)
a7 = list(int)
a8 = list(int) -> int
a0 = int
Rejoice, you now know the type of every subexpression in your original expression.
(Fair warning, I'm somewhat of an amateur myself in these matters, and I may well have made a typographical error or inadvertently cheated somewhere here.)