Are there any good resources on real-world uses of Generalized Algebraic Data Types (GADTs)?
The example given in the Haskell Wikibook is too short to give me insight into the real possibilities of GADTs.
Thanks
GADTs can give you stronger type-enforced guarantees than regular ADTs. For example, you can force a binary tree to be balanced at the type-system level, as in this implementation of 2-3 trees:
Each node has a type-encoded depth where all its leaves reside. A tree is then either an empty tree, a singleton value, or a node of unspecified depth, again using GADTs.
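The description above can be sketched roughly as follows. This is a minimal reconstruction and not necessarily the answer's original listing: the constructor names (Leaf2, Node3, Branch and so on), the toList helper and the example value are assumptions made for illustration.

```haskell
{-# LANGUAGE GADTs #-}

-- Type-level depth markers; they have no values and are used only as indices.
data Zero
data Succ d

-- A node of depth d: every leaf below it sits at the same depth.
data Node d a where
  Leaf2 :: a -> Node Zero a
  Leaf3 :: a -> a -> Node Zero a
  Node2 :: Node d a -> a -> Node d a -> Node (Succ d) a
  Node3 :: Node d a -> a -> Node d a -> a -> Node d a -> Node (Succ d) a

-- A whole tree: empty, a single value, or a node whose depth is hidden.
data Tree a where
  Empty     :: Tree a
  Singleton :: a -> Tree a
  Branch    :: Node d a -> Tree a

-- In-order traversal of a node (hypothetical helper).
nodeToList :: Node d a -> [a]
nodeToList (Leaf2 x)         = [x]
nodeToList (Leaf3 x y)       = [x, y]
nodeToList (Node2 l x r)     = nodeToList l ++ [x] ++ nodeToList r
nodeToList (Node3 l x m y r) =
  nodeToList l ++ [x] ++ nodeToList m ++ [y] ++ nodeToList r

toList :: Tree a -> [a]
toList Empty         = []
toList (Singleton x) = [x]
toList (Branch n)    = nodeToList n

-- Both children have depth Zero, so this type-checks.
example :: Tree Int
example = Branch (Node2 (Leaf2 1) 2 (Leaf3 3 4))

-- The following is rejected at compile time, because the two children
-- have different depths (Zero versus Succ Zero):
--   Branch (Node2 (Leaf2 1) 2 (Node2 (Leaf2 3) 4 (Leaf2 5)))
```

Because Node2 and Node3 require all children to share the same index d, an unbalanced value cannot even be written down.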
The type system guarantees that only balanced nodes can be constructed. This means that when implementing operations like insert on such trees, your code type-checks only if its result is always a balanced tree.
This is a short answer, but consult the Haskell Wikibook. It walks you through a GADT for a well-typed expression tree, which is a fairly canonical example: http://en.wikibooks.org/wiki/Haskell/GADT
GADTs are also used for implementing type equality: http://hackage.haskell.org/package/type-equality. I can't find the right paper to reference for this offhand -- this technique has made its way well into folklore by now. It is used quite well, however, in Oleg's typed tagless stuff. See, e.g. the section on typed compilation into GADTs. http://okmij.org/ftp/tagless-final/#tc-GADT
I like the example in the GHC manual. It's a quick demo of a core GADT idea: that you can embed the type system of a language you're manipulating into Haskell's type system. This lets your Haskell functions assume, and forces them to preserve, that the syntax trees correspond to well-typed programs.
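For reference, the example in the GHC User's Guide is along the following lines. Treat this as a sketch reproduced from memory rather than an exact quotation of the manual.

```haskell
{-# LANGUAGE GADTs #-}

-- The index of Term records the object-language type of each expression.
data Term a where
  Lit    :: Int -> Term Int
  Succ   :: Term Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a
  Pair   :: Term a -> Term b -> Term (a, b)

-- Matching on a constructor refines 'a', so each branch may return
-- a different concrete type without any runtime tags or casts.
eval :: Term a -> a
eval (Lit i)      = i
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)
```

An ill-typed term such as IsZero (IsZero (Lit 0)) does not compile, so eval never has to handle it.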
When we define Term, it doesn't matter what types we choose. We could give the constructors quite different argument and result types, and the definition of Term would still go through.
It's only once we want to compute on Term, such as in defining eval, that the types matter. For a constructor whose argument is evaluated to an Int and whose result is a Bool, we need the type Term Int -> Term Bool, because we need our recursive call to eval to return an Int, and we want in turn to return a Bool.
I have found the "Prompt" monad (from the "MonadPrompt" package) a very useful tool in several places (along with the equivalent "Program" monad from the "operational" package). Combined with GADTs (which is how it was intended to be used), it allows you to make embedded languages very cheaply and very flexibly. There was a pretty good article in the Monad Reader issue 15 called "Adventures in Three Monads" that had a good introduction to the Prompt monad along with some realistic GADTs.
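To give a flavour of the idea without depending on either package, here is a small hand-rolled stand-in. It is not the API of MonadPrompt or operational: the Program type, the step helper, the StackI instruction set and the run interpreter are all hypothetical names invented for this sketch.

```haskell
{-# LANGUAGE GADTs #-}

-- Instruction set of a tiny stack language. The GADT index is the
-- type of the answer each instruction produces.
data StackI a where
  Push :: Int -> StackI ()
  Pop  :: StackI (Maybe Int)

-- A program is a sequence of instructions, each followed by a
-- continuation that consumes the instruction's answer.
data Program instr a where
  Return :: a -> Program instr a
  Then   :: instr b -> (b -> Program instr a) -> Program instr a

instance Functor (Program instr) where
  fmap f (Return a) = Return (f a)
  fmap f (Then i k) = Then i (fmap f . k)

instance Applicative (Program instr) where
  pure = Return
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad (Program instr) where
  Return a >>= f = f a
  Then i k >>= f = Then i (\b -> k b >>= f)

-- Lift a single instruction into a program.
step :: instr a -> Program instr a
step i = Then i Return

-- One possible interpreter: run against a list used as a stack.
run :: Program StackI a -> [Int] -> (a, [Int])
run (Return a)        st       = (a, st)
run (Then (Push n) k) st       = run (k ()) (n : st)
run (Then Pop k)      []       = run (k Nothing) []
run (Then Pop k)      (x : xs) = run (k (Just x)) xs

-- Swap the top two elements when both are present.
swapTop :: Program StackI ()
swapTop = do
  a <- step Pop
  b <- step Pop
  case (a, b) of
    (Just x, Just y) -> step (Push x) >> step (Push y)
    _                -> pure ()
```

The program itself is only data, so the same swapTop could be given a different interpreter (logging, testing, pure simulation) without being rewritten.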
GADTs are weak approximations of inductive families from dependently typed languages—so let's begin there instead.
Inductive families are the core datatype introduction method in a dependently typed language. For instance, in Agda you define the natural numbers like this
which isn't very fancy, it's essentially just the same thing as the Haskell definition
and indeed in GADT syntax the Haskell form is even more similar
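As a sketch of the comparison being made, the definitions look roughly like this. The Agda version is shown only as a comment, and the name NatG (with constructors ZeroG and SuccG) is an assumption used here so that both Haskell forms can live in one module.

```haskell
{-# LANGUAGE GADTs #-}

-- Agda, for comparison:
--   data Nat : Set where
--     zero : Nat
--     succ : Nat -> Nat

-- Ordinary Haskell ADT syntax.
data Nat = Zero | Succ Nat

-- The same type in GADT syntax: each constructor gets a full signature.
data NatG where
  ZeroG :: NatG
  SuccG :: NatG -> NatG

-- Small helpers (hypothetical) to show the two types behave alike.
natToInt :: Nat -> Int
natToInt Zero     = 0
natToInt (Succ n) = 1 + natToInt n

natGToInt :: NatG -> Int
natGToInt ZeroG     = 0
natGToInt (SuccG n) = 1 + natGToInt n
```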
So, at first blush you might think GADTs are just neat extra syntax. That's just the very tip of the iceberg though.
Agda has the capacity to represent all kinds of types that are unfamiliar and strange to a Haskell programmer. A simple one is the type of finite sets. This type is written like Fin 3 and represents the set of numbers {0, 1, 2}. Likewise, Fin 5 represents the set of numbers {0, 1, 2, 3, 4}.
This should be quite bizarre at this point. First, we're referring to a type which has a regular number as a "type" parameter. Second, it's not clear what it means for Fin n to represent the set {0, 1, ..., n-1}. In real Agda we'd do something more powerful, but it suffices to say that we can define a contains function.
Now this is strange again because the "natural" definition of contains would be something like i < n, but n is a value that only exists in the type Fin n and we shouldn't be able to cross that divide so easily. While it turns out that the definition is not nearly so straightforward, this is exactly the power that inductive families have in dependently typed languages: they introduce values that depend on their types and types that depend on their values.
We can examine what it is about Fin that gives it that property by looking at its definition.
This takes a little work to understand, so as an example let's try constructing a value of the type Fin 2. There are a few ways to do this (in fact, we'll find that there are exactly 2).
This lets us see that there are two inhabitants and also demonstrates a little bit of how type computation happens. In particular, the (n : Nat) bit in the type of zerof reflects the actual value n up into the type, allowing us to form Fin (n+1) for any n : Nat. After that we use repeated applications of succf to increment our Fin values up into the correct type family index (the natural number that indexes the Fin).
What provides these abilities? In all honesty there are many differences between a dependently typed inductive family and a regular Haskell ADT, but we can focus on the exact one that is most relevant to understanding GADTs.
In GADTs and inductive families you get an opportunity to specify the exact type of your constructors. This might be boring
Or, if we have a more flexible, indexed type we can choose different, more interesting return types
In particular, we're abusing the ability to modify the return type based on the particular value constructor used. This allows us to reflect some value information up into the type and produce more finely specified (fibered) types.
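A minimal sketch of the two cases, assuming illustrative names (Box, Value, flipV and unwrap are not from the original answer):

```haskell
{-# LANGUAGE GADTs #-}

-- The "boring" case: the result type is always the fully general Box a,
-- so this is an ordinary ADT written in GADT syntax.
data Box a where
  MkBox :: a -> Box a

-- The interesting case: each constructor picks its own return type,
-- so the constructor used is reflected in the index.
data Value a where
  IntV  :: Int  -> Value Int
  BoolV :: Bool -> Value Bool

-- Matching on the constructor tells the type checker what 'a' is.
unwrap :: Value a -> a
unwrap (IntV n)  = n
unwrap (BoolV b) = b

-- Each branch works at a different concrete type.
flipV :: Value a -> Value a
flipV (IntV n)  = IntV (negate n)
flipV (BoolV b) = BoolV (not b)
```

With the plain Box type no such refinement happens: matching on MkBox reveals nothing new about a.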
So what can we do with them? Well, with a little bit of elbow grease we can produce Fin in Haskell. Succinctly it requires that we define a notion of naturals in types... then a GADT to reflect values up into those types...
... then we can use these to build Fin much like we did in Agda...
And finally we can construct exactly two values of Fin (S (S Z))
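The three steps might be sketched as below. The type names Z, S and Fin (S (S Z)) come from the text, while the constructor names Zero, Succ, ZeroF and SuccF (chosen to mirror zerof and succf) and the finToInt helper are assumptions of this sketch.

```haskell
{-# LANGUAGE GADTs #-}

-- Step 1: naturals at the type level. These types have no values.
data Z
data S n

-- Step 2: a GADT that links value-level naturals to the type-level ones.
data Nat n where
  Zero :: Nat Z
  Succ :: Nat n -> Nat (S n)

-- Step 3: Fin, following the shape of the Agda zerof and succf.
data Fin n where
  ZeroF :: Nat n -> Fin (S n)
  SuccF :: Nat n -> Fin n -> Fin (S n)

-- The two inhabitants of Fin (S (S Z)).
first :: Fin (S (S Z))
first = ZeroF (Succ Zero)

second :: Fin (S (S Z))
second = SuccF (Succ Zero) (ZeroF Zero)

-- Forget the index and recover an ordinary number (hypothetical helper).
finToInt :: Fin n -> Int
finToInt (ZeroF _)   = 0
finToInt (SuccF _ i) = 1 + finToInt i
```

Trying to build a third value, for example SuccF (Succ Zero) second, fails to type-check because second has index S (S Z) rather than S Z.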
But notice that we've lost a lot of convenience over the inductive families. For instance, we can't use regular numeric literals in our types (though that's technically just a trick in Agda anyway), we need to create a separate "type nat" and "value nat" and use the GADT to link them together, and we'd also find, in time, that while type-level mathematics is painful in Agda, it can be done. In Haskell it's incredibly painful and often cannot be done.
For instance, it's possible to define a weaken notion for Agda's Fin type, where we provide a very interesting first argument, a proof that n <= m, which allows us to embed "a value less than n" into the set of "values less than m". We can do the same in Haskell, technically, but it requires heavy abuse of type-class Prolog.
So, GADTs resemble inductive families from dependently typed languages, but are weaker and clumsier. Why do we want them in Haskell in the first place?
Basically because not all type invariants require the full power of inductive families to express and GADTs pick a particular compromise between expressiveness, implementability in Haskell, and type inference.
Some examples of useful GADT encodings are red-black trees that cannot have the red-black property invalidated, or a simply-typed lambda calculus embedded as HOAS, piggy-backing off the Haskell type system.
In practice, you also often see GADTs used for their implicit existential context. For instance, a GADT constructor can mention a type variable a in its arguments that does not appear in its result type; such a type implicitly hides the a type variable using existential quantification, in a way that is sometimes convenient. If you look carefully, the HOAS example from Wikipedia uses this for the a type parameter in the App constructor. To express that statement without GADTs would be a mess of existential contexts, but the GADT syntax makes it natural.