I'm curious about the 'undefined' value in Haskell. Its interesting because you can put it just about anywhere, and Haskell will be happy. The following are all a-ok
[1.0, 2.0, 3.0 , undefined] ::[Float]
[1, 2 ,3 undefined, 102312] :: [Int]
("CATS!", undefined) :: (String, String)
....and many more
How does undefined work under the hood? What makes it possible to have data that is of every data type? Would it be possible for me to define a value like this that I can put everywhere, or is this some special case?
There are two separated things to note about undefined:
For the second, GHC commentary clearly says:
For more details you might want to read the paper Spinless Tagless G-Machine.
There's nothing really special about
undefined
. It is just an exceptional value -- you could represent it with an infinite loop, or a crash, or a segfault. One way to write it is as a crash:Or a loop:
It is an exceptional value that can be an element of any type.
Since Haskell is lazy you can still use such values in computations. E.g.
But otherwise, it is just a natural consequence of polymorphism and non-strictness.
Well, basically
undefined = undefined
— if you try evaluate it, you get an endless loop. But Haskell is a non-strict language, sohead [1.0, 2.0, undefined]
doesn't evaluate all elements of the list, so it prints1.0
and terminates. But if you printshow [1.0, 2.0, undefined]
, you'll see[1.0,2.0,*** Exception: Prelude.undefined
.As of how it can be of all types... Well, if expression is of type
A
, it means that evaluating it either will yield value of typeA
, or the evaluation will diverge, yielding no value at all. Now,undefined
always diverge, so it fits into definition for every imaginable typeA
.Also, a good blog post on the related topics: http://james-iry.blogspot.ru/2009/08/getting-to-bottom-of-nothing-at-all.html
The interesting property you're examining is that
undefined
has the typea
for any typea
we choose, i.e.undefined :: a
with no constraints. As others have noted,undefined
might be thought of as an error or infinite loop. I'd like to argue that it's better thought of as "the vacuously true statement". It's an almost unavoidable hole in any type system closely related to the Halting Problem, but it's fun to think about it from the point of view of logic.One way to think about programming with types is that it's a puzzle. Someone gives you a type and asks you to implement a function which is of that type. For instance
is an easy one. We need to produce an
a
for any typea
, but we're given one as an input so we can just return that.With
undefined
, this game is always easyso let's be rid of it. Making sense of
undefined
is easiest when you live in a world without it. Now our game gets harder. Sometimes it's possibleBut suddenly it's sometimes not possible as well!
Or is it?
Which is legal because Haskell's
let
binding is lazy and (generally) recursive. Now we're golden.Even without having
undefined
built-in we can rebuild it in our game using any old infinite loop. The types check out. To truly prevent ourselves from havingundefined
in Haskell we'd need to solve the Halting Problem.Now, as you've seen,
undefined
is useful for debugging since it can be a placeholder for any value. It's unfortunately terrible for operations since it either leads to an infinite loop or an immediate crash. It's also really bad for our game because it lets us cheat. Finally, I hope I've demonstrated that it's pretty hard to not haveundefined
so long as your language has (potentially infinite) loops.There exist languages like Agda and Coq which trade away loops in order to truly eliminate
undefined
. They do this because this game I've invented can, in some cases, actually be very valuable. It can encode statements of logic and thus be used to form very, very rigorous mathematical proofs. Your types represent theorems and your programs are assurances that that theorem is substantiated. The existence ofundefined
would mean that all theorems are substantiatable and thus make the whole system untrustworthy.But in Haskell, we're more interested in looping than proofing, so we'd rather have
fix
than be certain there wasn't anundefined
.If I try
undefined
in GHCi, I get an exception:Therefore, I suppose that it's just implemented as throwing an exception:
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Exception.html#g:2
How does
undefined
work? Well, the best answer, IMHO, is that it doesn't work. But to understand that answer, we have to work through its consequences, which are not obvious to a newcomer.Basically, if we have
undefined :: a
, what that means for the type system is thatundefined
can appear anywhere. Why? Because in Haskell, whenever you see an expression that has some type, you can specialize the type by consistently substituting all instances of any of its type variables for any other type. Familiar examples would be things like this:In the case of
map
, how does this work? Well, it comes down to the fact thatmap
's arguments provide everything that's necessary to produce an answer, no matter what substitutions and specializations we make for its type variables. If you have a list and a function that consumes values of the same type as the list's elements, you can do what map does, period.But in the case of
undefined :: a
, what this signature would mean is that no matter what typea
may get specialized to,undefined
is able to produce a value of that type. How can it do it? Well, actually, it can't, so if a program actually reaches a step where the value ofundefined
is needed, there is no way to continue. The only thing the program can do at that point is to failThe story behind this other case is similar but different:
Here, we can prove that
loop
has typea
by this crazy-sounding argument: suppose thatloop
has typea
. It needs to produce a value of typea
. How can it do it? Easy, it just callsloop
. Presto!That sounds crazy, right? Well, the thing is that it's really no different from what's going on in the second equation of this definition of
map
:In that second equation,
f x
has typeb
, and(f x:)
has type[b] -> [b]
; now to conclude our proof thatmap
indeed has the type our signature claims, we need to produce a[b]
. So how are we doing it? By assuming thatmap
has the type we're trying to prove it has!The way Haskell's type inference algorithm works is that it first guesses that the expression's type is
a
, and then it only changes its guess when it finds something that contradicts that assumption.undefined
typechecks toa
because it's a flat-out lie.loop
typechecks toa
because recursion is allowed, and allloop
does is recurse.EDIT: What the heck, I might as well spell out one example. Here's an informal demonstration of how to infer the type of
map
from this definition:It goes like this:
map :: a
.a
can't be the type. We revise our assumption to this:map :: a -> b -> c; f :: a
.map :: a -> [b] -> c; f :: a
.map :: a -> [b] -> [c]; f :: a
.(:) :: b -> [b] -> [b]
. This means that in that equation,x :: b
andxs :: [b]
.map f (x:xs)
must be of type[c]
, that meansf x : map f xs
must also be of type[c]
.(:) :: c -> [c] -> [c]
, that means thatf x :: c
andmap f xs :: [c]
.map f xs :: [c]
. We had assumed that in (6), and if we had concluded otherwise in (7) this would have been a type error. We can also now dive into this expression and see what types this requiresf
andxs
to have, but to make a longer story short, everything's going to check out.f x :: c
andx :: b
, we must conclude thatf :: b -> c
. So now we getmap :: (b -> c) -> [b] -> [c]
.The same process, but for
loop = loop
:loop :: a
.loop
takes no arguments, so its type is consistent witha
so far.loop
isloop
, which we've provisionally assigned typea
, so that checks out.loop
has typea
.