I'm reading Learn You a Haskell, and in the monad chapters, it seems to me that `()` is being treated as a sort of "null" for every type. When I check the type of `()` in GHCi, I get

    >> :t ()
    () :: ()

which is an extremely confusing statement. It seems that `()` is a type all to itself. I'm confused as to how it fits into the language, and how it seems to be able to stand for any type.
tl;dr: `()` does not add a "null" value to every type, hell no; `()` is a "dull" value in a type of its own: `()`.

Let me step back from the question a moment and address a common source of confusion. A key thing to absorb when learning Haskell is the distinction between its expression language and its type language. You're probably aware that the two are kept separate. But that allows the same symbol to be used in both, and that is what is going on here. There are simple textual cues to tell you which language you're looking at. You don't need to parse the whole language to detect these cues.
The top level of a Haskell module lives, by default, in the expression language. You define functions by writing equations between expressions. But when you see `foo :: bar` in the expression language, it means that `foo` is an expression and `bar` is its type. So when you read `() :: ()`, you're seeing a statement which relates the `()` in the expression language with the `()` in the type language. The two `()` symbols mean different things, because they are not in the same language. This repetition often causes confusion for beginners, until the expression/type language separation installs itself in their subconscious, at which point it becomes helpfully mnemonic.

The keyword
`data` introduces a new datatype declaration, involving a careful mixture of the expression and type languages, as it says first what the new type is, and secondly what its values are. Schematically, such a declaration has the form `data TyCon tyvar ... tyvar = ValCon1 type ... type | ... | ValConN type ... type`.

In such a declaration, type constructor `TyCon` is being added to the type language and the `ValCon` value constructors are being added to the expression language (and its pattern sublanguage). In a `data` declaration, the things which stand in argument places for the `ValCon`s tell you the types given to the arguments when that `ValCon` is used in expressions. For example, a declaration such as `data Tree a = Leaf | Node (Tree a) a (Tree a)` declares a type constructor `Tree` for binary tree types storing elements at nodes, whose values are given by value constructors `Leaf` and `Node`. I like to colour type constructors (`Tree`) blue and value constructors (`Leaf`, `Node`) red. There should be no blue in expressions and (unless you're using advanced features) no red in types. The built-in type `Bool` could be declared `data Bool = True | False`, adding blue `Bool` to the type language, and red `True` and `False` to the expression language. Sadly, my markdown-fu is inadequate to the task of adding the colours to this post, so you'll just have to learn to add the colours in your head.

The "unit" type uses `()` as a special symbol, but it works as if declared `data () = ()`, meaning that a notionally blue `()` is a type constructor in the type language, but that a notionally red `()` is a value constructor in the expression language, and indeed `() :: ()`. [It is not the only example of such a pun. The types of larger tuples follow the same pattern: pair syntax is as if given by `data (,) a b = (,) a b`, adding `(,)` to both type and expression languages. But I digress.]
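The declarations discussed above can be tried out in a small file. This is only a sketch under a few assumptions: the field order of `Node` (left subtree, element, right subtree) is chosen for illustration, and because `data () = ()` is not something you can write yourself, the names `Unit` and `MkUnit` are hypothetical stand-ins that avoid the pun.

```haskell
module Main where

-- Type constructor Tree (type language); value constructors Leaf and Node
-- (expression language). The field order of Node is an assumption.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving Show

-- Hypothetical stand-in for the built-in unit type, with distinct names
-- for the type (Unit) and its single value (MkUnit).
data Unit = MkUnit
  deriving (Show, Eq)

-- Left of :: is the expression language, right of :: is the type language.
smallTree :: Tree Int
smallTree = Node Leaf 1 (Node Leaf 2 Leaf)

-- The tuple pun: (,) is a type constructor in the signature
-- and a value constructor in the definition.
pair :: (,) Int Char
pair = (,) 1 'a'

-- The unit pun: the () after :: is a type, the () after = is a value.
unit :: ()
unit = ()

main :: IO ()
main = do
  print smallTree
  print MkUnit
  print pair
  print unit
```

Reading each signature and definition side by side is a quick way to practise telling the two languages apart.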
So the type `()`, often pronounced "Unit", is a type containing one value worth speaking of: that value is also written `()`, but in the expression language, and is sometimes pronounced "void". A type with only one value is not very interesting. A value of type `()` contributes zero bits of information: you already know what it must be. So, while there is nothing special about type `()` to indicate side effects, it often shows up as the value component in a monadic type. Monadic operations tend to have types which look like `val-in-type-1 -> ... -> val-in-type-n -> effect-monad val-out-type`, where the return type is a type application: the function tells you which effects are possible and the argument tells you what sort of value is produced by the operation. For example, `put :: s -> State s ()`, which is read (because application associates to the left ["as we all did in the sixties", Roger Hindley]) as `put :: s -> (State s) ()`, has one value input type `s`, the effect-monad `State s`, and the value output type `()`. When you see `()` as a value output type, that just means "this operation is used only for its effect; the value delivered is uninteresting". Similarly, `putStr :: String -> IO ()` delivers a string to `stdout` but does not return anything exciting.

The
`()` type is also useful as an element type for container-like structures, where it indicates that the data consists just of a shape, with no interesting payload. For example, if `Tree` is declared as above, then `Tree ()` is the type of binary tree shapes, storing nothing of interest at nodes. Similarly, `[()]` is the type of lists of dull elements, and if there is nothing of interest in a list's elements, then the only information it contributes is its length.

To sum up,
`()` is a type. Its one value, `()`, happens to have the same name, but that's ok because the type and expression languages are separate. It's useful to have a type representing "no information" because, in context (e.g., of a monad or a container), it tells you that only the context is interesting.

I really like to think of
`()` by analogy with tuples. `(Int, Char)` is the type of all pairs of an `Int` and a `Char`, so its values are all possible values of `Int` crossed with all possible values of `Char`. `(Int, Char, String)` is similarly the type of all triples of an `Int`, a `Char`, and a `String`.

It's easy to see how to keep extending this pattern upwards, but what about downwards?
`(Int)` would be the "1-tuple" type, consisting of all possible values of `Int`. But that would be parsed by Haskell as just putting parentheses around `Int`, and thus being just the type `Int`. And values in this type would be `(1)`, `(2)`, `(3)`, etc., which also would just get parsed as ordinary `Int` values in parentheses. But if you think about it, a "1-tuple" is exactly the same as just a single value, so there's no need to actually have them exist.

Going down one step further to zero-tuples gives us
`()`, which should be all possible combinations of values in an empty list of types. Well, there's exactly one way to do that, which is to contain no other values, so there should be only one value in the type `()`. And by analogy with tuple value syntax, we can write that value as `()`, which certainly looks like a tuple containing no values.

That's exactly how it works. There is no magic, and this type
`()` and its value `()` are in no way treated specially by the language.

`()` is not in fact being treated as "a null value for any type" in the monads examples in the LYAH book. Whenever the type `()` is used, the only value which can be returned is `()`. So it's used as a type to explicitly say that there cannot be any other return value. And likewise, where another type is supposed to be returned, you cannot return `()`.

The thing to keep in mind is that when a bunch of monadic computations are composed together with
`do` blocks or operators like `>>=`, `>>`, etc., they'll be building a value of type `m a` for some monad `m`. That choice of `m` has to stay the same throughout the component parts (there's no way to compose a `Maybe Int` with an `IO Int` in that way), but the `a` can be, and very often is, different at each stage.

So when someone sticks an
`IO ()` in the middle of an `IO String` computation, that's not using the `()` as a null in the `String` type, it's simply using an `IO ()` on the way to building an `IO String`, the same way you could use an `Int` on the way to building a `String`.

Yet another angle:
`()` is the name of a set which contains a single element called `()`.

It's indeed slightly confusing that the name of the set and the element in it happen to be the same in this case.

Remember: in Haskell a type is a set that has its possible values as elements in it.

The confusion comes from other programming languages: "void" means in most imperative languages that there is no structure in memory storing a value. It seems inconsistent because "boolean" has 2 values instead of 2 bits, while "void" has no bits instead of no values, but in those languages it describes what a function returns in a practical sense. To be exact: the single value of `()` consumes no bits of storage.
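The "type as a set of values" reading above can be checked for small types. The sketch below lists the elements of `()` and `Bool` through their standard `Bounded` and `Enum` instances; `inhabitants` is a hypothetical helper name, not a library function, and non-terminating values are deliberately left out of the count.

```haskell
module Main where

-- All values of a finite type, from its smallest to its largest.
-- `inhabitants` is an illustrative name, not part of any library.
inhabitants :: (Bounded a, Enum a) => [a]
inhabitants = [minBound .. maxBound]

main :: IO ()
main = do
  -- The set () has exactly one element, also written ().
  print (inhabitants :: [()])
  -- The set Bool has two elements.
  print (inhabitants :: [Bool])
  -- Element counts of the two sets.
  print (length (inhabitants :: [()]), length (inhabitants :: [Bool]))
```

The type annotations select which set is being enumerated, which again shows `()` used as a type on one side and as a value in the printed result.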
Let's ignore the value bottom (written `_|_`) for a moment...

`()` is called Unit, written like a null-tuple. It has only one value. And it is not called `Void`, because `Void` has no values at all, and thus could not be returned by any function.

Observe this:
`Bool` has 2 values (`True` and `False`), `()` has one value (`()`), and `Void` has no value (it doesn't exist). They are like sets with two/one/no elements. The least memory they need to store their value is 1 bit / no bit / impossible, respectively. This means that a function that returns a `()` may return with a result value (the obvious one) that may be useless to you. `Void`, on the other hand, would imply that the function will never return and never give you any result, because no result would exist.

If you want to give a name to "the value" returned by a function that never returns (yes, this sounds like crazy talk), then call it bottom (`_|_`, written like an upside-down T). It could represent an exception, an infinite loop, a deadlock, or "just wait longer". (Some functions return bottom if and only if one of their arguments is bottom.)

When you create the cartesian product / a tuple of these types, you will observe the same behaviour:
`(Bool,Bool,Bool,(),())` has 2·2·2·1·1=8 different values. `(Bool,Bool,Bool,(),Void)` is like the set {t,f}×{t,f}×{t,f}×{u}×{} which has 2·2·2·1·0=0 elements, unless you count `_|_` as a value.

The
`()` type can be thought of as a zero-element tuple. It's a type that can only have one value, and thus it's used where you need to have a type, but you don't actually need to convey any information. Here are a couple of uses for this.

Monadic things like `IO` and `State` have a return value, as well as performing side-effects. Sometimes the only point of the operation is to perform a side-effect, like writing to the screen or storing some state. For writing to the screen, `putStrLn` must have type `String -> IO ?`: `IO` always has to have some return type, but here there's nothing useful to return. So what type should we return? We could say `Int`, and always return 0, but that's misleading. So we return `()`, the type that has only one value (and thus no useful information), to indicate that there's nothing useful coming back.

It's sometimes useful to have a type which can have no useful values. Consider if you'd implemented a type
`Map k v` which maps keys of type `k` to values of type `v`. Then you want to implement a `Set`, which is really similar to a map except that you don't need the value part, just the keys. In a language like Java you might use booleans as the dummy value type, but really you just want a type that has no useful values. So you could say `type Set k = Map k ()`.
It should be noted that `()` is not particularly magic. If you want, you can store it in a variable and do a pattern match on it (although there's not much point).

It is called the `Unit` type, usually used to represent side effects. You can think of it vaguely as `Void` in Java. What can be confusing is that `()` syntactically represents both the type and its only value literal. Also note that it is not similar to `null` in Java, which means an undefined reference; `()` is just effectively a 0-sized tuple.