Don Stewart's Haskell in the Large presentation mentioned phantom types:
data Ratio n = Ratio Double
1.234 :: Ratio D3
data Ask ccy = Ask Double
Ask 1.5123 :: Ask GBP
I read through his bullet points about them, but I did not understand them. I also read the Haskell Wiki page on the topic, yet I am still missing the point.
What's the motivation to use a phantom type?
To answer the question "what's the motivation to use a phantom type?", there are two points:
For example, you could have distances tagged by the length unit:
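A minimal sketch of what such tagging can look like, assuming empty data types as unit tags. The names Kilometer, Mile, marsToJupiter and marsToEarth, and the numeric values, are illustrative and not taken from the presentation.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Empty types used only as compile-time unit tags.
data Kilometer
data Mile

-- The parameter 'unit' never appears on the right-hand side: it is a phantom.
newtype Distance unit = Distance Double
  deriving (Show, Eq, Num)

marsToJupiter :: Distance Kilometer
marsToJupiter = Distance 550000000   -- illustrative figure only

marsToEarth :: Distance Mile
marsToEarth = Distance 140000000     -- illustrative figure only
```

At run time both values are plain Doubles; the unit exists only in the type.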
And you can avoid a disaster like the Mars Climate Orbiter:
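The following sketch shows how the tag blocks unit mix-ups. The helpers addDistance and milesToKm are hypothetical names introduced here; the only outside fact used is that one international mile is 1.609344 km.

```haskell
data Kilometer
data Mile

newtype Distance unit = Distance Double
  deriving (Show, Eq)

-- Addition is only defined when both operands carry the same unit tag.
addDistance :: Distance unit -> Distance unit -> Distance unit
addDistance (Distance x) (Distance y) = Distance (x + y)

-- Hypothetical explicit conversion; 1 mile = 1.609344 km.
milesToKm :: Distance Mile -> Distance Kilometer
milesToKm (Distance mi) = Distance (mi * 1.609344)

legKm :: Distance Kilometer
legKm = Distance 100

legMi :: Distance Mile
legMi = Distance 10

-- Rejected at compile time: Kilometer and Mile do not unify.
-- bad = addDistance legKm legMi

-- Accepted: the conversion is spelled out, so the units agree.
total :: Distance Kilometer
total = addDistance legKm (milesToKm legMi)
```

Uncommenting the bad line makes the module fail to compile, which is the point: the mistake is caught before the program runs.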
There are slight variations on this "pattern". You can use DataKinds to have a closed set of units, and it will work similarly.
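A sketch of the DataKinds variation, assuming a LengthUnit type whose constructors are promoted to type-level tags. The names and values are illustrative.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

-- With DataKinds, these constructors are promoted to type-level tags.
data LengthUnit = Kilometer | Mile

-- The kind signature restricts the tag to the promoted LengthUnit constructors.
newtype Distance (unit :: LengthUnit) = Distance Double
  deriving (Show, Eq)

addDistance :: Distance unit -> Distance unit -> Distance unit
addDistance (Distance x) (Distance y) = Distance (x + y)

marsToJupiter :: Distance 'Kilometer
marsToJupiter = Distance 550000000   -- illustrative figure only

marsToEarth :: Distance 'Mile
marsToEarth = Distance 140000000     -- illustrative figure only

-- Still rejected: 'Kilometer and 'Mile are different types.
-- bad = addDistance marsToJupiter marsToEarth

-- Also rejected: Bool does not have kind LengthUnit.
-- nonsense :: Distance Bool
```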
But now a Distance can only be in kilometers or miles; we can't add more units later. That might be useful in some use cases.
We could also do:
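One reading of this, given the paragraph that follows, is to drop the type-level tag and store the unit as an ordinary field. This is a sketch under that assumption; the field names and the helpers toKilometers and addDistance are hypothetical.

```haskell
data LengthUnit = Kilometer | Mile
  deriving (Show, Eq)

-- The unit is now ordinary run-time data, not a type parameter.
data Distance = Distance
  { distanceUnit  :: LengthUnit
  , distanceValue :: Double
  } deriving (Show, Eq)

-- 1 mile = 1.609344 km.
toKilometers :: Distance -> Double
toKilometers (Distance Kilometer v) = v
toKilometers (Distance Mile v)      = v * 1.609344

-- Same unit: keep it. Different units: normalise to kilometers.
addDistance :: Distance -> Distance -> Distance
addDistance a b
  | distanceUnit a == distanceUnit b =
      Distance (distanceUnit a) (distanceValue a + distanceValue b)
  | otherwise =
      Distance Kilometer (toKilometers a + toKilometers b)
```

The trade-off is that the compiler no longer tracks units; every operation has to inspect them at run time.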
In the distance case we can work out the addition, for example by converting to kilometers when different units are involved. But this doesn't work well for currencies, whose exchange rates aren't constant over time.
It's also possible to use GADTs for that instead, which may be a simpler approach in some situations. Then we also know the unit at the value level:
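A sketch of the GADT variation, assuming one constructor per unit. The constructor names, unitName and eval are illustrative. The second half applies the same style to a small expression type with constructors Number, Boolean, Increment and Not, whose argument types are assumed here.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}

data Kilometer
data Mile

-- Each constructor fixes the unit tag in its own result type.
data Distance unit where
  KilometerDistance :: Double -> Distance Kilometer
  MileDistance      :: Double -> Distance Mile

deriving instance Show (Distance unit)

marsToEarth :: Distance Mile
marsToEarth = MileDistance 140000000   -- illustrative figure only

-- Pattern matching recovers the unit at run time.
unitName :: Distance unit -> String
unitName (KilometerDistance _) = "km"
unitName (MileDistance _)      = "mi"

-- The same style applied to a small expression language.
data Expr a where
  Number    :: Int       -> Expr Int
  Boolean   :: Bool      -> Expr Bool
  Increment :: Expr Int  -> Expr Int
  Not       :: Expr Bool -> Expr Bool

-- The evaluator can return a plain 'a' because matching refines the type.
eval :: Expr a -> a
eval (Number n)    = n
eval (Boolean b)   = b
eval (Increment e) = eval e + 1
eval (Not e)       = not (eval e)
```

Here the constructors themselves carry the restricted result types, so no separate smart constructors are needed.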
This approach especially greatly simplifies the Expr a example from Aadit's answer.
It's worth pointing out that the latter variations require non-trivial language extensions (GADTs, DataKinds, KindSignatures), which might not be supported by your compiler. That might be the case with the Mu compiler Don mentions.
The motivation behind using phantom types is to specialize the return type of data constructors. For example, consider:
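A minimal list definition matching the constructor names and argument types this answer refers to. The value names emptyInts, emptyChars and oneTwo are illustrative.

```haskell
data List a = Nil | Cons a (List a)
  deriving (Show, Eq)

-- Constructor types, as GHCi's :type would report them:
--   Nil  :: List a
--   Cons :: a -> List a -> List a

-- Nil takes no arguments, so one constructor fits every element type.
emptyInts :: List Int
emptyInts = Nil

emptyChars :: List Char
emptyChars = Nil

-- Cons pins 'a' down through its first argument.
oneTwo :: List Int
oneTwo = Cons 1 (Cons 2 Nil)
```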
The return type of both Nil and Cons is List a by default (which is generalized for all lists of type a).
Also note that Nil is a phantom constructor (i.e. its return type doesn't depend upon its arguments; vacuously in this case, but the point stands).
Because Nil is a phantom constructor, we can specialize Nil to any type we want (e.g. Nil :: List Int or Nil :: List Char).
Normal algebraic data types in Haskell allow you to choose the types of the arguments of a data constructor. For example, we chose the types of the arguments for Cons above (a and List a).
However, they don't allow you to choose the return type of a data constructor. The return type is always generalized. This is fine for most cases. However, there are exceptions. For example:
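A sketch of the kind of declaration the following paragraphs discuss. The constructor names come from the text below; their argument types are an assumption.

```haskell
-- The parameter 'a' is not used by any constructor's arguments.
data Expr a
  = Number Int
  | Boolean Bool
  | Increment (Expr Int)
  | Not (Expr Bool)
  deriving (Show, Eq)
```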
The types of the data constructors are:
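One way to check this is to bind each constructor to a name at its most general type. The declaration repeats the assumed Expr from this answer so the snippet stands alone; the names ending in C are hypothetical.

```haskell
data Expr a
  = Number Int
  | Boolean Bool
  | Increment (Expr Int)
  | Not (Expr Bool)
  deriving (Show, Eq)

-- Each binding type-checks, so every constructor returns Expr a for any 'a'.
numberC :: Int -> Expr a
numberC = Number

booleanC :: Bool -> Expr a
booleanC = Boolean

incrementC :: Expr Int -> Expr a
incrementC = Increment

notC :: Expr Bool -> Expr a
notC = Not
```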
As you can see, the return types of all the data constructors are generalized. This is problematic because we know that Number and Increment must always return an Expr Int, and Boolean and Not must always return an Expr Bool.
The return types of the data constructors are wrong because they are too general. For example, Number cannot possibly return an Expr a, and yet it does. This allows you to write wrong expressions which the type checker won't catch, such as incrementing a boolean. The problem is that we can't specify the return type of data constructors.
Notice that all the data constructors of Expr are phantom constructors (i.e. their return types don't depend upon their arguments). A data type whose constructors are all phantom constructors is called a phantom type.
Remember that the return type of phantom constructors like Nil can be specialized to any type we want. Hence, we can create smart constructors for Expr that pin down the return type. Using the smart constructors instead of the normal constructors solves our problem:
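A sketch of such smart constructors, again over the assumed Expr declaration. The lowercase names are illustrative; not' is primed only to avoid clashing with the Prelude's not.

```haskell
data Expr a
  = Number Int
  | Boolean Bool
  | Increment (Expr Int)
  | Not (Expr Bool)
  deriving (Show)

-- Each smart constructor is the raw constructor at a specialized type.
number :: Int -> Expr Int
number = Number

boolean :: Bool -> Expr Bool
boolean = Boolean

increment :: Expr Int -> Expr Int
increment = Increment

not' :: Expr Bool -> Expr Bool
not' = Not

-- Accepted: the argument really is an Expr Int.
ok :: Expr Int
ok = increment (number 41)

-- Rejected at compile time: boolean False is an Expr Bool, not an Expr Int.
-- bad = increment (boolean False)
```

This only protects callers who cannot reach the raw constructors, so in practice the module would typically export the type and the smart constructors but not Number, Boolean, Increment and Not.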
So phantom constructors are useful when you want to specialize the return type of a data constructor and phantom types are data types whose constructors are all phantom constructors.
Note that data constructors like Left and Right are also phantom constructors. The reason is that although the return types of these data constructors do depend upon their arguments, they are still generalized, because they depend upon their arguments only partially.
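A simple way to know if a data constructor is a phantom constructor: check whether every type variable in its return type also appears in the types of its arguments. If some variable does not, it is a phantom constructor.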
Hope that helps.
For Ratio D3 specifically, we use rich types like that to drive type-directed code, so e.g. if you have a field somewhere at type Ratio D3, its editor is dispatched to a text field accepting numeric entries only and showing a precision of 3 digits. This is in contrast, e.g., with newtype Amount = Amount Double, where we don't show decimal digits, but use thousand commas and parse input like '10m' as '10,000,000'.
In the underlying representation, both are still just Doubles.
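A rough sketch of how a type could select formatting in this way. It is not the actual editor machinery described above: the classes Digits and Render, the helper groupThousands, and the reading of D3 as "3 decimal digits" are all assumptions made for illustration.

```haskell
import Data.List (intercalate)
import Numeric (showFFloat)

-- Hypothetical precision tag; D3 is assumed to mean "3 decimal digits".
data D3

-- Hypothetical class mapping a tag type to a number of decimal digits.
-- Any value whose type ends in the tag (such as Ratio D3) works as the proxy.
class Digits d where
  digits :: proxy d -> Int

instance Digits D3 where
  digits _ = 3

newtype Ratio n = Ratio Double
newtype Amount = Amount Double

-- Hypothetical stand-in for an editor: the field's type picks the format.
class Render a where
  render :: a -> String

-- Ratios show as many decimal digits as their phantom tag asks for.
instance Digits n => Render (Ratio n) where
  render r@(Ratio x) = showFFloat (Just (digits r)) x ""

-- Amounts show no decimal digits, with commas between thousands.
instance Render Amount where
  render (Amount x) = groupThousands (round x)

groupThousands :: Integer -> String
groupThousands n
  | n < 0     = '-' : groupThousands (negate n)
  | otherwise = reverse (intercalate "," (chunk (reverse (show n))))
  where
    chunk [] = []
    chunk s  = take 3 s : chunk (drop 3 s)
```

Both Ratio and Amount wrap a Double, but instance resolution picks a different rendering for each, and for Ratio the phantom parameter alone decides the precision.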