Let's say that we have a store management application. It has `Customer`s and can `chargeFee()`. It should do so only for active `Customer`s, however.
A common way I've seen this done (Java/pseudocode) is something like this:
    class Customer {
        String name;
        StatusEnum status; // ACTIVE or INACTIVE
    }

    // and this is how the customers are charged
    for (Customer c : Customer.listByStatus(StatusEnum.ACTIVE)) {
        c.chargeFee();
    }
This is OK, but it doesn't stop someone from charging a fee to an inactive `Customer`. Even if `chargeFee()` checks the status of the `Customer`, that's a runtime error/event.
So, keeping the whole 'make illegal states unrepresentable' thing in mind, how would one approach design of this application (in Haskell for example)? I want a compile error if someone tries to charge an inactive customer.
I was thinking of something like this, but it still doesn't allow me to restrict `chargeFee` so that an inactive `Customer` cannot be charged.
    data CustomerDetails = CustomerDetails { name :: String }
    data Customer a = Active a | Inactive a
    chargeFee :: Active a -> Int -- this doesn't work, do I need DataKinds?
You can accomplish such a thing with phantom types:
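A minimal sketch of the phantom-type approach with `DataKinds`. The `balance` field and the activation rule (a non-negative balance) are assumptions made only so the example has something to check and charge:

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

-- With DataKinds, 'Active and 'Inactive can be used as type-level tags.
data Status = Active | Inactive

-- s is a phantom parameter: it appears in no field of the record.
data Customer (s :: Status) = Customer
  { name    :: String
  , balance :: Int    -- hypothetical field, not in the question
  } deriving (Show, Eq)

-- New customers start out inactive.
mkCustomer :: String -> Int -> Customer 'Inactive
mkCustomer = Customer

-- Activation can fail; the rule used here is an assumption.
activate :: Customer 'Inactive -> Maybe (Customer 'Active)
activate (Customer n b)
  | b >= 0    = Just (Customer n b)
  | otherwise = Nothing

-- Only active customers are accepted.
chargeFee :: Int -> Customer 'Active -> Customer 'Active
chargeFee fee (Customer n b) = Customer n (b - fee)

-- chargeFee 10 (mkCustomer "Ann" 100) is rejected by the type checker,
-- because Customer 'Inactive does not match Customer 'Active.
```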
Here `activate` will somehow ensure that the given customer can be made active (and do so), producing said active customer. But trying to call `chargeFee (mkCustomer ...)` is a type error.

Note that `DataKinds` is not strictly required - the following is equivalent:

The same can be accomplished without phantom types, by simply declaring two types - `ActiveCustomer` and `InactiveCustomer` - but the phantom types approach allows you to write functions which don't care about the type of customer:

You could always make `chargeFee` return a `Maybe` or `Either` for illegal actions:

A basic way is to use a separate type
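As a rough sketch of the separate-types idea, assuming a shared `CustomerDetails` record with a hypothetical `balance` field:

```haskell
-- Shared details live in one record so the two customer types
-- do not duplicate fields. balance is a hypothetical field.
data CustomerDetails = CustomerDetails
  { name    :: String
  , balance :: Int
  } deriving (Show, Eq)

newtype ActiveCustomer   = ActiveCustomer   CustomerDetails deriving (Show, Eq)
newtype InactiveCustomer = InactiveCustomer CustomerDetails deriving (Show, Eq)

-- Only an ActiveCustomer can be passed here.
chargeFee :: Int -> ActiveCustomer -> ActiveCustomer
chargeFee fee (ActiveCustomer d) =
  ActiveCustomer (d { balance = balance d - fee })

-- Changing status means converting between the two types.
activate :: InactiveCustomer -> ActiveCustomer
activate (InactiveCustomer d) = ActiveCustomer d
```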
This can also be done, more or less, in OOP languages: just use a different class for active and inactive customers, possibly inheriting from a common `Customer` interface / superclass.

With algebraic types you get the benefits of the closed-world assumption, namely that there are no other subtypes of `Customer`, but often one can live without that.

A more advanced way is to use a GADT. `DataKinds` is optional but is nicer, IMHO. (Warning: untested)

Alternatively, factor out the tag with a singleton:
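One possible reading of the singleton idea, as a sketch: the customer stores a singleton value whose type pins down the status, so pattern matching on it recovers the status for the type checker. `SStatus`, `SomeCustomer`, `chargeIfActive` and the `balance` field are hypothetical names introduced for illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Status = Active | Inactive

-- Singleton: exactly one value per type-level status.
data SStatus (s :: Status) where
  SActive   :: SStatus 'Active
  SInactive :: SStatus 'Inactive

data Customer (s :: Status) = Customer
  { status  :: SStatus s
  , name    :: String
  , balance :: Int       -- hypothetical field
  }

-- A customer whose status is only known at run time.
data SomeCustomer where
  SomeCustomer :: Customer s -> SomeCustomer

chargeFee :: Int -> Customer 'Active -> Customer 'Active
chargeFee fee (Customer st n b) = Customer st n (b - fee)

-- Matching on the singleton tells the compiler which status s is,
-- so chargeFee type checks only in the SActive branch.
chargeIfActive :: Int -> SomeCustomer -> SomeCustomer
chargeIfActive fee (SomeCustomer c) = case status c of
  SActive   -> SomeCustomer (chargeFee fee c)
  SInactive -> SomeCustomer c
```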
All that's required is to tag the type with the active status. I see no need for separate constructors. It's easy to do so like so:
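A sketch of such a tagged type. Giving each tag a single value of the same name is an assumption, made so that a status can also be passed as an ordinary argument:

```haskell
-- Status tags, each with one value of the same name.
data Active   = Active
data Inactive = Inactive

-- status is a phantom parameter; the fields are name and credit.
data Customer status = Customer String Int
  deriving (Show, Eq)
```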
(p.s. I've added an `Int` to your data type to represent credit, so you can actually charge the customer in some way.)

So `Customer Active` represents an "active" customer, and likewise `Customer Inactive` represents an "inactive" customer.

We can then "create" customers like so:
Creating convenience methods is easy:
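The general creation function and its wrappers might look roughly like this; `createActive` and `createInactive` are hypothetical names:

```haskell
data Active   = Active
data Inactive = Inactive

data Customer status = Customer String Int
  deriving (Show, Eq)

-- General constructor: the caller (or type inference) picks the tag.
create :: String -> Int -> Customer status
create = Customer

-- Wrappers that fix the tag in their signatures.
createActive :: String -> Int -> Customer Active
createActive = create

createInactive :: String -> Int -> Customer Inactive
createInactive = create
```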
Note that using `create` directly you can create silly types like `Customer Int`. You've got a few options to stop this, including restricting `a` in `create` with type classes. I'll go through option 2 later on.
Now we can write some methods to work on our type:
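For instance, a sketch with one function that accepts any customer and one that accepts only active ones (`credit` is a hypothetical accessor, and the argument order of `chargeCustomer` is an assumption):

```haskell
data Active   = Active
data Inactive = Inactive

data Customer status = Customer String Int
  deriving (Show, Eq)

-- Works for a customer of any status.
credit :: Customer status -> Int
credit (Customer _ c) = c

-- Accepts only active customers; the fee is deducted from the credit.
chargeCustomer :: Int -> Customer Active -> Customer Active
chargeCustomer fee (Customer n c) = Customer n (c - fee)

-- chargeCustomer 10 (Customer "Bob" 50 :: Customer Inactive)
-- does not compile: Inactive does not match Active.
```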
Note that `chargeCustomer` only works on active customers. You'll get a type error otherwise.

Now I'm going to write a utility function, `castCustomer`.

What `castCustomer` does is just change any sort of customer into any sort of customer. Think of this as an unsafe cast in C: you shouldn't expose this to your users. But it's useful for writing your other functions:

So you can do `setActiveStatus Inactive customer` and you'll get back `customer` but inactive. It just uses `castCustomer`, which works for all casts, but `setActiveStatus`'s own type restricts `castCustomer` appropriately.

And there's also these simpler utility functions:
Of course one can now write convenience functions:
Finally, one might want a function like this:
where we pass a status and a customer, and get that customer back if they match the status, but otherwise get `Nothing`.

We're going to need different implementations depending on the types, so we're going to need a class.

We could write a class like `class GetByStatus a b`, but the problem is that any function which uses this class will have to have an ugly `GetByStatus a b` constraint in its type signature.

So we're going to make a simpler class:
Which is going to have two instances:
Here's the definition of the `LegalStatus` class:

This may look confusing, but let's have a look at the instances:
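A sketch of how such a class and its two instances could be written. The helper methods `asActive` and `asInactive` are hypothetical names, and making `getByStatus` a class method is an assumption:

```haskell
data Active   = Active
data Inactive = Inactive

data Customer status = Customer String Int
  deriving (Show, Eq)

class LegalStatus a where
  -- First dispatch: on the requested status a.
  getByStatus :: LegalStatus b => a -> Customer b -> Maybe (Customer a)
  -- Second dispatch: on the customer's own status.
  asActive    :: Customer a -> Maybe (Customer Active)
  asInactive  :: Customer a -> Maybe (Customer Inactive)

instance LegalStatus Active where
  getByStatus _ = asActive      -- asks the customer: are you active?
  asActive      = Just
  asInactive _  = Nothing

instance LegalStatus Inactive where
  getByStatus _ = asInactive    -- asks the customer: are you inactive?
  asActive _    = Nothing
  asInactive    = Just
```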
What we're doing here is an object-oriented technique called double dispatch (https://en.wikipedia.org/wiki/Double_dispatch). This means we don't complicate our signatures. We can now write our function like:

Using these functions and `catMaybes`, it's relatively easy to write functions that, say, take a list of customers and return only the active ones:

It's worth pointing out how magic `getAll` is (and indeed, many other similar functions in Haskell). Do `getAll list` and if you put the result into a list of active customers, you'll get only the active customers in the list, and similarly, if you put it into a list of inactive customers, you'll get only the inactive customers in the list.

I'll illustrate this through the following function, which splits a list of customers whose status is unknown into a list of active customers and a list of inactive customers:
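A sketch of such a split. How "status unknown" is represented is an assumption: here the tag is hidden behind a hypothetical existential wrapper `AnyCustomer`, and the class uses a hypothetical method `select` whose instance is chosen from the result type alone.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Maybe (catMaybes)

data Active   = Active
data Inactive = Inactive

data Customer status = Customer String Int
  deriving (Show, Eq)

class LegalStatus a where
  -- Dispatches on the wanted status a, then on the customer's status b.
  select     :: LegalStatus b => Customer b -> Maybe (Customer a)
  asActive   :: Customer a -> Maybe (Customer Active)
  asInactive :: Customer a -> Maybe (Customer Inactive)

instance LegalStatus Active where
  select       = asActive
  asActive     = Just
  asInactive _ = Nothing

instance LegalStatus Inactive where
  select     = asInactive
  asActive _ = Nothing
  asInactive = Just

-- Assumption: a customer of unknown status hides its tag existentially.
data AnyCustomer = forall s. LegalStatus s => AnyCustomer (Customer s)

-- The element type of the result decides which customers are kept.
getAll :: LegalStatus a => [AnyCustomer] -> [Customer a]
getAll cs = catMaybes (map (\(AnyCustomer c) -> select c) cs)

-- Both components are written identically but use different instances.
splitCustomers :: [AnyCustomer] -> ([Customer Active], [Customer Inactive])
splitCustomers cs = (getAll cs, getAll cs)
```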
Looking at the implementation of `splitCustomers`, it looks like the first and second elements of the pair are the same. Indeed, they look exactly the same. But they're not: they've got different types, and as a result end up calling different instances and can get totally different results.

There's one other thing to close up if you really want to. You'll probably want to expose the class `LegalStatus`, as users might want to use it as a constraint in their type signatures, but that means they can write instances of `LegalStatus`. Like:

They'd be silly to do this, but you can stop them if you like. The simplest approach is this:
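One way this could be done, as a sketch: make `LegalStatus` depend on a superclass that the module does not export. The method `statusName` is a hypothetical stand-in for the real class methods.

```haskell
data Active   = Active
data Inactive = Inactive

-- Private marker class: do not export it from the module.
class IsLegalStatus a
instance IsLegalStatus Active
instance IsLegalStatus Inactive

-- Exported class: every instance must also satisfy the private superclass.
class IsLegalStatus a => LegalStatus a where
  statusName :: a -> String   -- hypothetical method

instance LegalStatus Active where
  statusName _ = "active"

instance LegalStatus Inactive where
  statusName _ = "inactive"

-- In a client module, instance LegalStatus Int is rejected: there is no
-- IsLegalStatus Int instance, and the client cannot name the class to add one.
```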
Any attempt to make a new instance will now fail the `IsLegalStatus` constraint.

This is probably overengineered at this point, and you won't need all this, but I've included it to show some points about type inference:
So for your reference, here's all the code attached below:
Edit:
Others have pointed out that you can restrict the statuses using `DataKinds`. Admittedly this is probably a cleaner approach than my "constraint on the class" approach. Note that you have to change a few of the functions: the parameter to the class is no longer an ordinary type (it has the promoted status kind), and only ordinary types have values that can be passed to functions, so you have to wrap raw statuses in a `Proxy`.

Note that with the `DataKinds` approach you can no longer call `getByStatus Active ...`, because `Active` is no longer a value of the tag type. You need to do:

but feel free to define:

so you can then call:
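Putting those three steps together, a sketch assuming a `LegalStatus` class indexed by a promoted `Status` kind. The helper methods `asActive` and `asInactive` and the proxy values `active` and `inactive` are hypothetical names:

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}
import Data.Proxy (Proxy (..))

data Status = Active | Inactive

data Customer (s :: Status) = Customer String Int
  deriving (Show, Eq)

class LegalStatus (s :: Status) where
  -- The wanted status is passed as a Proxy, since 'Active has no values.
  getByStatus :: LegalStatus t => Proxy s -> Customer t -> Maybe (Customer s)
  asActive    :: Customer s -> Maybe (Customer 'Active)
  asInactive  :: Customer s -> Maybe (Customer 'Inactive)

instance LegalStatus 'Active where
  getByStatus _ = asActive
  asActive      = Just
  asInactive _  = Nothing

instance LegalStatus 'Inactive where
  getByStatus _ = asInactive
  asActive _    = Nothing
  asInactive    = Just

-- Named proxies keep call sites short.
active :: Proxy 'Active
active = Proxy

inactive :: Proxy 'Inactive
inactive = Proxy

-- Long form:  getByStatus (Proxy :: Proxy 'Active) customer
-- Short form: getByStatus active customer
```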
The full code is below.