Let's consider a data type with many constructors:
data T = Alpha Int | Beta Int | Gamma Int Int | Delta Int
I want to write a function to check if two values are produced with the same constructor:
sameK (Alpha _) (Alpha _) = True
sameK (Beta _) (Beta _) = True
sameK (Gamma _ _) (Gamma _ _) = True
sameK _ _ = False
Maintaining sameK
is not much fun, it cannot be checked for correctness easily. For example, when new constructors are added to T
, it's easy to forget to update sameK
. I omitted one line to give an example:
-- it’s easy to forget:
-- sameK (Delta _) (Delta _) = True
The question is how to avoid boilerplate in sameK
? Or how to make sure it checks for all T
constructors?
The workaround I found is to use separate data types for each of the constructors, deriving Data.Typeable
, and declaring a common type class, but I don't like this solution, because it is much less readable and otherwise just a simple algebraic type works for me:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
class Tlike t where
value :: t -> t
value = id
data Alpha = Alpha Int deriving Typeable
data Beta = Beta Int deriving Typeable
data Gamma = Gamma Int Int deriving Typeable
data Delta = Delta Int deriving Typeable
instance Tlike Alpha
instance Tlike Beta
instance Tlike Gamma
instance Tlike Delta
sameK :: (Tlike t, Typeable t, Tlike t', Typeable t') => t -> t' -> Bool
sameK a b = typeOf a == typeOf b
Look at the Data.Data module, the toConstr
function in particular. Along with {-# LANGUAGE DeriveDataTypeable #-}
that will get you a 1-line solution which works for any type which is an instance of Data.Data
. You don't need to figure out all of SYB!
If, for some reason (stuck with Hugs?), that is not an option, then here is a very ugly and very slow hack. It works only if your datatype is Show
able (e.g. by using deriving (Show)
- which means no function types inside, for example).
constrT :: T -> String
constrT = head . words . show
sameK x y = constrT x == constrT y
constrT
gets the string representation of the outermost constructor of a T
value by showing it, chopping it up into words and then getting the first. I give an explicit type signature so you're not tempted to use it on other types (and to evade the monomorphism restriction).
Some notable disadvantages:
- This breaks horribly when your type has infix constructors (such as
data T2 = Eta Int | T2 :^: T2
)
- If some of your constructors have a shared prefix, this is going to get slower, as a larger part of the strings has to be compared.
- Doesn't work on types with a custom
show
, such as many library types.
That said, it is Haskell 98... but that's about the only nice thing I can say about it!
Another possible way:
sameK x y = f x == f y
where f (Alpha _) = 0
f (Beta _) = 1
f (Gamma _ _) = 2
-- runtime error when Delta value encountered
A runtime error is not ideal, but better than silently giving the wrong answer.
You'll need to use a generics library like Scrap Your Boilerplate or uniplate to do this in general.
If you don't want to be so heavy-handed, you can use Dave Hinton's solution, together with the empty record shortcut:
...
where f (Alpha {}) = 0
f (Beta {}) = 1
f (Gamma {}) = 2
So you don't have to know how many args each constructor has. But it obviously still leaves something to be desired.
In some cases, "Scrap Your Boilerplate" library will help.
http://www.haskell.org/haskellwiki/Scrap_your_boilerplate
You can definitely use the generics to eliminate the boilerplate. Your code is a textbook example why I (and many others never use the _
wildcard at top level). While it is tedious to write out all the cases, it is less tedious than dealing with the bugs.
In this happy example I would not only use Dave Hinton's solution but would slap an INLINE pragma on the auxiliary function f
.