In order to prove that for instance the Category laws hold for some operations on a data type, how do one decide how to define equality? Considering the following type for representing boolean expressions:
data Exp
= ETrue
| EFalse
| EAnd Exp Exp
deriving (Eq)
Is it feasible trying to prove that Exp forms a Category with identity ETrue and operator:
(<&>) = EAnd
without redefining the Eq instance? Using the default instance of Eq the left-identity law breaks, i.e:
ETrue <&> e == e
evaluates to False. However, defining an eval function:
eval ETrue = True
eval EFalse = False
eval (EAnd e1 e2) = eval e1 && eval e2
and the Eq instance as:
instance Eq Exp where
e1 == e2 = eval e1 == eval e2
fixes the problem. Is comparison in terms of (==) a general requirement for claiming to satisfy such laws, or is it sufficient to say that the laws hold for a particular type of equality operator?
Equality is EVIL. You rarely (if ever) need structural equality,
because it is too strong. You only want an equivalence that is strong enough for
what you're doing. This is particularly true for category theory.
In Haskell, deriving Eq
will give you structural equality, which means that you'll
often want to write your own implementation of ==
/ /=
.
A simple example: Define rational number as pairs of integers,
data Rat = Integer :/ Integer
. If you use structural equality (what Haskell is
deriving
), you'll have (1:/2) /= (2:/4)
, but as a fraction 1/2 == 2/4
. What
you really care about is the value that your tuples denote, not their
representation. This means you'll need an equivalence that compares reduced
fractions, so you should implement that instead.
Side note: If someone using the code assumes that you've defined a structural
equality test, i.e. that checking with ==
justifies replacing data sub-components
through pattern matching, their code may break. If that is of importance,
you may hide the constructors to disallow pattern matching, or maybe define your
own class
(say, Equiv
with ===
and =/=
) to separate both concepts. (This
is mostly important for theorem provers like Agda or Coq, in Haskell it's really
hard to get practical/real-world code so wrong that finally something breaks.)
Really Stupid(TM) example: Let's say that person wants to print long lists of huge
Rat
s and believes memoizing the string representations of the Integer
s will save
on binary-to-decimal conversion. There's a lookup table for Rat
s, such that equal
Rat
s will never be converted twice, and there's a lookup table for integers. If
(a:/b) == (c:/d)
, missing integer entries will be filled by copying between a
-c
/
b
-d
to skip conversion (ouch!). For the list [ 1:/1, 2:/2, 2:/4 ]
, 1
gets
converted and then, because 1:/1 == 2:/2
, the string for 1
gets copied into the
2
lookup entry. The final result "1/1, 1/1, 1/4"
is borked.