Recently I've finally started to feel like I understand catamorphisms. I wrote some about them in a recent answer, but briefly I would say a catamorphism for a type abstracts over the process of recursively traversing a value of that type, with the pattern matches on that type reified into one function for each constructor the type has. While I would welcome any corrections on this point or on the longer version in the answer of mine linked above, I think I have this more or less down and that is not the subject of this question, just some background.
Once I realized that the functions you pass to a catamorphism correspond exactly to the type's constructors, and the arguments of those functions likewise correspond to the types of those constructors' fields, it all suddenly feels quite mechanical and I don't see where there is any wiggle room for alternate implementations.
For example, I just made up this silly type, with no real concept of what its structure "means", and derived a catamorphism for it. I don't see any other way I could define a general-purpose fold over this type:
data X a b f = A Int b
| B
| C (f a) (X a b f)
| D a
xCata :: (Int -> b -> r)
-> r
-> (f a -> r -> r)
-> (a -> r)
-> X a b f
-> r
xCata a b c d v = case v of
A i x -> a i x
B -> b
C f x -> c f (xCata a b c d x)
D x -> d x
My question is, does every type have a unique catamorphism (up to argument reordering)? Or are there counterexamples: types for which no catamorphism can be defined, or types for which two distinct but equally reasonable catamorphisms exist? If there are no counterexamples (i.e., the catamorphism for a type is unique and trivially derivable), is it possible to get GHC to derive some sort of typeclass for me that does this drudgework automatically?
The catamorphism associated to a recursive type can be derived mechanically.
Suppose you have a recursively defined type, having multiple constructors, each one with its own arity. I'll borrow OP's example.
data X a b f = A Int b
| B
| C (f a) (X a b f)
| D a
Then, we can rewrite the same type by forcing each arity to be one, uncurrying everything. Arity zero (B
) becomes one if we add a unit type ()
.
data X a b f = A (Int, b)
| B ()
| C (f a, X a b f)
| D a
Then, we can reduce the number of constructors to one, exploiting Either
instead of multiple constructors. Below, we just write infix +
instead of Either
for brevity.
data X a b f = X ((Int, b) + () + (f a, X a b f) + a)
At the term-level, we know we can rewrite any recursive definition
as the form x = f x where f w = ...
, writing an explicit fixed point equation x = f x
. At the type-level, we can use the same method
to refector recursive types.
data X a b f = X (F (X a b f)) -- fixed point equation
data F a b f w = F ((Int, b) + () + (f a, w) + a)
Now, we note that we can autoderive a functor instance.
deriving instance Functor (F a b f)
This is possible because in the original type each recursive reference only occurred in positive position. If this does not hold, making F a b f
not a functor, then we can't have a catamorphism.
Finally, we can write the type of cata
as follows:
cata :: (F a b f w -> w) -> X a b f -> w
Is this the OP's xCata
type? It is. We only have to apply a few type isomorphisms. We use the following algebraic laws:
1) (a,b) -> c ~= a -> b -> c (currying)
2) (a+b) -> c ~= (a -> c, b -> c)
3) () -> c ~= c
By the way, it's easy to remember these isomorphisms if we write (a,b)
as a product a*b
, unit ()
as1
, and a->b
as a power b^a
. Indeed they become 1) c^(a*b) = (c^a)^b , 2) c^(a+b) = c^a*c^b, 3) c^1 = c
.
Anyway, let's start to rewrite the F a b f w -> w
part, only
F a b f w -> w
=~ (def F)
((Int, b) + () + (f a, w) + a) -> w
=~ (2)
((Int, b) -> w, () -> w, (f a, w) -> w, a -> w)
=~ (3)
((Int, b) -> w, w, (f a, w) -> w, a -> w)
=~ (1)
(Int -> b -> w, w, f a -> w -> w, a -> w)
Let's consider the full type now:
cata :: (F a b f w -> w) -> X a b f -> w
~= (above)
(Int -> b -> w, w, f a -> w -> w, a -> w) -> X a b f -> w
~= (1)
(Int -> b -> w)
-> w
-> (f a -> w -> w)
-> (a -> w)
-> X a b f
-> w
Which is indeed (renaming w=r
) the wanted type
xCata :: (Int -> b -> r)
-> r
-> (f a -> r -> r)
-> (a -> r)
-> X a b f
-> r
The "standard" implementation of cata
is
cata g = wrap . fmap (cata g) . unwrap
where unwrap (X y) = y
wrap y = X y
It takes some effort to understand due to its generality, but this is indeed the intended one.
About automation: yes, this can be automatized, at least in part.
There is the package recursion-schemes
on hackage which allows
one to write something like
type X a b f = Fix (F a f b)
data F a b f w = ... -- you can use the actual constructors here
deriving Functor
-- use cata here
Example:
import Data.Functor.Foldable hiding (Nil, Cons)
data ListF a k = NilF | ConsF a k deriving Functor
type List a = Fix (ListF a)
-- helper patterns, so that we can avoid to match the Fix
-- newtype constructor explicitly
pattern Nil = Fix NilF
pattern Cons a as = Fix (ConsF a as)
-- normal recursion
sumList1 :: Num a => List a -> a
sumList1 Nil = 0
sumList1 (Cons a as) = a + sumList1 as
-- with cata
sumList2 :: forall a. Num a => List a -> a
sumList2 = cata h
where
h :: ListF a a -> a
h NilF = 0
h (ConsF a s) = a + s
-- with LambdaCase
sumList3 :: Num a => List a -> a
sumList3 = cata $ \case
NilF -> 0
ConsF a s -> a + s
A catamorphism (if it exists) is unique by definition. In category theory a catamorphism denotes the unique homomorphism from an initial algebra into some other algebra. To the best of my knowledge in Haskell all catamorphisms exists because Haskell's types form a Cartesian Closed Category where terminal objects, all products and all exponentials exist. See also Bartosz Milewski's blog post about F-algebras, which gives a good introduction to the topic.