Recently I've finally started to feel like I understand catamorphisms. I wrote some about them in a recent answer, but briefly I would say a catamorphism for a type abstracts over the process of recursively traversing a value of that type, with the pattern matches on that type reified into one function for each constructor the type has. While I would welcome any corrections on this point or on the longer version in the answer of mine linked above, I think I have this more or less down and that is not the subject of this question, just some background.
Once I realized that the functions you pass to a catamorphism correspond exactly to the type's constructors, and the arguments of those functions likewise correspond to the types of those constructors' fields, it all suddenly feels quite mechanical and I don't see where there is any wiggle room for alternate implementations.
For example, I just made up this silly type, with no real concept of what its structure "means", and derived a catamorphism for it. I don't see any other way I could define a general-purpose fold over this type:
data X a b f = A Int b
| B
| C (f a) (X a b f)
| D a
xCata :: (Int -> b -> r)
-> r
-> (f a -> r -> r)
-> (a -> r)
-> X a b f
-> r
xCata a b c d v = case v of
A i x -> a i x
B -> b
C f x -> c f (xCata a b c d x)
D x -> d x
My question is, does every type have a unique catamorphism (up to argument reordering)? Or are there counterexamples: types for which no catamorphism can be defined, or types for which two distinct but equally reasonable catamorphisms exist? If there are no counterexamples (i.e., the catamorphism for a type is unique and trivially derivable), is it possible to get GHC to derive some sort of typeclass for me that does this drudgework automatically?
A catamorphism (if it exists) is unique by definition. In category theory a catamorphism denotes the unique homomorphism from an initial algebra into some other algebra. To the best of my knowledge in Haskell all catamorphisms exists because Haskell's types form a Cartesian Closed Category where terminal objects, all products and all exponentials exist. See also Bartosz Milewski's blog post about F-algebras, which gives a good introduction to the topic.
The catamorphism associated to a recursive type can be derived mechanically.
Suppose you have a recursively defined type, having multiple constructors, each one with its own arity. I'll borrow OP's example.
Then, we can rewrite the same type by forcing each arity to be one, uncurrying everything. Arity zero (
B
) becomes one if we add a unit type()
.Then, we can reduce the number of constructors to one, exploiting
Either
instead of multiple constructors. Below, we just write infix+
instead ofEither
for brevity.At the term-level, we know we can rewrite any recursive definition as the form
x = f x where f w = ...
, writing an explicit fixed point equationx = f x
. At the type-level, we can use the same method to refector recursive types.Now, we note that we can autoderive a functor instance.
This is possible because in the original type each recursive reference only occurred in positive position. If this does not hold, making
F a b f
not a functor, then we can't have a catamorphism.Finally, we can write the type of
cata
as follows:Is this the OP's
xCata
type? It is. We only have to apply a few type isomorphisms. We use the following algebraic laws:By the way, it's easy to remember these isomorphisms if we write
(a,b)
as a producta*b
, unit()
as1
, anda->b
as a powerb^a
. Indeed they become1) c^(a*b) = (c^a)^b , 2) c^(a+b) = c^a*c^b, 3) c^1 = c
.Anyway, let's start to rewrite the
F a b f w -> w
part, onlyLet's consider the full type now:
Which is indeed (renaming
w=r
) the wanted typeThe "standard" implementation of
cata
isIt takes some effort to understand due to its generality, but this is indeed the intended one.
About automation: yes, this can be automatized, at least in part. There is the package
recursion-schemes
on hackage which allows one to write something likeExample: