Does each type have a unique catamorphism?

2020-07-06 05:54发布

问题:

Recently I've finally started to feel like I understand catamorphisms. I wrote some about them in a recent answer, but briefly I would say a catamorphism for a type abstracts over the process of recursively traversing a value of that type, with the pattern matches on that type reified into one function for each constructor the type has. While I would welcome any corrections on this point or on the longer version in the answer of mine linked above, I think I have this more or less down and that is not the subject of this question, just some background.

Once I realized that the functions you pass to a catamorphism correspond exactly to the type's constructors, and the arguments of those functions likewise correspond to the types of those constructors' fields, it all suddenly feels quite mechanical and I don't see where there is any wiggle room for alternate implementations.

For example, I just made up this silly type, with no real concept of what its structure "means", and derived a catamorphism for it. I don't see any other way I could define a general-purpose fold over this type:

data X a b f = A Int b
             | B
             | C (f a) (X a b f)
             | D a

xCata :: (Int -> b -> r)
      -> r
      -> (f a -> r -> r)
      -> (a -> r)
      -> X a b f
      -> r
xCata a b c d v = case v of
  A i x -> a i x
  B -> b
  C f x -> c f (xCata a b c d x)
  D x -> d x

My question is, does every type have a unique catamorphism (up to argument reordering)? Or are there counterexamples: types for which no catamorphism can be defined, or types for which two distinct but equally reasonable catamorphisms exist? If there are no counterexamples (i.e., the catamorphism for a type is unique and trivially derivable), is it possible to get GHC to derive some sort of typeclass for me that does this drudgework automatically?

回答1:

The catamorphism associated to a recursive type can be derived mechanically.

Suppose you have a recursively defined type, having multiple constructors, each one with its own arity. I'll borrow OP's example.

data X a b f = A Int b
             | B
             | C (f a) (X a b f)
             | D a

Then, we can rewrite the same type by forcing each arity to be one, uncurrying everything. Arity zero (B) becomes one if we add a unit type ().

data X a b f = A (Int, b)
             | B ()
             | C (f a, X a b f)
             | D a

Then, we can reduce the number of constructors to one, exploiting Either instead of multiple constructors. Below, we just write infix + instead of Either for brevity.

data X a b f = X ((Int, b) + () + (f a, X a b f) + a)

At the term-level, we know we can rewrite any recursive definition as the form x = f x where f w = ..., writing an explicit fixed point equation x = f x. At the type-level, we can use the same method to refector recursive types.

data X a b f   = X (F (X a b f))   -- fixed point equation
data F a b f w = F ((Int, b) + () + (f a, w) + a)

Now, we note that we can autoderive a functor instance.

deriving instance Functor (F a b f)

This is possible because in the original type each recursive reference only occurred in positive position. If this does not hold, making F a b f not a functor, then we can't have a catamorphism.

Finally, we can write the type of cata as follows:

cata :: (F a b f w -> w) -> X a b f -> w

Is this the OP's xCata type? It is. We only have to apply a few type isomorphisms. We use the following algebraic laws:

1) (a,b) -> c ~= a -> b -> c          (currying)
2) (a+b) -> c ~= (a -> c, b -> c)
3) ()    -> c ~= c

By the way, it's easy to remember these isomorphisms if we write (a,b) as a product a*b, unit () as1, and a->b as a power b^a. Indeed they become 1) c^(a*b) = (c^a)^b , 2) c^(a+b) = c^a*c^b, 3) c^1 = c.

Anyway, let's start to rewrite the F a b f w -> w part, only

   F a b f w -> w
=~ (def F)
   ((Int, b) + () + (f a, w) + a) -> w
=~ (2)
   ((Int, b) -> w, () -> w, (f a, w) -> w, a -> w)
=~ (3)
   ((Int, b) -> w, w, (f a, w) -> w, a -> w)
=~ (1)
   (Int -> b -> w, w, f a -> w -> w, a -> w)

Let's consider the full type now:

cata :: (F a b f w -> w) -> X a b f -> w
     ~= (above)
        (Int -> b -> w, w, f a -> w -> w, a -> w) -> X a b f -> w
     ~= (1)
           (Int -> b -> w)
        -> w
        -> (f a -> w -> w)
        -> (a -> w)
        -> X a b f
        -> w

Which is indeed (renaming w=r) the wanted type

xCata :: (Int -> b -> r)
      -> r
      -> (f a -> r -> r)
      -> (a -> r)
      -> X a b f
      -> r

The "standard" implementation of cata is

cata g = wrap . fmap (cata g) . unwrap
   where unwrap (X y) = y
         wrap   y = X y

It takes some effort to understand due to its generality, but this is indeed the intended one.


About automation: yes, this can be automatized, at least in part. There is the package recursion-schemes on hackage which allows one to write something like

type X a b f = Fix (F a f b)
data F a b f w = ...  -- you can use the actual constructors here
       deriving Functor

-- use cata here

Example:

import Data.Functor.Foldable hiding (Nil, Cons)

data ListF a k = NilF | ConsF a k deriving Functor
type List a = Fix (ListF a)

-- helper patterns, so that we can avoid to match the Fix
-- newtype constructor explicitly    
pattern Nil = Fix NilF
pattern Cons a as = Fix (ConsF a as)

-- normal recursion
sumList1 :: Num a => List a -> a
sumList1 Nil         = 0
sumList1 (Cons a as) = a + sumList1 as

-- with cata
sumList2 :: forall a. Num a => List a -> a
sumList2 = cata h
   where
   h :: ListF a a -> a
   h NilF        = 0
   h (ConsF a s) = a + s

-- with LambdaCase
sumList3 :: Num a => List a -> a
sumList3 = cata $ \case
   NilF      -> 0
   ConsF a s -> a + s


回答2:

A catamorphism (if it exists) is unique by definition. In category theory a catamorphism denotes the unique homomorphism from an initial algebra into some other algebra. To the best of my knowledge in Haskell all catamorphisms exists because Haskell's types form a Cartesian Closed Category where terminal objects, all products and all exponentials exist. See also Bartosz Milewski's blog post about F-algebras, which gives a good introduction to the topic.