I'm looking for some really simple, easy-to-grasp explanations of recursion schemes and corecursion schemes (catamorphisms, anamorphisms, hylomorphisms etc.) which do not require following lots of links, or opening a category theory textbook. I'm sure I've reinvented many of these schemes unconsciously and "applied" them in my head during the process of coding (I'm sure many of us have), but I have no clue what the (co)recursion schemes I use are called. (OK, I lied. I've just been reading about a few of them, which prompted this question. But before today, I had no clue.)
I think diffusion of these concepts within the programming community has been hindered by the forbidding explanations and examples one tends to come across - for example on Wikipedia, but also elsewhere.
It's also probably been hindered by their names. I think there are some alternative, less mathematical names (something about bananas and barbed wire?) but I have no clue what the cutsier names are for recursion schemes that I use, either.
I think it would help to use examples with datatypes representing simple real-world problems, rather than abstract data types such as binary trees.
A few references, from the most category-theoretic (but relevant to give a "territory map" that will let you avoid "clicking lots of links") to the simpler & more self-contained:
As far as the "bananas & barbed wire" vocabulary goes, this comes from the original paper of Meijer, Fokkinga & Patterson (and its sequel by other authors), and it is in sum just as notation-heavy as the less cute alternatives : the "names" (bananas, etc) are just a shortcut to the graphical appearance of the ascii notation of the constructions they are pegged to. For example, catamorphisms (i.e. folds) are represented with
(| _ |)
, and the par-with-parenthesis looks like a "banana", hence the name. This is the paper who is most often called "impenetrable", hence not the first thing I'd look up if I were you.The basic reference for those recursion schemes (or more precisely, for a relational approach to those recursion schemes) is Bird & de Moor's Algebra of Programming (the book is unavailable except as a print-on demand, but there are copies available second-hand & it should be in libraries). It contains a more paced & detailed explanation of point-free programming, if still "academic" : the book introduces some category-theoretic vocabulary, though in a self-contained manner. Yet, the exercises (that you wouldn't find in a paper) help.
Sorting morphisms by Lex Augustjein, uses sorting algorithms on various data structures to explain recursion schemes. It is pretty much "recursion schemes for dummies" by construction:
Another approach to making a symbols-free presentation is Jeremy Gibbons' chapter Origami Programming in The Fun of Programming, with some overlap with the previous one. Its bibliography gives a tour of the introductions to the topic.
Edit : Jeremy Gibbons just let me know he has added a link to the bibliography of the whole book on the book's webpage after reading this question. Enjoy !
I'm afraid these last two references only give a solid explanation of (cata|ana|hylo|para)morphisms, but my hope is that this would be enough to tear through the algebraic formalism you can find in more notation-heavy publications. I don't know of any strictly non-category-theoretic explanation of (co-)recursion schemes other than those four.
Extremely loosely speaking, a catamorphism is just a slight generalization of
fold
, and an anamorphism is a slight generalization ofunfold
. (And a hylomorphism is just an unfold followed by a fold.). They're presented in a more rigorous form usually, to make the connection to category theory clearer. The denser form lets us distinguish data (the necessarily finite product of an initial algebra) and codata (the possibly infinite product of a final coalgebra). This distinction lets us guarantee that a fold is never called on an infinite list. The other reason for the funny way that catamorphisms and anamorphisms are generally written is that by operating over F-algebras and F-coalgebras (generated from functors) we can write them once and for all, rather than once over a list, once over a binary tree, etc. This in turn helps make clear exactly why they're all the same thing.But from a pure intuition standpoint, you can think of cata and ana as reducing and producing, and that's about it.
Edit: a bit more
A metamorphism (Gibbons) is like an inside-out hylo -- its a fold followed by an unfold. So you can use it to tear down a stream and build up a new one with a potentially different structure.
Ekmett posted a nice "field guide" to the various schemes in the literature: http://comonad.com/reader/2009/recursion-schemes/
However, while the "intuitive" explanations are straightforward, the linked code is less so, and the blog posts on some of these might be a tad on the complex/forbidding side.
That said, except perhaps for histomorphisms I don't think the rest of the zoo is necessarily something you'd want to think with directly most of the time. If you "get" hylo and meta, you can express nearly anything in terms of them alone. Typically the other morphisms are more restrictive, not less (but therefore give you more properties "for free").
Tim Williams gave a brilliant talk at the London Haskell User Group last night about recursion schemes with a motivating example of each of the ones you mention. Check out the slides:
http://www.timphilipwilliams.com/slides.html
There are references to all the usual suspects (lenses, bananas, barbed wire ala carte etc) at the end of the slides and you could also google "Origami Programming" which is a nice intro that I hadn't come across before.
and the video will be here when it's uploaded:
http://www.youtube.com/user/LondonHaskell
edit Most of the links in question are in huitseeker's answer above.