Given are the two datatypes Color and Plant.
data Color = Red | Pink | White | Blue | Purple | Green | Yellow
deriving (Show, Eq)
data Plant =
Leaf
| Blossom Color
| Stalk Plant Plant
deriving (Show, Eq)
Now I'm supposed to implement a function fold_plant
of following type:
(x -> x -> x) -> (Color -> x) -> x -> Plant -> x
The way I understand a fold function is that it takes a list and for each iteration it removes the first element from the list and does something with that element.
Apparently fold_plant Stalk Blossom Leaf
is the identity on plants.
Now I know that in Haskell you make functions like this:
fold_plant :: (x -> x -> x) -> (Color -> x) -> x -> Plant -> x
fold_plant = do something
But from here on I don't know how a fold function would work an a plant.
If we look at the function signature:
fold_plant :: (x -> x -> x) -> (Color -> x) -> x -> Plant -> x
-- \_____ _____/ \____ _____/ | | |
-- v v v v v
-- stalk blossom leaf tree output
We see that there is stalk
part as well as a blossom
part and a leaf
part. We will name the stalk
function s
and the blossom
function b
here and the leaf
part l
. In order to simplify (and optimize) the function, we will unpack these three parameters, and then call a recursive method:
fold_plant s b l = fold_plant'
where fold_plant' = ...
Now the question is of course what to do with fold_plant'
. Given we see a Leaf
, we do not need to perform any operation on the values, we simply return our leaf result l
:
fold_plant' Leaf = l
In case we find a (Blossom c)
with c
a color, we have to perform a mapping from the c :: Color
to an x
with the b
part to obtain the new value:
fold_plant' (Blossom c) = b c
Finally in case we have a Stalk
we will have to perform recursion: we first call fold_plant'
on the left stalk, and then we call the fold_plant'
and construct an s
with the two results:
fold_plant' (Stalk lef rig) = s (fold_plant' lef) (fold_plant' rig)
So we can put it all together into the following function:
fold_plant s b l = fold_plant'
where fold_plant' Leaf = l
fold_plant' (Blossom c) = b c
fold_plant' (Stalk lef rig) = s (fold_plant' lef) (fold_plant' rig)
A fold is a function that takes a piece of data in a structure, and collapses it to another piece of data. Usually we do this to "reduce" a collection to a single value. That's why if you look at other languages like Lisp, Smalltalk, Ruby, JavaScript, etc you'll find this operation called reduce
, which is the poor cousin of folding in Haskell.
I say it's the poor cousin because your intuition on lists is correct, but in Haskell we're much more abstract and general, so our fold functions can work on any kind of structure whose type we've told haskell what fold means for.
So, we can talk about "using addition with fold to turn a list of numbers into a sum value", or we can talk about "using a function to take a family tree of names and collapse it to a list", and so on and so forth. Any time we have this idea of changing the structure of something to a single value, or maybe to a different structured set of values, that's folding.
The "canonical" way to represent this in Haskell is foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
but it's easier like you've thought to use "a list of a
" as the Foldable f => t a
type at the beginning coz it's a bit easier to understand. So we have a specialised type of foldr :: (a -> b -> b) -> b -> [a] -> b
. But what are a
and b
? and what is (a -> b -> b)
and what do these three arguments do?
Let's specialise it to Int
values for both a
and b
: foldr :: (Int -> Int -> Int) -> Int -> [Int] -> Int
wow... that makes it ... interesting doesn't it? so foldr
takes a function of two ints, like, say, the (+)
function, and it takes a single Int
value (this is the initial value it will use as the target result, and a list of Int
values... then it'll produce a single Int
from that... that is, it takes that Int -> Int -> Int
function and applies it to the single Int
and the first of the [Int]
, then applies that function to that result and the next value of the [Int]
and so on and so forth until there are no more [Int]
left... then that's what it returns.
It's quite literally folding the function over the data structure.
That's all well and good for lists, which are a single straight line, but what you have is a tree, not a list. So how does it work there? Well, let's see how we might specialise foldr
to produce a pair of the highest and lowest numbers from a list of Int
instead? foldr :: (Int -> (Int, Int) -> (Int, Int)) -> (Int, Int) -> [Int] -> (Int, Int)
. So we take a function that takes an Int
and a pair, and we put the initial pair into it, along with the first Int
from our [Int]
. That returns us a new pair, and we then do the same process for the next of the [Int]
and then we continue that process untill all we're left with is a pair at the end.
foldToMinMax = foldr (\newNum (minnum,maxnum) -> (min minnum newNum, max maxnum newNum)) (maxBound :: Int, minBound :: Int)
So now things are getting a little bit clearer.
What about this tree of flowers you have, though? Well, you need to write yourself a folding function that will take two values, one of which will be the same value as the initial value and the result value, and the other of which will be the type of your tree, and build a value of the result type. If I were to use pseudo code to write the types in a more descriptive way, I'd probably write something like: foldr :: (contentOfCollectionType -> resultType -> resultType) -> resultType -> (collectionWrapper contentOfCollectionType) -> resultType
But you don't have to use foldr
here, in fact you can't use it unless you do some fancy typeclass instantiation stuff anyway. You can totally write your own folding function using plain recursion. That's what they're after.
If you want to learn about recursion and folding and whatnot, and you don't understand these things yet, I recommend the book I helped author. http://happylearnhaskelltutorial.com It explains it in much more detail and with many clear examples. If you understand the basics, it should be pretty quick to get up to speed on the point where you want to know about recursion and folding... but if you don't, well it's going to be very useful for you to understand the basics because you need to know them before getting to the other stuff.
I should mention that your particular fold also has a conversion function in it. It's the thing that converts a Color
to an x
. The function that you were given to work with as a folding function "crushes x's together" (ie takes two x
values and produces another x
value, very similar to (+)
in our examples above). It can only work on trees because we also give it this function to turn a Color
into an x
, which effectively takes the meaningful data out of the tree and puts it into a form that the folding function can use.
There's a very beautiful pattern at work here.
Good luck!
Folding is the essence of recursive problem solving:
data Plant = data Result r =
Leaf RLeaf
| Blossom Color | RBlossom Color
| Stalk Plant Plant | RStalk r r
-- recursive data -- non-recursive data: `r`, not `Result r`!
Recursive problem solving is about combining in a simple manner the results of recursively processing the constituents of the original structure:
-- single-step reduction semantics:
-- reduction_step :: ..... -> Result r -> r
reduction_step :: (r -> r -> r) -> (Color -> r) -> r -> Result r -> r
reduction_step s b l RLeaf = l
reduction_step s b l (RBlosom c) = b c
reduction_step s b l (RStalk x y) = s x y
But to get to this point we need to recurse into the constituent parts of our original structure which are of the same nature as the whole structure, and thus the fold_plant
procedure which we seek to create can be applied to them as if already written (recursion!):
recurse_into :: (r -> r -> r) -> (Color -> r) -> r -> Plant -> Result r
recurse_into s b l Leaf = RLeaf
recurse_into s b l (Blossom c) = RBlossom c
recurse_into s b l (Stalk lt rt) = RStalk (fold_plant s b l lt) (fold_plant s b l rt)
so, finally, our fold is simply a composition of the two,
fold_plant :: (r -> r -> r) -> (Color -> r) -> r -> Plant -> r
fold_plant s b l plant = reduction_step s b l -- Result r -> r
(recurse_into s b l plant) -- Plant -> Result r
Do follow the types and convince yourself that everything fits together as it should.
The interim data definition can of course be eschewed and the function definition collapsed, but this is the general procedure to follow.
(cf. recursion schemes).