How would I implement this fold function?

2019-06-28 04:15发布

问题:

Given are the two datatypes Color and Plant.

data Color = Red | Pink | White | Blue | Purple | Green | Yellow
   deriving (Show, Eq)

data Plant =
     Leaf
   | Blossom Color
   | Stalk Plant Plant
   deriving (Show, Eq)

Now I'm supposed to implement a function fold_plant of following type:

(x -> x -> x) -> (Color -> x) -> x -> Plant -> x

The way I understand a fold function is that it takes a list and for each iteration it removes the first element from the list and does something with that element.

Apparently fold_plant Stalk Blossom Leaf is the identity on plants.

Now I know that in Haskell you make functions like this:

fold_plant :: (x -> x -> x) -> (Color -> x) -> x -> Plant -> x
fold_plant = do something

But from here on I don't know how a fold function would work an a plant.

回答1:

If we look at the function signature:

fold_plant :: (x -> x -> x) -> (Color -> x) -> x -> Plant -> x
--            \_____ _____/    \____ _____/    |      |      |
--                  v               v          v      v      v
--                stalk           blossom     leaf  tree  output

We see that there is stalk part as well as a blossom part and a leaf part. We will name the stalk function s and the blossom function b here and the leaf part l. In order to simplify (and optimize) the function, we will unpack these three parameters, and then call a recursive method:

fold_plant s b l = fold_plant'
    where fold_plant' = ...

Now the question is of course what to do with fold_plant'. Given we see a Leaf, we do not need to perform any operation on the values, we simply return our leaf result l:

fold_plant' Leaf = l

In case we find a (Blossom c) with c a color, we have to perform a mapping from the c :: Color to an x with the b part to obtain the new value:

fold_plant' (Blossom c) = b c

Finally in case we have a Stalk we will have to perform recursion: we first call fold_plant' on the left stalk, and then we call the fold_plant' and construct an s with the two results:

fold_plant' (Stalk lef rig) = s (fold_plant' lef) (fold_plant' rig)

So we can put it all together into the following function:

fold_plant s b l = fold_plant'
    where fold_plant' Leaf = l
          fold_plant' (Blossom c) = b c
          fold_plant' (Stalk lef rig) = s (fold_plant' lef) (fold_plant' rig)


回答2:

A fold is a function that takes a piece of data in a structure, and collapses it to another piece of data. Usually we do this to "reduce" a collection to a single value. That's why if you look at other languages like Lisp, Smalltalk, Ruby, JavaScript, etc you'll find this operation called reduce, which is the poor cousin of folding in Haskell.

I say it's the poor cousin because your intuition on lists is correct, but in Haskell we're much more abstract and general, so our fold functions can work on any kind of structure whose type we've told haskell what fold means for.

So, we can talk about "using addition with fold to turn a list of numbers into a sum value", or we can talk about "using a function to take a family tree of names and collapse it to a list", and so on and so forth. Any time we have this idea of changing the structure of something to a single value, or maybe to a different structured set of values, that's folding.

The "canonical" way to represent this in Haskell is foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b but it's easier like you've thought to use "a list of a" as the Foldable f => t a type at the beginning coz it's a bit easier to understand. So we have a specialised type of foldr :: (a -> b -> b) -> b -> [a] -> b. But what are a and b ? and what is (a -> b -> b) and what do these three arguments do?

Let's specialise it to Int values for both a and b: foldr :: (Int -> Int -> Int) -> Int -> [Int] -> Int wow... that makes it ... interesting doesn't it? so foldr takes a function of two ints, like, say, the (+) function, and it takes a single Int value (this is the initial value it will use as the target result, and a list of Int values... then it'll produce a single Int from that... that is, it takes that Int -> Int -> Int function and applies it to the single Int and the first of the [Int], then applies that function to that result and the next value of the [Int] and so on and so forth until there are no more [Int] left... then that's what it returns.

It's quite literally folding the function over the data structure.

That's all well and good for lists, which are a single straight line, but what you have is a tree, not a list. So how does it work there? Well, let's see how we might specialise foldr to produce a pair of the highest and lowest numbers from a list of Int instead? foldr :: (Int -> (Int, Int) -> (Int, Int)) -> (Int, Int) -> [Int] -> (Int, Int). So we take a function that takes an Int and a pair, and we put the initial pair into it, along with the first Int from our [Int]. That returns us a new pair, and we then do the same process for the next of the [Int] and then we continue that process untill all we're left with is a pair at the end.

foldToMinMax = foldr (\newNum (minnum,maxnum) -> (min minnum newNum, max maxnum newNum)) (maxBound :: Int, minBound :: Int)

So now things are getting a little bit clearer.

What about this tree of flowers you have, though? Well, you need to write yourself a folding function that will take two values, one of which will be the same value as the initial value and the result value, and the other of which will be the type of your tree, and build a value of the result type. If I were to use pseudo code to write the types in a more descriptive way, I'd probably write something like: foldr :: (contentOfCollectionType -> resultType -> resultType) -> resultType -> (collectionWrapper contentOfCollectionType) -> resultType

But you don't have to use foldr here, in fact you can't use it unless you do some fancy typeclass instantiation stuff anyway. You can totally write your own folding function using plain recursion. That's what they're after.

If you want to learn about recursion and folding and whatnot, and you don't understand these things yet, I recommend the book I helped author. http://happylearnhaskelltutorial.com It explains it in much more detail and with many clear examples. If you understand the basics, it should be pretty quick to get up to speed on the point where you want to know about recursion and folding... but if you don't, well it's going to be very useful for you to understand the basics because you need to know them before getting to the other stuff.

I should mention that your particular fold also has a conversion function in it. It's the thing that converts a Color to an x. The function that you were given to work with as a folding function "crushes x's together" (ie takes two x values and produces another x value, very similar to (+) in our examples above). It can only work on trees because we also give it this function to turn a Color into an x, which effectively takes the meaningful data out of the tree and puts it into a form that the folding function can use.

There's a very beautiful pattern at work here.

Good luck!



回答3:

Folding is the essence of recursive problem solving:

    data Plant =                        data Result r =    
         Leaf                                RLeaf 
       | Blossom Color                     | RBlossom Color
       | Stalk Plant Plant                 | RStalk r r
            -- recursive data           -- non-recursive data: `r`, not `Result r`!

Recursive problem solving is about combining in a simple manner the results of recursively processing the constituents of the original structure:

    -- single-step reduction semantics:
    -- reduction_step :: ..... -> Result r -> r
    reduction_step :: (r -> r -> r) -> (Color -> r) -> r -> Result r -> r
    reduction_step s b l  RLeaf       = l
    reduction_step s b l (RBlosom c)  = b c
    reduction_step s b l (RStalk x y) = s x y

But to get to this point we need to recurse into the constituent parts of our original structure which are of the same nature as the whole structure, and thus the fold_plant procedure which we seek to create can be applied to them as if already written (recursion!):

    recurse_into :: (r -> r -> r) -> (Color -> r) -> r -> Plant -> Result r
    recurse_into s b l Leaf          = RLeaf
    recurse_into s b l (Blossom c)   = RBlossom c
    recurse_into s b l (Stalk lt rt) = RStalk (fold_plant s b l lt) (fold_plant s b l rt)

so, finally, our fold is simply a composition of the two,

    fold_plant :: (r -> r -> r) -> (Color -> r) -> r -> Plant -> r
    fold_plant s b l plant = reduction_step s b l          --          Result r -> r
                               (recurse_into s b l plant)  -- Plant -> Result r

Do follow the types and convince yourself that everything fits together as it should.

The interim data definition can of course be eschewed and the function definition collapsed, but this is the general procedure to follow.

(cf. recursion schemes).