bind can be composed of fmap and join, so do we ha

2019-06-17 22:53发布

问题:

I don't use Haskell a lot, but I understand the concept of Monads.

I had been confused by Kleisli triple, and the category, however,

fmap and join

Although Haskell defines monads in terms of the return and bind functions, it is also possible to define a monad in terms of return and two other operations, join and fmap. This formulation fits more closely with the original definition of monads in category theory. The fmap operation, with type (t → u) → M t → M u, takes a function between two types and produces a function that does the "same thing" to values in the monad. The join operation, with type M (M t) → M t, "flattens" two layers of monadic information into one.

helps me to the background principle of Monads.

The two formulations are related as follows:

fmap f m = m >>= (return . f)
join n   = n >>= id

fmap :: (a -> b) -> (m a -> m b)
unit :: a -> m a
join :: m (m a) -> m a
>>=  :: m a -> (a -> m b) -> m b

m >>= f  =  join $ fmap f m

My question is: I think that since >>= can be composed of fmap and join, a monadic function a -> m b is not required and normal functions a -> b will satisfy the operation, but so many tutorials around the web still insist to use a monadic functions since that is the Kleisli triple and the monad-laws.

Well, shouldn't we just use non-monadic functions, as long as they are endo-functions, for the simplicity? What do I miss?

Related topics are

Monad join function

Haskell Monad bind operator confusion

Difference in capability between fmap and bind?

回答1:

In a sense, you're right. As every monad m is a functor, we can use fmap f with a function f :: a -> b to turn an m a into an m b, but there's a catch. What's b?

I like to think of such an m as meaning "plan-to-get", where "plans" involve some sort of additional interaction beyond pure computation. If you have a "plan-to-get Int" and you want a "plan-to-get String", you can use fmap with a function in Int -> String, but the type of that function tells you that getting the String from the Int involves no further interaction.

That isn't always so: perhaps the Int is a student registration number and the String is their name, so the plan to convert from one to the other needs an external lookup in some table. Then I don't have a pure function from Int to String, but rather a pure function from Int to "plan-to-get String". If I fmap that across my "plan-to-get Int", that's fine, but I end up with "plan-to-get (plan-to-get String)" and I need to join the outer and inner plans.

The general situation is that we have enough information to compute the plan to get more. That's what a -> m b models. In particular, we have return :: a -> m a, which turns the information we have into the plan that gives us exactly that information by taking no further action, and we have (>=>) :: (a -> m b) -> (b -> m c) -> (a -> m c) which composes two such things. We also have that (>=>) is associative and absorbs return on left and right, much the way ; is associative and absorbs skip in classic imperative programming.

It's more convenient to build larger plans from smaller ones using this compositional approach, keeping the number of "plan-to-get" layers a consistent one. Otherwise, you need to build up an n-layer plan with fmap, then do the right number of joins on the outside (which will be a brittle property of the plan).

Now, as Haskell is a language with a concept of "free variable" and "scope", the a in

(>=>) :: (a -> m b) -> (b -> m c) -> (a -> m c)

representing the "overall input information" can just be taken to come from the scope of things we already have, leaving

(>>=) ::       m b  -> (b -> m c) ->       m c

and we get back "bind", which is the tool that presents the compositional structure in the most programmer-friendly form, resembling a local definition.

To sum up, you can work with a -> b, but often you need b to be "plan-to-get something", and that's the helpful thing to choose if you want to build plans compositionally.



回答2:

I'm having a bit of a hard time understanding what your question actually is, but I'll take a crack at it anyway.

I think that since >>= can be composed of fmap and join, a monadic function a -> m b is not required and normal functions a -> b will satisfy the operation,

I expect you're referring to the "monadic function a -> m b" in the type for >>=, is that correct? Well, let's see what happens when we replace that with a function of type a -> b:

(>>=) :: m a -> (a -> m b) -> m b  -- standard version
(>>=) :: m a -> (a -> b) -> m b    -- your version

But doesn't this look familiar? It's equivalent to fmap :: (a -> b) -> m a -> m b, but with the parameters switched. In fact, the implementation is just x >>= y = fmap y x, no need for join. So there's your answer: if you use a "normal function a -> b" instead of a "monadic function a -> m b", you no longer have a monad. Instead, you have a Functor.


but so many tutorials around the web still insist to use a monadic functions since that is the Kleisli triple and the monad-laws.

I'm not sure which tutorials you're looking at. In my experience, the nature of tutorials is that they insist on whatever they're a tutorial for. It would be weird if a tutorial for Monads started presenting problems, and then suggesting things other than Monads as solutions; at the very least, that would be out of the tutorial's scope, and a waste of time for anyone reading it to learn about Monads.


Well, shouldn't we just use non-monadic functions, as long as they are endo-functions, for the simplicity? What do I miss?

Endofunctions are functions of type a -> a. Given the context of your question, I think you actually mean pure functions of type a -> b ("pure" as opposed to inherently monadic functions such as readIORef that need to be type a -> m b). If my assumption is wrong, let me know, and I'll edit the question.

EDIT:
As suggested in a comment by @duplode, it's likely that you mean endofunctor, which in Haskell is just any type of Functor. In this case, the below still applies.

In situations where Monad isn't necessary, it is often simpler to use Applicative, Functor, or just basic pure functions. In these cases, these things should be (and generally are) used in place of a Monad. For example:

ws <- getLine >>= return . words  -- Monad
ws <- words <$> getLine           -- Functor (much nicer)

To be clear: If it's possible without a monad, and it's simpler and more readable without a monad, then you should do it without a monad! If a monad makes the code more complex or confusing than it needs to be, don't use a monad! Haskell has monads for the sole purpose of making certain complex computations simpler, easier to read, and easier to reason about. If that's not happening, you shouldn't be using a monad.



回答3:

There is no reason.

I think that since >>= can be composed of fmap and join, a monadic function a -> m b is not required

Yes, you're totally right. We don't need to require >>= for a monad, we could also require join instead. The two are totally equivalent. As we can also compose join = (>>= id), we can do either

class Monad m where
    return :: a -> m a
    fmap :: (a -> b) -> m a -> m b
    (=<<) :: (a -> m b) -> m a -> m b
    -- join = (=<<) id

or

class Monad m where
    return :: a -> m a
    fmap :: (a -> b) -> m a -> m b
    join :: m (m a) -> m a
    -- (=<<) = (join .) . fmap

It doesn't matter which one we use. Admittedly, the latter looks simpler because there is only one higher-order function (fmap), in the former the types of fmap and =<< look too similar. join gives a better idea of what distinguishes a monad from a functor.

Versatility

We can derive >>= from fmap and join, but we can derive join from >>= only. In fact, we can even derive fmap from >>= and return. So should we say that >>= is more basic than the other? More powerful? Or maybe just: more convoluted?

We would rather like to write

data Maybe a = Just a | Nothing
implement Functor Maybe where
    fmap = (=<<) . (return .) -- maybe not even write this ourselves
implement Monad Maybe where
    return = Just
    f =<< Just x = f x
    _ =<< Nothing  = Nothing

than

data Maybe a = Just a | Nothing
implement Functor Maybe where
    fmap f (Just x) = Just (f x)
    fmap f Nothing  = Nothing
implement Monad Maybe where
    return x = Just x
    join (Just (Just x)) = Just x
    join (Just Nothing)  = Nothing
    join Nothing         = Nothing

The former solution using >>= is minimal.

Convenience

Well, shouldn't we just use non-monadic functions for the simplicity?

No. The whole idea of defining the monad typeclass is to ease working with monadic functions. On their own, the return/fmap/join functions are pretty useless, what we are interested in are other functions that return the monadic data type: tryParse :: String -> Maybe Int for example.

And the whole idea behind monads is that we can arbitrarily chain and nest them, getting back a plain type in the end. After having parsed a number, we need to validate the optional result (giving us another optional result) - in the monad (fmap validate) before getting it back out. There are usually no operations that yield nested data directly, we only get nested monad types because we do further monadic operations inside a monadic type. And we'd much rather write

tryRead = (=<<) validate . tryParse

than

tryRead = join . fmap validate . tryParse

That's why >>= is more important for using monads in daily life than join. I would also guess that having to implement >>= directly, rather than implement join explicitly and have >>= get derived from it, allows for better (easier) compiler optimisations.