Why don't Haskell list comprehensions cause an

2019-02-05 20:08发布

问题:

I'm trying to understand how Haskell list comprehensions work "under the hood" in regards to pattern matching. The following ghci output illustrates my point:

Prelude> let myList = [Just 1, Just 2, Nothing, Just 3]
Prelude> let xs = [x | Just x <- myList]
Prelude> xs
[1,2,3]
Prelude>

As you can see, it is able to skip the "Nothing" and select only the "Just" values. I understand that List is a monad, defined as (source from Real World Haskell, ch. 14):

instance Monad [] where
    return x = [x]
    xs >>= f = concat (map f xs)
    xs >> f = concat (map (\_ -> f) xs)
    fail _ = []

Therefore, a list comprehension basically builds a singleton list for every element selected in the list comprehension and concatenates them. If a pattern match fails at some step, the result of the "fail" function is used instead. In other words, the "Just x" pattern doesn't match so [] is used as a placeholder until 'concat' is called. That explains why the "Nothing" appears to be skipped.

What I don't understand is, how does Haskell know to call the "fail" function? Is it "compiler magic", or functionality that you can write yourself in Haskell? Is it possible to write the following "select" function to work the same way as a list comprehension?

select :: (a -> b) -> [a] -> [b]
select (Just x -> x) myList       -- how to prevent the lambda from raising an error?
[1,2,3]

回答1:

While implemenatations of Haskell might not do it directly like this internally, it is helpful to think about it this way :)

[x | Just x <- myList]

... becomes:

do
    Just x <- myList
    return x

... which is:

myList >>= \(Just x) -> return x

As to your question:

What I don't understand is, how does Haskell know to call the "fail" function?

In do-notation, if a pattern binding fails (i.e. the Just x), then the fail method is called. For the above example, it would look something like this:

myList >>= \temp -> case temp of
    (Just x) -> return x
    _        -> fail "..."

So, every time you have a pattern-match in a monadic context that may fail, Haskell inserts a call to fail. Try it out with IO:

main = do
    (1,x) <- return (0,2)
    print x -- x would be 2, but the pattern match fails


回答2:

The rule for desugaring a list comprehension requires an expression of the form [ e | p <- l ] (where e is an expression, p a pattern, and l a list expression) behave like

let ok p = [e]
    ok _ = []
in concatMap ok l

Previous versions of Haskell had monad comprehensions, which were removed from the language because they were hard to read and redundant with the do-notation. (List comprehensions are redundant, too, but they aren't so hard to read.) I think desugaring [ e | p <- l ] as a monad (or, to be precise, as a monad with zero) would yield something like

let ok p = return e
    ok _ = mzero
in l >>= ok

where mzero is from the MonadPlus class. This is very close to

do { p <- l; return e }

which desugars to

let ok p = return e
    ok _ = fail "..."
in l >>= ok

When we take the List Monad, we have

return e = [e]
mzero = fail _ = []
(>>=) = flip concatMap

I.e., the 3 approaches (list comprehensions, monad comprehensions, do expressions) are equivalent for lists.



回答3:

I don't think the list comprehension syntax has much to do with the fact that List ([]), or Maybe for that matter, happens to be an instance of the Monad type class.

List comprehensions are indeed compiler magic or syntax sugar, but that's possible because the compiler knows the structure of the [] data type.

Here's what the list comprehension is compiled to: (Well, I think, I didn't actually check it against the GHC)

xs = let f = \xs -> case xs of
                      Just x -> [x]
                      _      -> []
     in concatMap f myList

As you can see, the compiler doesn't have to call the fail function, it can simply inline a empty list, because it knows what a list is.


Interestingly, this fact that the list comprehensions syntax 'skips' pattern match failures is used in some libraries to do generic programming. See the example in the Uniplate library.


Edit: Oh, and to answer your question, you can't call your select function with the lambda you gave it. It will indeed fail on a pattern match failure if you call it with an Nothing value.

You could pass it the f function from the code above, but than select would have the type:

select :: (a -> [b]) -> [a] -> [b]

which is perfectly fine, you can use the concatMap function internally :-)

Also, that new select now has the type of the monadic bind operator for lists (with its arguments flipped):

(>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = concatMap f xs -- 'or as you said: concat (map f xs)