I'm trying to understand how Haskell list comprehensions work "under the hood" in regards to pattern matching. The following ghci output illustrates my point:
Prelude> let myList = [Just 1, Just 2, Nothing, Just 3]
Prelude> let xs = [x | Just x <- myList]
Prelude> xs
[1,2,3]
Prelude>
As you can see, it is able to skip the "Nothing" and select only the "Just" values. I understand that List is a monad, defined as (source from Real World Haskell, ch. 14):
instance Monad [] where
return x = [x]
xs >>= f = concat (map f xs)
xs >> f = concat (map (\_ -> f) xs)
fail _ = []
Therefore, a list comprehension basically builds a singleton list for every element selected in the list comprehension and concatenates them. If a pattern match fails at some step, the result of the "fail" function is used instead. In other words, the "Just x" pattern doesn't match so [] is used as a placeholder until 'concat' is called. That explains why the "Nothing" appears to be skipped.
What I don't understand is, how does Haskell know to call the "fail" function? Is it "compiler magic", or functionality that you can write yourself in Haskell? Is it possible to write the following "select" function to work the same way as a list comprehension?
select :: (a -> b) -> [a] -> [b]
select (Just x -> x) myList -- how to prevent the lambda from raising an error?
[1,2,3]
The rule for desugaring a list comprehension requires an expression of the form
[ e | p <- l ]
(wheree
is an expression,p
a pattern, andl
a list expression) behave likePrevious versions of Haskell had monad comprehensions, which were removed from the language because they were hard to read and redundant with the
do
-notation. (List comprehensions are redundant, too, but they aren't so hard to read.) I think desugaring[ e | p <- l ]
as a monad (or, to be precise, as a monad with zero) would yield something likewhere
mzero
is from theMonadPlus
class. This is very close towhich desugars to
When we take the List Monad, we have
I.e., the 3 approaches (list comprehensions, monad comprehensions,
do
expressions) are equivalent for lists.I don't think the list comprehension syntax has much to do with the fact that List (
[]
), orMaybe
for that matter, happens to be an instance of theMonad
type class.List comprehensions are indeed compiler magic or syntax sugar, but that's possible because the compiler knows the structure of the
[]
data type.Here's what the list comprehension is compiled to: (Well, I think, I didn't actually check it against the GHC)
As you can see, the compiler doesn't have to call the
fail
function, it can simply inline a empty list, because it knows what a list is.Interestingly, this fact that the list comprehensions syntax 'skips' pattern match failures is used in some libraries to do generic programming. See the example in the Uniplate library.
Edit: Oh, and to answer your question, you can't call your
select
function with the lambda you gave it. It will indeed fail on a pattern match failure if you call it with anNothing
value.You could pass it the
f
function from the code above, but thanselect
would have the type:which is perfectly fine, you can use the
concatMap
function internally :-)Also, that new
select
now has the type of the monadic bind operator for lists (with its arguments flipped):While implemenatations of Haskell might not do it directly like this internally, it is helpful to think about it this way :)
... becomes:
... which is:
As to your question:
In do-notation, if a pattern binding fails (i.e. the
Just x
), then the fail method is called. For the above example, it would look something like this:So, every time you have a pattern-match in a monadic context that may fail, Haskell inserts a call to
fail
. Try it out with IO: