My question is about Graham Hutton's book Programming in Haskell 1st Ed.
There is a parser created in section 8.4, and I am assuming anyone answering has the book or can see the link to slide 8 in the link above.
A basic parser called item
is described as:
type Parser a = String -> [(a, String)]
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)]
which is used with do
to define another parser p
(the do
parser)
p :: Parser (Char, Char)
p = do x <- item
item
y <- item
return (x,y)
the relevant bind definition is:
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> case parse p inp of
[] -> []
[(v,out)] -> parse (f v) out
return
is defined as:
return :: a -> Parser a
return v = \inp -> [(v,inp)]
parse
is defined as:
parse :: Parser a -> String -> [(a,String)]
parse p inp = p inp
The program (the do
parser) takes a string and selects the 1st and 3rd characters and returns them in a tuple with the remainder of the string in a list, e.g., "abcdef"
produces [('a','c'), "def"]
.
I want to know how the
(f v) out
in
[(v,out)] -> parse (f v) out
returns a parser which is then applied to out
.
f
in thedo
parser isitem
anditem
taking a character'c'
returns[('c',[])]
?How can that be a parser and how can it take
out
as an argument?
Perhaps I am just not understanding what (f v)
does.
Also how does the
do
parser 'drop' the returned values each time to operate on the rest of the input string whenitem
is called again?What is the object that works its way through the
do
parser, and how is it altered at each step, and by what means is it altered?
The relevant advice is, Don't panic (meaning, don't rush it; or, take it slow), and, Follow the types.
First of all,
Parser
sare functions from
String
to lists of pairings of result values of typea
and the leftover Strings. That leftovers string will be used as input for next parsing step. That's the main thing about it here.You are asking, in
The answer is,
f
's type says that it does so:We have
f :: a -> Parser b
, so that's just what it does: applied to a value of typea
it returns a value of typeParser b
. Or equivalently,So whatever is the value that
parse p inp
produces, it must be whatf
is waiting for to proceed. The types must "fit":or, equivalently,
So this also answers your second question,
The real question is, what kind of
f
is it, that does such a thing? Where does it come from? And that's your fourth question.And the answer is, your example in
do
-notation,by Monad laws is equivalent to the nested chain
which is a syntactic sugar for the nested chain of
Parser
bind applications,and it is because the functions are nested that the final
return
has access to bothy
andx
there; and it is precisely theParser
bind that arranges for the output leftovers string to be used as input to the next parsing step:i.e. (under the assumption that
inp
is a string of length two or longer),after few simplifications.
And this answers your third question.
I am having similar problems reading the syntax, because it's not what we are used to.
so for the question: I want to know how the (f v) out in [(v,out)] -> parse (f v) out returns a parser which is then applied to out.
It does because thats the siganture of the 2nd arg (the f): (>>=) :: Parser a -> (a -> Parser b) -> Parser b .... f takes an a and produces a Parser b. a Parser b takes a String which is the out ... (f v) out
But the output of this should not be mixed up with the output of the function we are writing: >>=
A parser is a function that takes 1 arg. This is constructed right after the first = ... ie by returning an (anonymous) function: p >>= f = \inp -> ... so inp refers to the input string of the Parser we are building
so what is left is to define what that constructed function should do ... NOTE: we are not implementing any of the input parsers just chaining them together ... so the ouput Parser function should:
For me the understanding lies in the use of destructuring and the realisation that we are constructing a function that glues together the execution of other functions together simply considering their interface
Hope that helps ... it helped me to write it :-)
f v
produces aParser b
becausef
is a function of typea -> Parser b
andv
is a value of typea
. So then you're callingparse
with thisParser b
and the stringout
as arguments.No, it's not. Let's consider a simplified (albeit now somewhat pointless) version of your parser:
This will desugar to:
So the right operand of
>>=
, i.e.f
, is\x -> return x
, notitem
.When you apply a parser it returns a tuple containing the parsed value and a string representing the rest of the input. If you look at
item
for example, the second element of the tuple will bexs
which is the tail of the input string (i.e. a string containing all characters of the input string except the first). This second part of the tuple will be what's fed as the new input to subsequent parsers (as per[(v,out)] -> parse (f v) out
), so that way each successive parser will take as input the string that the previous parser produced as the second part of its output tuple (which will be a suffix of its input).In response to your comments:
No, it's equivalent to the entire
do
-block (i.e.do {x <- item; return x}
). You can't translatedo
-blocks line-by-line like that.do { x <- foo; rest }
is equivalent tofoo >>= \x -> do {rest}
, so you'll always have the rest of thedo
-block as part of the right operand of>>=
.Let's walk through an example where we invoke
item
twice (this is like yourp
, but without the middle item). In the below I'll use===
to denote that the expressions above and below the===
are equivalent.Now let's apply this to the input "ab":
Now we can expand the second
>>=
the same we did the fist and eventually end up with('a', 'b')
.