Why is parser combinator “seq” defined with “bind”

I am reading this article about parser combinators and didn't understand the following:

They say that using seq (see below) leads to parsers with nested tuples as results, which are messy to manipulate.

 seq :: Parser a -> Parser b -> Parser (a,b)
 p ‘seq‘ q = \inp -> [((v,w),inp’’) | (v,inp’) <- p inp, (w,inp’’) <- q inp’]

In order to avoid this problem of nested tuples they introduce monadic bind and return for parsers and then define seq as follows:

 p ‘seq‘ q = p ‘bind‘ \x -> q ‘bind‘ \y -> result (x,y)

Unfortunately I don not see what the problem of nested tuples is and why the 2-nd implementation of seq is better then the 1-st one. Could you help me to understand it?

标签： parsing haskell functional-programming

2条回答

甜甜的少女心

2楼-- · 2019-07-07 13:06

The first example extendend to the type ((a,b),(c,d,e)):

seq232 ((p,q),(r,s,t) = \inp ->
     [ (((v,w),(x,y,z)),inp’’''')
     | (v, inp’) <- p inp
     , (w, inp’’) <- q inp’
     , (x, inp''') <- r inp''
     , (y, inp'''') <- s inp'''
     , (z, inp''''') <- t imp''''
     ]

The second example extended to the type ((a,b),(c,d,e)):

seq232 ((p,q),(r,s,t)) =
    p ‘bind‘ \v ->
    q ‘bind‘ \w ->
    r `bind` \x ->
    s `bind` \y ->
    t `bind` \z ->
    result ((v,w),(x,y,z))

While it isn't a a lot better, I think you can see that the second is a bit cleaner.

0人赞添加讨论(0) 举报

Rolldiameter

3楼-- · 2019-07-07 13:22

I dare to say that the applicative interface is actually older than the monadic interface. It is unfortunately only getting more attention in recent years.

The idea is that with the seq combinator (not a fortunate name) the elementary parsers return a value of a "simple" type, which are then combined into a final result which is a more complex type.

The idea of the applicative interface is to have a sequential composition combinator of type

(<*>) :: Parser (b -> a) -> Parser b -> Parser a

We start out by lifting a function (which has a "complicated" type) into a parser which is then used to combine the arguments returns by subsequent parsers into a result. Special versions are <* and *> which discard the result at the side of the missing angle bracket. Together you will find this interface described in the module Control.Applicative.

Why are monadic parsers to be avoided? Since the right hand side parser of a >>= combinator depends on the result of the parser at the left hand side this parser is constructed over and over again during parsing. The message is: only use the monadic interface if it is really needed, i.e. if you want to construct parsers which depend on parsing results of parser which run earlier.

An example of an applicative parser. Suppose we want to recognise nested parentheses and compute the maximum nesting depth. The grammar S -> (S)S | epsilon describes this structure, which is directly reflected in the parser.

pS = (max . (+1)) <$ pSym '(' <*> pS <* pSym ')' <|> pure 0

A typical example of a case where monadic parsers come in handy is in e.g. recognising a^{n}b^{n}c^{n}. We start by recognising a number of a's and then want to see that many b'c and c's

pAs       = length <$> many(pSym 'a')
pSyms c 0 = pure 0
pSyms c n = (+1) <$ pSym c <*> pSyms c (n-1)
pABC = do count <- pAs
          pSyms 'b' count
          pSyms 'c' count

Essentially is that in the next statements in the do-construct the result of earlier statements is being used, and not only at the end to combine them without performing any parsing in the combination step.

Why is parser combinator “seq” defined with “bind”

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间