pyparsing and line breaks

Posted 2019-04-24 02:32

Question:

I just started with pyparsing and I have problems with line breaks.

My grammar is:

from pyparsing import *

newline = LineEnd () #Literal ('\n').leaveWhitespace ()
minus = Literal ('-')
plus = Literal ('+')
lparen = Literal ('(')
rparen = Literal (')')
ident = Word (alphas)
integer = Word (nums)

arith = Forward ()
parenthized = Group (lparen + arith + rparen)
atom = ident | integer | parenthized

factor = ZeroOrMore (minus | plus) + atom
arith << (ZeroOrMore (factor + (minus | plus) ) + factor)

statement = arith + newline
program = OneOrMore (statement)

Now when I parse the following:

print (program.parseString ('--1-(-a-3+n)\nx\n') )

The result is as expected:

['-', '-', '1', '-', ['(', '-', 'a', '-', '3', '+', 'n', ')'], '\n', 'x', '\n']

But when the second line can be parsed as the tail of the first line, the first \n is magicked away?

Code:

print (program.parseString ('--1-(-a-3+n)\n-x\n') )

Actual result:

['-', '-', '1', '-', ['(', '-', 'a', '-', '3', '+', 'n', ')'], '-', 'x', '\n']

Expected result:

['-', '-', '1', '-', ['(', '-', 'a', '-', '3', '+', 'n', ')'], '\n', '-', 'x', '\n']

Actually I don't want the parser to automatically join statements.

1. What am I doing wrong?

2. How can I fix this?

3. What is happening under the hood to cause this behaviour (which is surely sensible, but I just fail to see the point)?

Answer 1:

'\n' is normally skipped over as a whitespace character. If you want '\n' to be significant, then you have to call setDefaultWhitespaceChars to remove '\n' as skippable whitespace (you have to do this before defining any of your pyparsing expressions):

from pyparsing import *
ParserElement.setDefaultWhitespaceChars(' \t')
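
For reference, here is a sketch of the grammar from the question with that one call added before anything else is defined. It uses explicit imports and `<<=` instead of `<<`, which are stylistic choices rather than part of the fix, and it assumes a pyparsing version that still accepts the camelCase names used in this thread.

```python
from pyparsing import (Forward, Group, LineEnd, Literal, OneOrMore,
                       ParserElement, Word, ZeroOrMore, alphas, nums)

# Must run before any expression is created: each element picks up the
# default whitespace characters when it is constructed.
ParserElement.setDefaultWhitespaceChars(' \t')

newline = LineEnd()
minus = Literal('-')
plus = Literal('+')
lparen = Literal('(')
rparen = Literal(')')
ident = Word(alphas)
integer = Word(nums)

arith = Forward()
parenthized = Group(lparen + arith + rparen)
atom = ident | integer | parenthized

factor = ZeroOrMore(minus | plus) + atom
arith <<= ZeroOrMore(factor + (minus | plus)) + factor

statement = arith + newline
program = OneOrMore(statement)

# The input that was previously joined into a single statement.
result = program.parseString('--1-(-a-3+n)\n-x\n').asList()
print(result)
```

With '\n' no longer skippable, the `-` that starts the second line cannot be reached from the end of the first, so `arith` stops at the line break and `newline` gets to match it.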


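Answer 2:

What is happening here is that, by default, the parser skips any leading whitespace, newlines included, before it tries to match an element. So after the first line, the parser looks for another minus or plus, skips the '\n', and finds the '-' on the next line. You need to add the following line of code before you define any elements: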

ParserElement.setDefaultWhitespaceChars(" \t")

The normal default whitespace characters are " \t\r\n", I believe.
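
A quick way to see the effect in isolation is to set the whitespace characters on a single element rather than globally. The sketch below is only an illustration: the names `skips_newline` and `keeps_newline` are made up for this example, and it uses the per-element `setWhitespaceChars` method instead of the class-level default.

```python
from pyparsing import Literal, ParseException, ParserElement

# The class-level default; inspect it to check the claim for your version.
print(repr(ParserElement.DEFAULT_WHITE_CHARS))

# Hypothetical names for illustration only.
# This element treats newlines as skippable whitespace.
skips_newline = Literal('-').setWhitespaceChars(' \t\r\n')
# This one skips only spaces and tabs.
keeps_newline = Literal('-').setWhitespaceChars(' \t')

# The leading '\n' is silently skipped and the '-' matches.
skipped = skips_newline.parseString('\n-').asList()
print(skipped)

# Here the '\n' is not skippable, so the match fails at position 0.
try:
    keeps_newline.parseString('\n-')
    newline_blocked = False
except ParseException:
    newline_blocked = True
print(newline_blocked)
```

The first parse is what happens to the minus sign in the question's grammar: the line break in front of it is consumed as ordinary whitespace.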

Edit: Paul beat me to it. I should have refreshed after getting dinner together. :)