I just started with pyparsing
and I have problems with line breaks.
My grammar is:
from pyparsing import *
newline = LineEnd () #Literal ('\n').leaveWhitespace ()
minus = Literal ('-')
plus = Literal ('+')
lparen = Literal ('(')
rparen = Literal (')')
ident = Word (alphas)
integer = Word (nums)
arith = Forward ()
parenthized = Group (lparen + arith + rparen)
atom = ident | integer | parenthized
factor = ZeroOrMore (minus | plus) + atom
arith << (ZeroOrMore (factor + (minus | plus) ) + factor)
statement = arith + newline
program = OneOrMore (statement)
Now when I parse the following:
print (program.parseString ('--1-(-a-3+n)\nx\n') )
The result is as expected:
['-', '-', '1', '-', ['(', '-', 'a', '-', '3', '+', 'n', ')'], '\n', 'x', '\n']
But when the second line can be parsed as tail of the first line, the first \n
is magicked away?
Code:
print (program.parseString ('--1-(-a-3+n)\n-x\n') )
Actual result:
['-', '-', '1', '-', ['(', '-', 'a', '-', '3', '+', 'n', ')'], '-', 'x', '\n']
Expected result:
['-', '-', '1', '-', ['(', '-', 'a', '-', '3', '+', 'n', ')'], '\n', '-', 'x', '\n']
Actually I don't want the parser to automatically join statements.
1. What am I doing wrong?
2. How can I fix this?
3. What is happening under the hood causing this behaviour (which surely is sensible, but I just fail to see the point)?
'\n' is normally skipped over as a whitespace character. If you want '\n' to be significant, you have to call ParserElement.setDefaultWhitespaceChars to remove '\n' from the skippable whitespace. You have to do this before defining any of your pyparsing expressions.
What is happening here is that the parser by default ignores any whitespace, including the '\n' between your two statements, so the '-' at the start of the second line is parsed as a continuation of the first arith. You need to change the default whitespace characters before you define any elements.
The normal default whitespace characters are " \t\r\n", I believe.
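As a minimal sketch of the fix described above, the grammar from the question can be rebuilt after restricting the skippable whitespace to space and tab. The choice of " \t" as the new whitespace set is an assumption on my part (it drops '\r' as well as '\n'); the variable names default_ws and result are only there for illustration. With this in place, the second input from the question should no longer be joined into one statement.

```python
from pyparsing import (Forward, Group, LineEnd, Literal, OneOrMore,
                       ParserElement, Word, ZeroOrMore, alphas, nums)

# Record the default skippable whitespace before changing it
# (only to check the claim about the defaults).
default_ws = ParserElement.DEFAULT_WHITE_CHARS

# Assumption: keep only space and tab as skippable whitespace, so that
# '\n' becomes significant. This must run before any expression is created.
ParserElement.setDefaultWhitespaceChars(" \t")

# Same grammar as in the question.
newline = LineEnd()
minus = Literal("-")
plus = Literal("+")
lparen = Literal("(")
rparen = Literal(")")
ident = Word(alphas)
integer = Word(nums)
arith = Forward()
parenthized = Group(lparen + arith + rparen)
atom = ident | integer | parenthized
factor = ZeroOrMore(minus | plus) + atom
arith <<= ZeroOrMore(factor + (minus | plus)) + factor
statement = arith + newline
program = OneOrMore(statement)

result = program.parseString("--1-(-a-3+n)\n-x\n").asList()
print(repr(default_ws))
print(result)
```

Because the call changes a global default, it affects every expression created afterwards, which is why the order matters.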
Edit: Paul beat me to it. I should have refreshed after getting dinner together. :)