pyparsing: how to get token location?

Posted 2019-04-15 21:26

Question:

I have a simple pyparsing grammar that matches numbers separated by spaces:

from pyparsing import *
NUMBER = Word( nums )
STATEMENT = ZeroOrMore( NUMBER )
print( STATEMENT.parseString( "1 2 34" ) )

Given the test string "1 2 34", it returns three strings, one per parsed token. But how do I find the location of each token in the original string? I need it for a kind of syntax highlighting.

Answer 1:

Add this parse action to NUMBER:

NUMBER.setParseAction(lambda locn,tokens: (locn,tokens[0]))

Parse actions can be passed the tokens that were parsed for a given expression, the location of the parse, and the original string. You can pass functions to setParseAction with any of these signatures:

fn()
fn(tokens)
fn(locn,tokens)
fn(srcstring,locn,tokens)
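As a rough illustration of the three-argument form, the sketch below uses the source string to turn the raw offset into a line and column with pyparsing's lineno and col helpers. The function name line_and_column, the expression name LOCATED, and the two-line sample input are made up for this example and are not part of the question.

```python
from pyparsing import Word, ZeroOrMore, col, lineno, nums

def line_and_column(srcstring, locn, tokens):
    # locn is a 0-based offset into srcstring; lineno() and col() are 1-based.
    return (lineno(locn, srcstring), col(locn, srcstring), tokens[0])

# LOCATED is a hypothetical stand-in for NUMBER, kept separate on purpose.
LOCATED = Word(nums).setParseAction(line_and_column)

sample = "1 2\n34 5"  # made-up input spanning two lines
located = ZeroOrMore(LOCATED).parseString(sample).asList()
print(located)  # [(1, 1, '1'), (1, 3, '2'), (2, 1, '34'), (2, 4, '5')]
```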

For your case, the two-argument form is enough: you only need the location and the parsed tokens.

With the parse action on NUMBER in place, the parsed results look like:

[(0, '1'), (2, '2'), (4, '34')]
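For the highlighting use case, a start offset alone may not be enough. The sketch below puts the pieces together and derives an end offset from the token length. The names source, pairs, and spans are illustrative, and the half-open (start, end) convention is an assumption about what a highlighter would want.

```python
from pyparsing import Word, ZeroOrMore, nums

NUMBER = Word(nums)
# Replace each matched token with a (start_offset, text) tuple.
NUMBER.setParseAction(lambda locn, tokens: (locn, tokens[0]))
STATEMENT = ZeroOrMore(NUMBER)

source = "1 2 34"
pairs = STATEMENT.parseString(source).asList()

# Half-open [start, end) spans, so source[start:end] is the token text.
spans = [(start, start + len(text), text) for start, text in pairs]
print(spans)  # [(0, 1, '1'), (2, 3, '2'), (4, 6, '34')]
```

Note that the parse action replaces each token with a tuple, so any code that consumes the results receives tuples rather than plain strings.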