What is packrat parsing?

2020-05-19 07:03发布

问题:

I know and use bison/yacc. But in parsing world, there's a lot of buzz around packrat parsing.

What is it? Is it worth studing?

回答1:

Packrat parsing is a way of providing asymptotically better performance for parsing expression grammars (PEGs); specifically for PEGs, linear time parsing can be guaranteed.

Essentially, Packrat parsing just means caching whether sub-expressions match at the current position in the string when they are tested -- this means that if the current attempt to fit the string into an expression fails then attempts to fit other possible expressions can benefit from the known pass/fail of subexpressions at the points in the string where they have already been tested.



回答2:

Pyparsing is a pure-Python parsing library that supports packrat parsing, so you can see how it is implemented. Pyparsing uses a memoizing technique to save previous parse attempts for a particular grammar expression at a particular location in the input text. If the grammar involves retrying that same expression at that location, it skips the expensive parsing logic and just returns the results or exception from the memoizing cache.

There is more info here at the FAQ page of the pyparsing wiki, which also includes links back to Bryan Ford's original thesis on packrat parsing.