From what I have searched on Google so far, the only difference between LALR(1) and SLR(1) is that LALR(1) uses states with merged lookahead sets. One source says that LALR(1) won't run into shift-reduce conflicts because the parser will "remember" from which state it arrived at the current state. However, there are no differences in the parsing algorithm, so I'm having trouble seeing how these merged lookahead sets would help the LALR parser resolve shift-reduce conflicts. And if the parser remembers from which state it arrived at the current state, why is it still vulnerable to reduce-reduce conflicts?
相关问题
- Correctly parse PDF paragraphs with Python
- R: eval(parse()) error message: cannot ope
- How do I parse a .pls file using PHP? Having troub
-
Create array from the contents of 相关文章
- How do I get from a type to the TryParse method?
- Slow ANTLR4 generated Parser in Python, but fast i
- Parsing JSON in QML [duplicate]
- How do I generate an AST from a string of C++ usin
- JSoup will not fetch all items?
- Content is not allowed in prolog
- How to manage parsing an null object for DateTime
- Make Gson throw exception on parsing JSON with dup
It is actually SLR which uses merged lookahead sets. Or, more accurately, the lookahead set for each production in an SLR parse table is approximated as the FOLLOW set for the last symbol in the production, effectively ignoring the context.
LALR parsers have merged states with respect to canonical LR parsers. In both LR and LALR, in contrast to SLR, tbe lookahead sets are computed accurately, but in the case of the LALR parser, the states are merged so that it has the same states (but not the same lookahead sets) as tbe SLR parser.
There is a reasonable discussion of the differences in the Wikipedia articles on SLR parsing, LALR parsing and Canonical LR parsing. See the references for those articles for more information.
No matter which parser-generation technique you use, there will be conflicts if the grammar requires more lookahead. For example, the following grammar requires two lokahead symbols to decide whether to shift or reduce a
NAME
:Here, a NAME is part of a
defn
if it is not followed by a colon, so to decide whether or not to shift a NAME, you need not just the NAME, which is the lookahead token, but also tge following token, the second lookahead.Here is a very simple grammar which is not SLR(1), which might possibly illuminate the problem with using follow sets. (It comes straight out of the Dragon book, and is often used as an example of why SLR(1) grammars are inadequate):
Here, the FOLLOW sets of R and L are identical. Both R and L include
=
, since* R = ...
is valid, and will reduce toL = ...
, and clearly both R and L can appear at the end of an S, so both FOLLOW sets include the end-of-input marker.The problem then is to decide whether or not to reduce
R → L
. InL = R
, we should leave theL
as is and shift the=
. But the FOLLOW sets don't help, becauseR
can be followed by=
. So in an SLR(1) grammar, which only uses FOLLOW sets, there is a shift/reduce conflict in the state:This conflict doesn't exist in the LALR(1) grammar because in that state the lookahead set for the production
R → L
does not include=
, because the production was included from the itemS → • R
, whose lookahead is the end-of-input marker.