-->

Lemon Parser: This rule can not be reduced

2019-08-16 03:24发布

问题:

I'm attempting to write a grammar to parse templating language say jinja2 (or twig at your choose), and I can't successfully parse switch-case statement.

Let me show desired syntax:

{% switch username %}
    {% case "Jim" %}
        I want to say: 
    {% case "Nik" %}
        Hello man!
    {% endcase %}
    {% case "Bob" %}
        Hi
    {% default %}
        Who are you?
{% endswitch %}

Here endcase just works as break.

Worked portion of my grammar file:

program ::= template_language(Q) . {
    status->ret = Q;
}

template_language(R) ::= statement_list(L) . {
    R = L;
}

statement_list(R) ::= statement_list(L) statement(S) . {
    R = my_list(L, S);
}

statement_list(R) ::= statement(S) . {
    R = my_list(NULL, S);
}

statement(R) ::= switch_statement(E) . {
    R = E;
}

// empty {% switch expr %} {% endswitch %}
switch_statement(R) ::= OPEN_DELIMITER SWITCH expr(E) CLOSE_DELIMITER OPEN_DELIMITER ENDSWITCH CLOSE_DELIMITER . {
    R = my_switch_statement(E, NULL, status->scanner_state);
}

switch_statement(R) ::= OPEN_DELIMITER SWITCH expr(E) CLOSE_DELIMITER case_clauses(C) OPEN_DELIMITER ENDSWITCH CLOSE_DELIMITER . {
    R = my_switch_statement(E, C, status->scanner_state);
}

case_clauses(R) ::= case_clauses(C) case_clause(K) . {
    R = my_list(C, K);
}

case_clauses(R) ::= case_clause(K) . {
    R = my_list(NULL, K);
}

// empty {% case expr %} {% endcase %}
case_clause(R) ::= OPEN_DELIMITER CASE expr(E) CLOSE_DELIMITER OPEN_DELIMITER ENDCASE CLOSE_DELIMITER . {
    R = case_clause(E, NULL, status->scanner_state);
}

case_clause(R) ::= OPEN_DELIMITER CASE expr(E) CLOSE_DELIMITER statement_list(T) OPEN_DELIMITER ENDCASE CLOSE_DELIMITER . {
    R = case_clause(E, T, status->scanner_state);
}

This is just part of my grammar and I have worked grammar for for, if, while, do, loop, etc.

But I have no idea about:

  1. {% case expr %} statement_list(T) without {% endcase %}
  2. {% default %} statement_list(T)

For example I tried to use:

case_clause(R) ::= OPEN_DELIMITER CASE expr(E) CLOSE_DELIMITER statement_list(T) . {
    R = case_clause(E, T, status->scanner_state);
}

for #1 but no luck, got

This rule can not be reduced.

The same for #2

Frankly speaking I'm understand the root of problem - the lack of case/default-bound, but actually I have no idea how to address this issue.

Any help would be greatly appreciated!

回答1:

The problem is that your grammar is LR(2), not LR(1).

Many shift/reduce conflicts arise because it is impossible to know what to do until the token following a %{ is seen. For example, consider the partial template (I've deliberately destroyed the indentation):

{% switch username %} {% case "Jim" %} I want to say: {%

Now, should the bolded section be reduced to case_clause?

Remember that in an LR(k) grammar, the decision to reduce must be made by looking only at the k tokens following the end of the sequence to be reduced. Lemon, like most LR parser generators, only implements LR(1) parsers, so the decision needs to be made using only one lookahead token, which is the %}. But the decision cannot be made without knowing what the next token is:

{% switch username %} {% case "Jim" %} I want to say: {% endcase
{% switch username %} {% case "Jim" %} I want to say: {% case
{% switch username %} {% case "Jim" %} I want to say: {% switch

In the first input, we do not do the reduction, but we have reached the end of the statement_list. In the second one, we need to reduce because we've found the entire case_clause. And in the third one, we've started a new statement which will need to be appended to the statement_list.

The first and third possible inputs present no problem, because both of them just correspond to a shift action; the necessary reduction -- whichever one it is -- will be performed later. But the second one needs a reduce to happen before the %{, and by the time we see the case token, it is too late.

The simplest solution, it seems to me, is to force the lexer to recognize {% keyword as a single token (different for each keyword). For example, the following, which differs from your grammar only in that each instance of OPEN_DELIMITER FOO has been replaced by OPEN_FOO, presents no problems: (I also replace CLOSE_DELIMITER with CLOSE to avoid horizontal scrolling.)

program ::= template_language(Q) . {
    status->ret = Q;
}

template_language(R) ::= statement_list(L) . {
    R = L;
}

statement_list(R) ::= statement_list(L) statement(S) . {
    R = my_list(L, S);
}

statement_list(R) ::= statement(S) . {
    R = my_list(NULL, S);
}

statement(R) ::= switch_statement(E) . {
    R = E;
}

// empty {% switch expr %} {% endswitch %}
switch_statement(R) ::= OPEN_SWITCH expr(E) CLOSE OPEN_ENDSWITCH CLOSE . {
    R = my_switch_statement(E, NULL, status->scanner_state);
}

switch_statement(R) ::= OPEN_SWITCH expr(E) CLOSE case_clauses(C) OPEN_ENDSWITCH CLOSE . {
    R = my_switch_statement(E, C, status->scanner_state);
}

case_clauses(R) ::= case_clauses(C) case_clause(K) . {
    R = my_list(C, K);
}

case_clauses(R) ::= case_clause(K) . {
    R = my_list(NULL, K);
}

// empty {% case expr %} {% endcase %}
case_clause(R) ::= OPEN_CASE expr(E) CLOSE OPEN_ENDCASE CLOSE . {
    R = case_clause(E, NULL, status->scanner_state);
}

case_clause(R) ::= OPEN_CASE expr(E) CLOSE statement_list(T) OPEN_ENDCASE CLOSE . {
    R = case_clause(E, T, status->scanner_state);
}

case_clause(R) ::= OPEN_CASE expr(E) CLOSE statement_list(T) . {
    R = case_clause(E, T, status->scanner_state);
}


case_clause(R) ::= OPEN_DEFAULT CLOSE statement_list(T) . {
    R = case_clause(E, T, status->scanner_state);
}

As a side-note, I'd suggest simplifying your grammar by not special-casing empty statement-lists. Just allow an empty base case for statement_list:

program ::= template_language(Q) . {
    status->ret = Q;
}

template_language(R) ::= statement_list(L) . {
    R = L;
}

statement_list(R) ::= statement_list(L) statement(S) . {
    R = my_list(L, S);
}

statement_list(R) ::= . {
    R = NULL;
}

statement(R) ::= switch_statement(E) . {
    R = E;
}

switch_statement(R) ::= OPEN_SWITCH expr(E) CLOSE case_clauses(C) OPEN_ENDSWITCH CLOSE . {
    R = my_switch_statement(E, C, status->scanner_state);
}

case_clauses(R) ::= case_clauses(C) case_clause(K) . {
    R = my_list(C, K);
}

case_clauses(R) ::= . {
    R = NULL;
}

case_clause(R) ::= OPEN_CASE expr(E) CLOSE statement_list(T) OPEN_ENDCASE CLOSE . {
    R = case_clause(E, T, status->scanner_state);
}

case_clause(R) ::= OPEN_CASE expr(E) CLOSE statement_list(T) . {
    R = case_clause(E, T, status->scanner_state);
}

case_clause(R) ::= OPEN_DEFAULT CLOSE statement_list(T) . {
    R = case_clause(E, T, status->scanner_state);
}