How to implement error handling in ANTLR4

2020-07-09 10:09发布

问题:

I have the following grammar for parsing first-order logic formulas applied over graphs:

grammar Graph;


/*------------------------------------------------------------------
* PARSER RULES
*------------------------------------------------------------------*/

input
: 
formula EOF
;

formula
:
TRUE 
| FALSE 
| formula AND formula
| formula OR formula
| quantifier formula
| ST condition
;


condition
:
atom EQUALS QUOTE? (assignment | atom) QUOTE?
;

quantifier 
:
(FOREACH | EXISTS) variable IN domain
;

domain
:
(GRAPH_A | GRAPH_B)
;

atom
:
variable DOT property
;


variable
:   
(nodev | edgev)
;


nodev
:
(NODE | NODE1)
;

edgev
:
(EDGE | EDGE1)
;

property
:
(COLOR | VALUE)
;

assignment
:
(COLORTYPE | NUMBER)
;


/*------------------------------------------------------------------
* LEXER RULES
*------------------------------------------------------------------*/

TRUE : 'True' ;

FALSE : 'False' ;

AND : '&' ;

OR : '|' ;

ST : '->' ; 

EXISTS  : 'Exists' ;

FOREACH : 'Foreach' ;

NODE : 'node' ;

NODE1 : 'node1' ;

EDGE : 'edge' ;

EDGE1 : 'edge1' ;

IN : 'in' ;

GRAPH_A : 'GraphA' ;

GRAPH_B : 'GraphB' ;

EQUALS : '=' ;

DOT : '.' ;

COLOR : 'color' ;

VALUE : 'value' ;

NUMBER : ('0'..'9')+ (DOT ('0'..'9')+)? ;

QUOTE : '\'' ;

COLORTYPE : ('a'..'z')+ ;

WS : [ \t\r\n]+ -> skip ;

I believe that this is the final version of my grammar so now I want to specify some error handling for the input. The problem is that I don't know how. What I know is that after I parse the input I can iterate over the generated AST and this is the place to add the error handling.

If the parse fails, return the parse exception; otherwise, I have specified the following cases for returning error message.

  1. There can't be 1 quantifier followed by -> condition (this is a formula element) where condition is equal to atom=atom. In other words, if there is only quantifier then condition should be equal to atom EQUALS assignment.

  2. If there are 2 quantifiers the first should start with FOREACH

  3. The variable(s) in the quantifiers should be used in the condition statement

  4. There can't be more than two quantifiers on the left hand-side of the expression (because in the app I am developing there are only two graphs). So if the number of quantifiers is greater then two return error as well

  5. If there are 2 quantifiers then they should have different variables bound on them

For example, the first case should be raised when as an input we have

Exists node in GraphA -> node.color = node1.color 

because node1 isn't specified in the left hand-side of the expression.

An example for the second case would be the following input

Exists node in GraphA Exists node1 in GraphB -> node.color = node1.color

So my question is do I have to implement all the error checking over the generated parse tree or I can specify some of it in the grammar using some java code. If the error handling should happen after the input is parsed what functionality of ANTLR 4 can I use in order to implement the error cases? Any help or advice would be much appreciated!

回答1:

You'll probably want to implement these semantic checks in a listener, in combination with helper visitors. Here are some examples.

Rule: The variables in the quantifier should be used in the condition

Implementation:

  1. Create a visitor that returns the variables used by a specified parse tree node. You'll want to override defaultResult() and aggregateResult(T, T) to do most of the work for you, and then override visitNodev and visitEdgev to handle the specific variables in use.
  2. In your listener, override the enterFormula method. In this method, if ctx.quantifier() is not null, then use your visitor to get the list of variables declared in ctx.quantifier() and the list of variables used in ctx.formula().
  3. Report an error as appropriate based on the two results.

Rule: If there are 2 quantifiers then they should have different variables bound on them

Implementation:

  1. Start with the visitor described in step 1 of the previous rule implementation.
  2. In your listener, override the enterFormula method. In the method, if ctx.quantifier() is not null, then you need to get a collection of all the other QuantifierContext instances underneath the tree returned by ctx.formula(). You can do this by calling XPath.findAll(ctx.formula(), "//quantifier", parser).
  3. Use the visitor described above to gather a list of variables declared in each of the linked QuantifierContext instances. If any of the sets overlap, report an error as appropriate.

Rule: If there are 2 quantifiers the first should start with FOREACH

Implementation:

Use the listener pattern described in the previous step to locate cases where a formula contains more than one quantifier. If the first of these formulas has ctx.quantifier().FOREACH() == null emit an error as appropriate.

Rule: There can't be more than two quantifiers...

Implementation:

Update the implementation of the 2nd rule above to report an error if XPath.findAll returns more than one QuantifierContext in the formula for a quantifier.

Rule: Quantifier condition restriction

Implementation:

First, create a ParseTreePattern object.

String patternString = "<quantifier> -> <condition>";
ParseTreePattern pattern =
    parser.compileParseTreePattern(patternString, GraphParser.RULE_formula);

Then, find all instances of this pattern in your parse tree.

List<ParseTreeMatch> matches = pattern.findAll(tree, "//formula");

Validating the matches is then rather simple.

for (ParseTreeMatch match : matches) {
  ConditionContext condition = (ConditionContext)match.get("condition");
  if (condition.assignment() == null) {
    // TODO: report error here
  }
}