I wrote a grammar with ANTLR 4.4 like this:
grammar CSV;
file
: row+ EOF
;
row
: value (Comma value)* (LineBreak | EOF)
;
value
: SimpleValue
| QuotedValue
;
Comma
: ','
;
LineBreak
: '\r'? '\n'
| '\r'
;
SimpleValue
: ~(',' | '\r' | '\n' | '"')+
;
QuotedValue
: '"' ('""' | ~'"')* '"'
;
Then I used ANTLR 4.4 to generate the parser and lexer; this step succeeded.
After generating the classes, I wrote some Java code to use the grammar:
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
public class Main {
    public static void main(String[] args)
    {
        String source = "\"a\",\"b\",\"c";
        CSVLexer lex = new CSVLexer(new ANTLRInputStream(source));
        CommonTokenStream tokens = new CommonTokenStream(lex);
        tokens.fill();
        CSVParser parser = new CSVParser(tokens);
        CSVParser.FileContext file = parser.file();
    }
}
The code above is a parser for CSV strings. In this example the input is "a","b","c (the last value is missing its closing quote).
Output window:
line 1:8 token recognition error at: '"c'
line 1:10 missing {SimpleValue, QuotedValue} at '<EOF>'
I want to know how I can get these errors from a method (getErrors() or similar) in my code, rather than from the output window.
Can anyone help me?
Using ANTLR for CSV parsing is a nuclear option IMHO, but since you're at it...
Implement the interface ANTLRErrorListener. You may extend BaseErrorListener for that. Collect the errors and append them to a list.
Call parser.removeErrorListeners() to remove the default listeners.
Call parser.addErrorListener(yourListenerInstance) to add your own listener.
Now, for the lexer, you may either do the same thing (removeErrorListeners/addErrorListener), or add the following rule at the end:
UNKNOWN_CHAR
: .
;
With this rule, the lexer will never fail (it will generate UNKNOWN_CHAR tokens when it can't do anything else) and all errors will be generated by the parser (because it won't know what to do with these UNKNOWN_CHAR tokens). I recommend this approach.
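The listener steps described in this answer could be sketched roughly as follows. This is a minimal sketch, not the only way to do it: the class name CollectingErrorListener and its getErrors() method are hypothetical (getErrors() is borrowed from the question's wording), while BaseErrorListener and its syntaxError callback come from the ANTLR 4 runtime.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;

// Hypothetical helper (not part of ANTLR): stores each syntax error
// in a list instead of printing it to the console.
class CollectingErrorListener extends BaseErrorListener {
    private final List<String> errors = new ArrayList<String>();

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg, RecognitionException e) {
        // Use the same "line L:C message" layout as the console output in the question.
        errors.add("line " + line + ":" + charPositionInLine + " " + msg);
    }

    // Read-only view of the errors collected so far.
    public List<String> getErrors() {
        return Collections.unmodifiableList(errors);
    }
}
```

With the Main class from the question, you would presumably create one listener instance, call removeErrorListeners() and addErrorListener(listener) on the parser (and on lex too, if you do not add the UNKNOWN_CHAR rule), and read listener.getErrors() after parser.file() returns. Note that the lexer listener has to be attached before tokens.fill(), because that call is what runs the lexer.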