How to match any symbol in ANTLR parser (not lexer)?

How do I match any symbol in an ANTLR parser (not the lexer)? Where is the complete language description for ANTLR4 parsers?

UPDATE

Is the answer "impossible"?

Tags: parsing antlr grammar antlr4

3 answers

走好不送

Reply #2 · 2020-02-12 07:39

It is possible, but only with a grammar so basic that there is little reason to use ANTLR in the first place.

If you had the grammar:

text     : ANY_CHAR* ;
ANY_CHAR : . ;

it would do what you (seem to) want.

However, as many have pointed out, this would be a pretty strange thing to do. The purpose of the lexer is to identify the different tokens, which the parser then strings together according to the grammar. So your lexer can either identify the specific string "JSTL/EL" as a single token, or use patterns such as [A-Z]'/EL' or [A-Z]'/'[A-Z][A-Z], depending on what you need.

The parser is then used to define the grammar, so:

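// Parser rules: built only from token names
phrase     : CHAR* jstl CHAR* ;
jstl       : JSTL SLASH QUALIFIER ;

// Lexer rules: on ties, the rule listed first wins, so the catch-all CHAR goes last
JSTL       : 'JSTL' ;
SLASH      : '/' ;
QUALIFIER  : [A-Z][A-Z] ;
CHAR       : . ;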

would accept "blah blah JSTL/EL..." as input, but not "blah blah EL/JSTL...".

I'd recommend looking at The Definitive ANTLR 4 Reference, in particular the section on "Islands in the Stream" and the Grammar Reference (Ch 15), which specifically deals with Unicode.

我想做一个坏孩纸

Reply #3 · 2020-02-12 07:42

You first need to understand the roles of each part in parsing:

The lexer: this is the object that tokenizes your input string. Tokenizing means converting a stream of input characters into a sequence of abstract token symbols (each usually represented by just a number).

The parser: this is the object that only works with tokens to determine the structure of a language. A language (written as one or more grammar files) defines the token combinations that are valid.

As you can see, the parser doesn't even know what a letter is. It only knows tokens. So your question is already wrong.
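To make the separation concrete, here is a minimal sketch of a combined ANTLR4 grammar. The grammar name Words and the rule names line, WORD and WS are made up for illustration and do not come from the question. Characters appear only in the lexer rules; the parser rule refers to nothing but token names.

```antlr
grammar Words;   // hypothetical grammar name

// Parser rule: it sees a sequence of tokens, never individual characters.
line : WORD+ EOF ;

// Lexer rules: the only place where characters are described.
WORD : [a-zA-Z]+ ;
WS   : [ \t\r\n]+ -> skip ;   // whitespace is dropped before the parser runs
```

For an input such as "ab cd", the lexer should hand the parser two WORD tokens followed by EOF. The parser has no way to ask which letters were inside them, other than reading the token text in an action or listener.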

Having said that, it would probably help to know why you want to skip individual input letters in your parser. It looks like your basic approach needs some adjustment.

祖国的老花朵

Reply #4 · 2020-02-12 07:42

It depends on what you mean by "symbol". To match any token inside a parser rule, use the . (DOT) meta character. If you're trying to match any character inside a parser rule, then you're out of luck: there is a strict separation between parser rules and lexer rules in ANTLR, so it is not possible to match any character inside a parser rule.
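As a rough sketch of the difference, the grammar below uses the wildcard in both places. The grammar name AnyToken and the rules WS and OTHER are assumptions added for illustration; the JSTL, SLASH and QUALIFIER rules are borrowed from the earlier answer. In the parser rule, . stands for any single token, and .*? is the non-greedy form. In the lexer rule, . stands for any single character.

```antlr
grammar AnyToken;   // hypothetical grammar name

// Parser rules: '.' is a wildcard over tokens, not characters.
phrase : .*? jstl .*? EOF ;   // skip any tokens, non-greedily, around one jstl
jstl   : JSTL SLASH QUALIFIER ;

// Lexer rules: '.' matches exactly one character.
JSTL      : 'JSTL' ;
SLASH     : '/' ;
QUALIFIER : [A-Z][A-Z] ;
WS        : [ \t\r\n]+ -> skip ;
OTHER     : . ;   // catch-all, listed last: any other character becomes its own token
```

With the catch-all OTHER rule in place, every character of the input ends up in some token, so the token wildcard in phrase can step over text the grammar does not care about. This is about as close as a parser rule gets to "match anything".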
