How to create an ANTLR4 grammar which will parse da

Posted 2019-01-29 07:58

I want to parse a few date formats using the following ANTLR4 grammar.

grammar Variables;
//varTable : tableNameFormat dateFormat? ;
//tableNameFormat: (ID SEPERATOR);
dateFormat : YEAR UNDERSCORE MONTH UNDERSCORE TODAY
       | YEAR
       ;
YEAR : DIGIT DIGIT DIGIT DIGIT;                         // 4-digits YYYY
MONTH : DIGIT DIGIT;                                    // 2-digits MM
TODAY : DIGIT DIGIT ;                                     // 2-digits DD
UNDERSCORE: ('_' | '-' );
fragment
DIGIT : [0-9] ;
ID : [a-zA-Z][a-zA-Z0-9]? ;
WS  : [ \t\r\n]+ -> skip ;

This grammar should parse "2016-01-01" easily, but it gives an input mismatch error. Please help.

Tags: antlr4
3 answers
老娘就宠你
#2 · 2019-01-29 08:07

I had never worked with ANTLR before, but I looked on GitHub to see whether someone had already built what I wanted, and found this library.

Here is a library that parses dates from a String:

https://github.com/masasdani/nangka

Add this project as a dependency of your project:

    <dependency>
        <groupId>com.masasdani</groupId>
        <artifactId>nangka</artifactId>
        <version>0.0.6</version>
    </dependency>

Sample usage:

    String exprEn = "a month later, 20-11-90";
    Nangka nangka = new Nangka();
    DateUnit dateUnit = nangka.parse(exprEn);
    for (Date date : dateUnit.getRelatedDates()) {
        System.out.println(date);
    }

Hope this helps someone like me who is searching.

(This account has been banned)
#3 · 2019-01-29 08:14

For a task like this, a regex is a much better solution. But if this is a study project, here it is...

It is important to realize that the order of lexer rules is crucial. The ANTLR lexer picks the rule that matches the longest stretch of input, and when several rules match the same text, the one defined first wins. Rules should therefore be written from the most specific to the most general to avoid conflicts. For example, if a grammar has variable names and some keywords, the keyword rules should come first; otherwise the keywords will be tokenized as variables.
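To make the effect of rule order concrete, here is a minimal sketch in Java. It is a toy tokenizer, not ANTLR itself: the class LexerOrderSketch and its methods are hypothetical names, and it only imitates the selection policy (longest match wins, and on a tie the rule listed first wins) using the four relevant lexer rules from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy tokenizer (hypothetical, not ANTLR): imitates only the rule-selection policy.
class LexerOrderSketch {

    // Rules in declaration order, mirroring the lexer rules from the question.
    static Map<String, Pattern> questionRules() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("YEAR", Pattern.compile("[0-9]{4}"));
        rules.put("MONTH", Pattern.compile("[0-9]{2}"));
        rules.put("TODAY", Pattern.compile("[0-9]{2}"));
        rules.put("UNDERSCORE", Pattern.compile("[_-]"));
        return rules;
    }

    // Returns the sequence of token type names produced for the input.
    static List<String> tokenize(String input, Map<String, Pattern> rules) {
        List<String> tokenTypes = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestLength = 0;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly greater: an equally long later rule never replaces an earlier one.
                if (m.lookingAt() && m.end() - pos > bestLength) {
                    bestRule = rule.getKey();
                    bestLength = m.end() - pos;
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("No rule matches at index " + pos);
            }
            tokenTypes.add(bestRule);
            pos += bestLength;
        }
        return tokenTypes;
    }

    public static void main(String[] args) {
        // Prints [YEAR, UNDERSCORE, MONTH, UNDERSCORE, MONTH]
        System.out.println(tokenize("2016-01-01", questionRules()));
    }
}
```

The last token comes out as MONTH, never TODAY, which is why a parser rule expecting YEAR UNDERSCORE MONTH UNDERSCORE TODAY reports a mismatch on the final "01".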

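There are many ways to solve this, but the best would be a single lexer rule such as DATE : NUM NUM NUM NUM '-' NUM NUM '-' NUM NUM; where NUM is a single-digit fragment like DIGIT in your grammar. The MONTH and TODAY rules as you have them won't work because they are ambiguous: both match exactly two digits, so the lexer cannot tell whether a two-digit input is a month or a day. It always emits MONTH, the rule defined first, and the parser never receives the TODAY token it expects.

Emotional °昔
#4 · 2019-01-29 08:17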
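The single-token idea maps directly onto a regular expression, which is also the regex alternative mentioned at the top of this answer. A minimal sketch in Java, assuming only the yyyy-MM-dd shape with '-' as separator; the class name DateRegexSketch and the parse helper are hypothetical, and the pattern checks the shape only, not calendar validity:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: recognizes the same shape as the suggested DATE rule.
class DateRegexSketch {

    // Four digits, '-', two digits, '-', two digits; groups capture each field.
    private static final Pattern DATE =
            Pattern.compile("([0-9]{4})-([0-9]{2})-([0-9]{2})");

    // Returns {year, month, day}, or null when the whole input does not have that shape.
    static int[] parse(String input) {
        Matcher m = DATE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        int[] parts = parse("2016-01-01");
        // Prints 2016 1 1
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```

Because the whole date is matched as one unit, the question of whether two digits are a month or a day never arises; their position inside the match decides it.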

In my case it works. I get a correct parse tree for the input 2016-01-01 with this grammar:

grammar date;

dateFormat : year UNDERSCORE month UNDERSCORE today
       | year
       ;

year : DIGIT DIGIT DIGIT DIGIT
     ;

month : DIGIT DIGIT
      ;

today : DIGIT DIGIT 
      ;

UNDERSCORE: ('_' | '-' );
DIGIT : [0-9] ;

However, for the month I would use something like ('0' [1-9] | '1' [0-2]), because there are only 12 months.
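The month restriction can be checked in isolation. A small Java sketch, assuming the same two-alternative pattern written as a regular expression; MonthRangeSketch and isValidMonth are hypothetical names:

```java
import java.util.regex.Pattern;

// Hypothetical helper: accepts exactly the two-digit months 01 through 12.
class MonthRangeSketch {

    // '0' followed by 1-9, or '1' followed by 0-2.
    private static final Pattern MONTH = Pattern.compile("0[1-9]|1[0-2]");

    static boolean isValidMonth(String twoDigits) {
        return MONTH.matcher(twoDigits).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidMonth("01")); // true
        System.out.println(isValidMonth("13")); // false
    }
}
```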
