I'm migrating a C#-based programming language compiler from a manual lexer/parser to Antlr.
Antlr has been giving me severe headaches because it mostly works, but the small parts that do not work are incredibly painful to solve.
I discovered that most of my headaches are caused by the lexer parts of Antlr rather than the parser. Then I noticed `parser grammar X;` and realized that perhaps I could keep my manually written lexer and pair it with an Antlr-generated parser.
So I'm looking for more documentation on this topic. I guess a custom ITokenStream could work, but there appears to be virtually no online documentation on this topic...
I found out how. It might not be the best approach but it certainly seems to be working.
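1. Antlr parsers receive an `ITokenStream` parameter.
2. Antlr lexers are themselves `ITokenSource`s.
3. `ITokenSource` is a significantly simpler interface than `ITokenStream`.
4. The simplest way to convert an `ITokenSource` to an `ITokenStream` is to use a `CommonTokenStream`, which receives an `ITokenSource` parameter.

So now we only need to do 2 things:

1. Adjust the grammar so that it is parser-only.
2. Implement `ITokenSource`.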
Adjusting the grammar is very simple. Remove all lexer declarations and ensure you declare the grammar as `parser grammar`. Note that a grammar declared as `parser grammar mygrammar;` will output `class mygrammar` instead of `class mygrammarParser`.

So now we want to implement a "fake" lexer. In pseudo-code terms, my approach was: run the handwritten lexer and push its tokens into a `TokenQueue`, wrap that queue in a `CommonTokenStream`, and pass the stream to the generated parser.
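To make the shape of such a "fake" lexer concrete, here is a minimal sketch of a queue-backed token source. It is only an illustration under stated assumptions: `SketchToken`, `ISketchTokenSource` and `TokenQueueDemo` are hypothetical stand-ins I made up so the snippet compiles without the Antlr runtime, and the real `IToken` / `ITokenSource` interfaces in your runtime version will likely require more members than shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins so this sketch compiles without the Antlr runtime.
// In real code these would be the runtime's own IToken and ITokenSource.
public sealed class SketchToken
{
    public const int EofType = -1;     // assumed EOF token type
    public int Type;
    public string Text = "";
    public int Line;                   // assumed 1-based
    public int CharPositionInLine;     // assumed 0-based
    public int Channel;                // 0 = normal (not hidden) channel
}

public interface ISketchTokenSource
{
    SketchToken NextToken();
}

// Buffers tokens produced by the handwritten lexer and hands them out in order.
public sealed class TokenQueue : ISketchTokenSource
{
    private readonly Queue<SketchToken> _pending = new Queue<SketchToken>();

    // Called by the handwritten lexer for every token it recognizes.
    public void Enqueue(int type, string text, int line, int charPositionInLine)
    {
        _pending.Enqueue(new SketchToken
        {
            Type = type,
            Text = text,
            Line = line,
            CharPositionInLine = charPositionInLine,
            Channel = 0
        });
    }

    // Called by the consumer. After the queue is drained, every further call
    // returns an EOF token instead of throwing.
    public SketchToken NextToken()
    {
        if (_pending.Count > 0)
            return _pending.Dequeue();
        return new SketchToken { Type = SketchToken.EofType, Text = "<EOF>" };
    }
}

public static class TokenQueueDemo
{
    // Stands in for the parser side: pulls tokens until EOF and collects their text.
    // With the real runtime, this role is played by a CommonTokenStream wrapped
    // around the queue and passed to the generated parser.
    public static List<string> Drain(ISketchTokenSource source)
    {
        var texts = new List<string>();
        for (var t = source.NextToken(); t.Type != SketchToken.EofType; t = source.NextToken())
            texts.Add(t.Text);
        return texts;
    }
}
```

The one design choice worth pointing out is that `NextToken` keeps returning an EOF token once the queue is empty, since the consumer may ask again after reaching the end of the input.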
Finally, we need to define `TokenQueue`. `TokenQueue` is not strictly necessary, but I used it for convenience. It should have methods to receive the lexer tokens and methods to output Antlr tokens, so if you are not using Antlr's native tokens you have to implement a convert-to-Antlr-token method. Also, `TokenQueue` must implement `ITokenSource`.

Be aware that it is very important to set the token variables correctly. Initially, I had some problems because I was miscalculating `CharPositionInLine`. If these variables are set incorrectly, the parser may fail. Also, the normal (not hidden) channel is 0.

This seems to be working for me so far. I hope others find it useful as well. I'm open to feedback. In particular, if you find a better way to solve this problem, feel free to post a separate reply.
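On the `CharPositionInLine` point, here is a minimal sketch of one way to derive the line and column from an absolute character offset. It assumes the usual Antlr convention of 1-based lines and 0-based columns, and that your handwritten lexer tracks a 0-based offset into the source string. `SourcePosition` and `Locate` are hypothetical names for this illustration, not part of any Antlr API.

```csharp
using System;

public static class SourcePosition
{
    // Returns (line, charPositionInLine) for a 0-based absolute offset into source.
    // Assumed convention: lines are 1-based, columns are 0-based, and the column
    // resets to 0 right after every '\n'. A '\r' before '\n' counts as an
    // ordinary character on the line it ends.
    public static (int Line, int CharPositionInLine) Locate(string source, int offset)
    {
        if (offset < 0 || offset > source.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        int line = 1;
        int lineStart = 0; // offset of the first character of the current line
        for (int i = 0; i < offset; i++)
        {
            if (source[i] == '\n')
            {
                line++;
                lineStart = i + 1;
            }
        }
        return (line, offset - lineStart);
    }
}
```

For example, with the source `"ab\ncd"`, offset 3 (the `c`) maps to line 2, column 0. Rescanning from the start for every token is quadratic, so a real lexer would more likely keep a running line counter, but the arithmetic is the same.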