I am looking for a parser generator for Java. My language project is pretty simple and only contains a small set of tokens. The generator should do the following:
Output pure, READABLE Java code so that I can modify it (this is why I wouldn't use ANTLR). Be a mature library that will run and work with at least Java 1.4.
I have looked at the following, and they might work: JavaCC, JLex, Ragel?
I had a good experience with SableCC.
It works differently from most generators, in that you're given an AST/Visitor model that you extend (via inheritance).
I can't comment on the "quality" of its code in terms of readability (it's been a while since I've used it), but it does have the advantage that you don't have to read the generated code at all, just the code in your subclass.
Maybe ANTLR will do it for you. It's a nice parser generator with a fine book available for documentation.
Take a look at SableCC. SableCC is an easy-to-use parser generator that accepts the grammar of your language as EBNF, without intermingled action code, and generates a Java parser that produces a syntax tree, which can be traversed using a tree node visitor. SableCC is powerful, yet much simpler to use than ANTLR, JavaCC, yacc, etc. It also does not require a separate lexer. Constructing your language processor amounts to extending a visitor class generated from your grammar and overriding its methods, which are called when the parser encounters a syntactic construct. For every grammar rule XYZ, the visitor will have methods inAXYZ(Node xyz) and outAXYZ(Node xyz), called when the parser matches the rule.
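To make the in/out visitor style concrete, here is a minimal hand-written sketch. It is not SableCC's generated code: the class names (Node, ANumber, AAdd, DepthFirstVisitor, PostfixPrinter, VisitorDemo) are hypothetical and only mimic the naming pattern mentioned above. It avoids generics so it should also fit a Java 1.4 setup.

```java
// Hypothetical sketch of an in/out tree visitor. These classes are written
// by hand for illustration and are NOT what SableCC actually generates.

abstract class Node {
    abstract void apply(DepthFirstVisitor v);
}

class ANumber extends Node {
    final int value;
    ANumber(int value) { this.value = value; }
    void apply(DepthFirstVisitor v) {
        v.inANumber(this);
        v.outANumber(this);
    }
}

class AAdd extends Node {
    final Node left;
    final Node right;
    AAdd(Node left, Node right) { this.left = left; this.right = right; }
    void apply(DepthFirstVisitor v) {
        v.inAAdd(this);      // called before the children are visited
        left.apply(v);
        right.apply(v);
        v.outAAdd(this);     // called after the children are visited
    }
}

// Base visitor with empty hooks; subclasses override only what they need.
class DepthFirstVisitor {
    void inANumber(ANumber node) {}
    void outANumber(ANumber node) {}
    void inAAdd(AAdd node) {}
    void outAAdd(AAdd node) {}
}

// The only code a user would write: a subclass that emits postfix notation.
class PostfixPrinter extends DepthFirstVisitor {
    final StringBuffer out = new StringBuffer();
    void outANumber(ANumber node) { out.append(node.value).append(' '); }
    void outAAdd(AAdd node) { out.append("+ "); }
}

class VisitorDemo {
    static String toPostfix(Node tree) {
        PostfixPrinter printer = new PostfixPrinter();
        tree.apply(printer);
        return printer.out.toString().trim();
    }

    public static void main(String[] args) {
        // Tree for 1 + (2 + 3), built by hand instead of by a parser.
        Node tree = new AAdd(new ANumber(1), new AAdd(new ANumber(2), new ANumber(3)));
        System.out.println(toPostfix(tree)); // prints "1 2 3 + +"
    }
}
```

In a real SableCC project the node and visitor classes would come from the generator, and only something like PostfixPrinter would be yours to write.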
For a language that simple, JFlex might suffice. It's similar to JLex but faster (which might also mean less readable, but I've not seen JLex's output).
It is a lexer, not a parser, but it is built to interface easily with CUP or BYacc/J. And again, for a simple language, it might be easier to just write your own parser (I've done this before).
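As a rough idea of what "write your own parser" can look like, here is a small recursive-descent sketch. The grammar (integer arithmetic with parentheses) and the TinyParser class are made up for illustration, since the question doesn't show the actual language. It scans characters directly instead of using a separate lexer.

```java
// Hypothetical toy grammar, not taken from the question:
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class TinyParser {
    private final String src;
    private int pos = 0;

    private TinyParser(String src) { this.src = src; }

    // Parses and evaluates the whole input; rejects trailing garbage.
    static int evaluate(String input) {
        TinyParser p = new TinyParser(input);
        int value = p.expr();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    private int expr() {
        int value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    private int term() {
        int value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    private int factor() {
        if (accept('(')) {
            int value = expr();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) {
            throw new IllegalArgumentException("Expected number at position " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    // Consumes the given character (after optional whitespace) if it is next.
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2 + 3 * (4 - 1)")); // prints 11
    }
}
```

Each grammar rule becomes one method, which is why this approach stays readable and easy to modify for small languages. It gets tedious once the grammar grows or needs left recursion.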
We are using JavaCC for our (also rather small) language and are happy with it.
You should use Rats... That way, you don't have to separate the lexer and the parser, and extending your project later will be trivial. It's written in Java, so you can process your AST in Java...