How to generate AST from Java source-code? [closed]

Posted 2019-01-10 10:28

Question:

As far as I know, the only way to parse Java source-code into an AST (Abstract Syntax Tree) is to use the Java Compiler Tree API: com.sun.source.tree

I have two questions:

  1. What JDKs support com.sun.source.tree?
  2. Is there a portable replacement that works for all JDKs?
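For context, here is a minimal sketch of what parsing with the Compiler Tree API can look like. It assumes a JDK 9 or later runtime (it uses `List.of`) with the system compiler available; the class name `AstSketch`, the helper `StringSource`, and the method `declaredNames` are made up for illustration. It only parses (no symbol resolution) and collects the class and method names it visits.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not part of any library.
final class AstSketch {

    /** In-memory source file, so the example needs no files on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses the source and returns the class and method names in visit order. */
    static List<String> declaredNames(String className, String code) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; run on a JDK");
        }
        // Assumption: the system compiler is javac, whose tasks are JavacTask instances.
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                List.of(new StringSource(className, code)));

        List<String> names = new ArrayList<>();
        // parse() builds the syntax trees only; it does not resolve types.
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override
                public Void visitClass(ClassTree node, Void unused) {
                    names.add("class " + node.getSimpleName());
                    return super.visitClass(node, unused); // keep descending
                }

                @Override
                public Void visitMethod(MethodTree node, Void unused) {
                    names.add("method " + node.getName());
                    return super.visitMethod(node, unused);
                }
            }.scan(unit, null);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        String code = "class Hello { int answer() { return 42; } void greet() {} }";
        System.out.println(declaredNames("Hello", code));
    }
}
```

Running `main` prints the collected names for the sample class. For type information you would call `task.analyze()` after parsing, which this sketch leaves out.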

Answer 1:

You may be able to take tools.jar from a JDK and use it. javac is open source, so you can also just grab that code (assuming you can deal with the license). ANTLR has grammars for Java as well.
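Whether the javac classes are usable depends on the runtime: a plain JRE has no system compiler, and from JDK 9 on there is no tools.jar because the API lives in the `jdk.compiler` module. A small sketch for probing this at run time follows; the class name `TreeApiCheck` and its two methods are made up for illustration.

```java
import javax.tools.ToolProvider;

// Hypothetical example class; not part of any library.
final class TreeApiCheck {

    /** True if the platform provides a system Java compiler (null means none). */
    static boolean hasSystemCompiler() {
        return ToolProvider.getSystemJavaCompiler() != null;
    }

    /** True if the Compiler Tree API classes are loadable from this class loader. */
    static boolean hasTreeApi() {
        try {
            Class.forName("com.sun.source.tree.CompilationUnitTree");
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version    = " + System.getProperty("java.version"));
        System.out.println("system compiler = " + hasSystemCompiler());
        System.out.println("com.sun.source  = " + hasTreeApi());
    }
}
```

On Java 8 and earlier the second check can fail even on a JDK unless tools.jar is on the classpath, which is why the two checks are kept separate.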



Answer 2:

Regarding your second question, there are dozens of Java parsers available in addition to Sun's. Here is a small sample:

  • Eclipse's org.eclipse.jdt.core.dom package.
  • Spoon, which outputs a very nice annotated parse tree with type information and variable binding (and uses Eclipse's parser internally).
  • ANTLR, which is a parser generator, but there are grammars for Java available.
  • javaparser (which I have not used).

My best advice is to try each of them to see which works best for your needs.



Answer 3:

I've used Eclipse's AST parser and found it to be pretty good (admittedly, I was using it as part of an Eclipse plug-in, so it made sense to use it). See "Exploring Eclipse's ASTParser".



Answer 4:

A working, simple-to-use Java parser is... JavaParser. The project has been active for some years already. While it was initially hosted on Google Code, it is now available on GitHub: https://github.com/javaparser/javaparser

It is quite simple to use and the number of dependencies is small. It is also available on Maven.

It has been in use for a few years, so it works quite well. It can also parse comments, and it lets you change the AST and regenerate the code.



Answer 5:

I've just come across Jexast, an extraction of the JDT's ASTParser that works independently of Eclipse (it depends on org.eclipse.jdt.internal.compiler.**).

I haven't tried it yet, but it does seem interesting.



Answer 6:

It is not the only way.

See our Java Front End, which is a full-featured Java parser built on top of the DMS Software Reengineering Toolkit. It parses Java and builds ASTs as internal data structures.

The point of DMS is that it provides a huge variety of additional useful machinery (attribute grammars, symbol tables, flow analysis, AST manipulation including access and update, as well as source-to-source transformations) to analyze and transform that AST into results and/or modified source code. If you get "just" a Java parser (e.g., JavaCC + Java grammar) you will, IMHO, not be able to do a lot with it. DMS makes it possible to do a lot, without having to invent all that extra machinery yourself.

If you really don't want to use the extra machinery DMS provides, it will dump the tree as XML.