Question:
I want to perform some transformations on C source code. I need a tool on Linux that generates a complete AST from the source code, so that I can apply my transformations to this AST and then convert it back to C source code. I tried ELSA but it is not getting compiled. (I am using Ubuntu 8.4). Can anyone suggest a better tool/application?
Answer 1:
I would recommend clang. It has a fairly complete C implementation with most gcc extensions, and the code is very understandable. Its C++ implementation is incomplete, but if you only care about generating ASTs from C code that should be fine. Depending on what you want to do, you can either use clang as a library and work with the ASTs directly, or have clang dump them to the console.
Answer 2:
See pycparser - a pure-Python AST generator for C.
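As a rough sketch of the parse, transform, regenerate cycle the question asks about, the snippet below uses pycparser's `c_parser`, `c_ast.NodeVisitor` and `c_generator` modules. The transformation itself (rewriting `expr * 2` as `expr << 1`) and the names `MulByTwoToShift`, `transform` and `SOURCE` are made up for illustration. Note that pycparser expects preprocessed C, and the regenerated code does not keep the original formatting or comments.

```python
from pycparser import c_ast, c_generator, c_parser

# Illustrative input; real code would first be run through the C preprocessor.
SOURCE = """
unsigned twice(unsigned x)
{
    return x * 2;
}
"""


class MulByTwoToShift(c_ast.NodeVisitor):
    """Hypothetical transformation: rewrite `expr * 2` as `expr << 1` in place."""

    def visit_BinaryOp(self, node):
        # Visit the operands first so nested expressions are rewritten too.
        self.generic_visit(node)
        if (node.op == "*"
                and isinstance(node.right, c_ast.Constant)
                and node.right.value == "2"):
            node.op = "<<"
            node.right.value = "1"


def transform(source):
    """Parse C source, apply the rewrite, and regenerate C source text."""
    ast = c_parser.CParser().parse(source, filename="<example>")
    MulByTwoToShift().visit(ast)
    return c_generator.CGenerator().visit(ast)


if __name__ == "__main__":
    print(transform(SOURCE))
```

The visitor mutates the tree in place, which keeps the example short; a larger transformation would probably build replacement nodes instead.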
Answer 3:
There are two projects that I'm aware of and that you could find useful:
- CIL
- Transformers
Both parse standard C source code to allow further analysis and transformation. I have not used them, so you will have to check for yourself whether they fit your needs.
The suggestion of using GCC is also valid, of course. I know there's not much documentation on this aspect of gcc, though.
Answer 4:
To get AST XML output, you can try cscan from MarpaX::Languages::C::AST. The output will look like:
<cscan>
<typedef_hash>
<typedef id="GLenum" before="unsigned int" after="" file="/usr/include/GL/gl.h"/>
...
Answer 5:
www.antlr.org
Answer 6:
http://ctool.sourceforge.net/
Answer 7:
Our DMS Software Reengineering Toolkit has been used on huge C systems, parsing, analyzing, transforming, and regenerating C code. It runs on Windows, and on Linux under Wine, and it handles Linux-style (GCC) C code.
I can't emphasize enough the ability to round-trip the C source code: parse, build trees, transform, and regenerate compilable C code with the comments, either pretty-printed or with the original programmer's indentation. Few of the other answers here suggest systems that can do that robustly.
The fact that DMS is designed to carry out program transformations (as opposed to other systems suggested in answers here) is also a great advantage. DMS provides tree-pattern matches and rewrites; it augments this with full control and data flow analysis, which can be used to extend the conditions that you'd like to match. A tool intended to be a compiler is just that, and you'll have a very hard time persuading it not to be a compiler and instead to be a transformation engine, as the OP requested.
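For readers unfamiliar with the term, here is a toy illustration of what tree-pattern matching and rewriting means in general. It is not DMS's notation or API: the tuple-based tree encoding, the `?a` metavariable syntax, and the functions `match`, `substitute`, `rewrite` and `to_c` are all assumptions invented for this sketch.

```python
# Trees are tuples: ("num", 1), ("var", "x"), ("add", left, right), ("mul", left, right).
# In a pattern, a string starting with "?" is a metavariable that matches any subtree.

RULES = [
    (("mul", "?a", ("num", 1)), "?a"),  # a * 1  ->  a
    (("add", "?a", ("num", 0)), "?a"),  # a + 0  ->  a
]


def match(pattern, tree, env):
    """Return the bindings if tree matches pattern, else None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in env:
            return env if env[pattern] == tree else None
        return {**env, pattern: tree}
    if (isinstance(pattern, tuple) and isinstance(tree, tuple)
            and len(pattern) == len(tree)):
        for sub_pattern, sub_tree in zip(pattern, tree):
            env = match(sub_pattern, sub_tree, env)
            if env is None:
                return None
        return env
    return env if pattern == tree else None


def substitute(template, env):
    """Fill the metavariables of a template with their bound subtrees."""
    if isinstance(template, str) and template.startswith("?"):
        return env[template]
    if isinstance(template, tuple):
        return tuple(substitute(part, env) for part in template)
    return template


def rewrite(tree, rules):
    """Apply the rules bottom-up until no rule matches at any node."""
    if not isinstance(tree, tuple):
        return tree
    tree = (tree[0],) + tuple(rewrite(child, rules) for child in tree[1:])
    for pattern, template in rules:
        env = match(pattern, tree, {})
        if env is not None:
            return rewrite(substitute(template, env), rules)
    return tree


def to_c(tree):
    """Regenerate a (fully parenthesized) C expression from a tree."""
    kind = tree[0]
    if kind == "num":
        return str(tree[1])
    if kind == "var":
        return tree[1]
    op = {"add": "+", "mul": "*"}[kind]
    return f"({to_c(tree[1])} {op} {to_c(tree[2])})"


if __name__ == "__main__":
    expr = ("add", ("mul", ("var", "x"), ("num", 1)), ("num", 0))  # (x * 1) + 0
    print(to_c(rewrite(expr, RULES)))
```

A real transformation system works on full language ASTs and can add semantic conditions (such as flow-analysis results) to each rule; the sketch only shows the purely syntactic core.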
See https://stackoverflow.com/a/2173477/120163 for example ASTs produced by DMS.
Answer 8:
I've done small amounts of work on source-to-source transformations, and I found CIL to be very powerful for this task. CIL has the advantage of being a framework specifically designed for static source analysis and transformation. It can also process code with any amount of ugly GCC-specific extensions. (It's been used to process the Linux kernel, as one example.) Unfortunately, it is written in OCaml, and analyses/transformations built using it must also be written in OCaml, which might be problematic if you've never used it.
Alternatively, clang is supposed to have a relatively easily hackable codebase, and it can certainly be used to produce C ASTs.
Answer 9:
How about taking gcc and writing a custom backend for it? I've never done it nor even worked on gcc source code, so I don't know how hard it would be.
Answer 10:
You can try generating an AST (abstract syntax tree) using Lex and Yacc on Linux:
lex and yacc
from lex and yacc to ast
Answer 11:
"I tried ELSA but it is not getting compiled. (I am using Ubuntu 8.4)"
The Elkhound and Elsa source code, version 2005.08.22b from scottmcpeak.com/elkhound/, is outdated (it uses old C++-style .h header files).
Elsa works as part of Oink (http://www.cubewano.org/oink/#Gettingthecode); I have just got it working under Ubuntu 9.10.