any C/C++ refactoring tool based on libclang? (eve

Posted 2019-01-12 17:29

Question:

As I've pointed out - here - it seems clang's libclang should be great for implementing the hard task that is C/C++ code analysis and modifications (check out video presentation and slides).

Do you know of any C/C++ refactoring tool based on libclang?

"Any" includes even a simple alpha-stage project that supports a single refactoring technique. It can lack preprocessor support. As an example of the functionality I mean: changing method names, whether across multiple files or only one file at a time. You might wonder what the goal is of asking for even small working examples. My thought is that collecting code examples and small tools in one place will provide a better resource for learning how to implement refactoring with libclang. I believe that bigger projects might grow from simple ones, in a proper open-source manner :).

Answer 1:

Clang contains a library called "CIndex" which was developed, I believe, for doing code completion in IDEs. It can also be used for parsing C++ and walking the AST, but doesn't have anything in the way of refactoring. See Eli Bendersky's article here.

I have started such a project recently: cmonster. It's a Python-based API for parsing C++ (using libclang), analyzing the AST, with an interface for "rewriting" (i.e. inserting/removing/modifying source ranges). There's no nice way (yet) for doing things like modifying function names and having that translated into source-modifications, but it wouldn't be terribly difficult to do that.
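To make the "rewriting" idea concrete, here is a minimal sketch in plain Python of applying insert/remove/modify edits to source ranges. It is not cmonster's API: the `Edit` type and `apply_edits` function are hypothetical names made up for illustration, and the offsets are assumed to come from an AST walk (here they are found with a plain string search).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Edit:
    """Replace source[start:end] with `text` (hypothetical helper type).

    start == end means a pure insertion; text == "" means a pure removal.
    """
    start: int
    end: int
    text: str


def apply_edits(source: str, edits: Iterable[Edit]) -> str:
    """Apply non-overlapping edits given as offsets into the ORIGINAL source."""
    ordered = sorted(edits, key=lambda e: e.start)
    # Overlapping ranges have no well-defined result, so reject them.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start < prev.end:
            raise ValueError(f"overlapping edits: {prev} and {cur}")
    # Work from the end of the buffer backwards so earlier offsets stay valid.
    result = source
    for e in reversed(ordered):
        result = result[:e.start] + e.text + result[e.end:]
    return result


if __name__ == "__main__":
    src = "int foo();\nint x = foo();\n"
    # In a real tool these offsets would come from AST source locations.
    first = src.find("foo")
    second = src.find("foo", first + 1)
    edits = [Edit(first, first + 3, "bar"), Edit(second, second + 3, "bar")]
    print(apply_edits(src, edits))
```

Applying the edits back to front is the design choice that matters here: every edit can be expressed against the original text, so the code that finds the ranges never has to track how earlier edits shifted later ones.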

I have not yet created a release with this functionality (although it's in the github repo), as I'm waiting for llvm/clang 3.0 to be released.

Also, I should point out a couple of things:

  • The code is very rough, calling it alpha would be perhaps generous.
  • I'm by no means an expert on this subject (unlike, say, Dr. Ira Baxter over there).

Adjust expectations appropriately.

Update: cmonster 0.2 has been released, which includes the described features. Check it out on Github.



Answer 2:

Google has been working on a tooling library for Clang, which has been included since the 3.2 release. It includes an ASTMatchers library, so you can build up a query instead of walking the AST by hand.

There is a great video talk on the subject that walks through a simple rename example. (This is from the same guy as the MapReduce talk posted above but is newer and more about a simple practical implementation rather than the internal design and enterprise scale stuff Google have going on).

The source for that example, which renames a method, is available in the tooling branch. It may be somewhere in trunk, but I can't find it. (Also, rename the getDeclAs function to getNodeAs, as the former is apparently deprecated.) There is a more advanced example that removes duplicated c_str calls (it is in trunk, and someone posted it in another answer here).

Here is documentation for LibASTMatchers and LibTooling.

EDIT: Some better docs for ASTMatcher. Here and here.

EDIT: Google are now working on something called Clangd which aims to be some kind of Clang server for refactoring.



Answer 3:

Google made a Clang-based refactoring tool for their C++ code base and plans to release it. I don't know the current state of the project, but you can see this demo presented at the 2011 LLVM Developers' Meeting: https://www.youtube.com/watch?v=mVbDzTM21BQ.

Also, Xcode's (4+) built-in auto-completion and refactoring functions are based on libclang.



Answer 4:

This may be a bit 'meta', but there's an example that's written in clang as a tool to run on clang (although there's more to it than just that).

RemoveCStrCalls.cpp

//  This file implements a tool that prints replacements that remove redundant
//  calls of c_str() on strings.
//
//  Usage:
//  remove-cstr-calls <cmake-output-dir> <file1> <file2> ...
//
//  Where <cmake-output-dir> is a CMake build directory in which a file named
//  compile_commands.json exists (enable -DCMAKE_EXPORT_COMPILE_COMMANDS in
//  CMake to get this output).
//
//  <file1> ... specify the paths of files in the CMake source tree. This path
//  is looked up in the compile command database. If the path of a file is
//  absolute, it needs to point into CMake's source tree. If the path is
//  relative, the current working directory needs to be in the CMake source
//  tree and the file must be in a subdirectory of the current working
//  directory. "./" prefixes in the relative files will be automatically
//  removed, but the rest of a relative path must be a suffix of a path in
//  the compile command line database.
//
//  For example, to use remove-cstr-calls on all files in a subtree of the
//  source tree, use:
//
//    /path/in/subtree $ find . -name '*.cpp'|
//        xargs remove-cstr-calls /path/to/source
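
The path lookup that this header comment describes can be sketched roughly as follows. This is a simplified Python illustration, not the tool's actual implementation: `load_compile_commands` and `find_entry` are hypothetical names, and the only things assumed about compile_commands.json are its documented "directory" and "file" keys.

```python
import json
import os


def load_compile_commands(build_dir):
    """Read compile_commands.json from a CMake build directory."""
    path = os.path.join(build_dir, "compile_commands.json")
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)  # a list of {"directory", "file", ...} objects


def find_entry(entries, path):
    """Return the first database entry matching `path`, or None.

    Absolute paths must match exactly; relative paths have a leading "./"
    stripped and must then be a whole-component suffix of a database path.
    (Simplified sketch of the rules in the comment above.)
    """
    wanted = os.path.normpath(path)  # also drops a leading "./"
    for entry in entries:
        # "file" may be relative to "directory"; join() keeps absolute files as-is.
        full = os.path.normpath(os.path.join(entry["directory"], entry["file"]))
        if os.path.isabs(wanted):
            if full == wanted:
                return entry
        elif full == wanted or full.endswith(os.sep + wanted):
            return entry
    return None


if __name__ == "__main__":
    # Hypothetical in-memory database, standing in for a real build directory.
    entries = [
        {"directory": "/build", "file": "/src/proj/lib/a.cpp",
         "command": "c++ -c /src/proj/lib/a.cpp"},
    ]
    print(find_entry(entries, "./lib/a.cpp"))
```

Matching on whole path components (the `os.sep` prefix in the suffix test) avoids a false hit where, say, `ib/a.cpp` would otherwise match `lib/a.cpp`.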


Answer 5:

https://github.com/lukhnos/refactorial is based on clang and claims to provide the following:

Transforms Provided

Accessor: Synthesize getters and setters for designated member variables

MethodMove: Move inlined member function bodies to the implementation file

ExtractParameter: Promote a function variable to a parameter of that function

TypeRename: Rename types, including tag types (enum, struct, union, class), template classes, Objective-C types (class and protocol), typedefs and even built-in types (e.g. unsigned to uint32_t)

RecordFieldRename: Rename record (struct, union) fields, including C++ member variables

FunctionRename: Rename functions, including C++ member functions

Works via specifications in a YAML configuration file. I haven't tried it out (yet).



Answer 6:

Not open source, but it has been used to carry out very non-toy, massive automated refactoring of C++ programs: our DMS Software Reengineering Toolkit. DMS is a "library" (we call it a "toolkit") of facilities one can compose to achieve analysis and/or automated translation.

Relevant to C++, DMS provides at this point in time:

  • Full C++11 parser, constructing the AST and able to regenerate source code accurately including comments, with a complete preprocessor
  • Full C++ parser with name and type resolution for C++ (ANSI, GNU, MS Visual C++)
  • Control flow analysis for C++
  • Source-to-source transformations
  • Partially complete "rename" machinery (see discussion below)

What I can say from experience is that C++ is a bitch of a language to transform.

We continue to work on it, and are completing a reliable renaming tool. Even this is hard; a key problem is the name-shadowing problem. You have a local variable X, and a reference to Y inside that scope; you attempt to rename Y to X and discover that the local variable "captures" the access. It is amazing how many namespaces and capture types you have to worry about in C++. And this is needed as a foundation for many other refactorings.
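
To illustrate the capture problem described above, here is a toy Python model of nested lexical scopes only. Real C++ name lookup (namespaces, class scopes, templates, argument-dependent lookup) is far more involved, and nothing here reflects how DMS works; `Scope` and `rename_is_captured` are hypothetical names for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)
class Scope:
    """A lexical scope: names declared directly in it, plus its parent."""
    names: Set[str] = field(default_factory=set)
    parent: Optional["Scope"] = None

    def resolve(self, name: str) -> Optional["Scope"]:
        """Return the innermost enclosing scope that declares `name`."""
        scope: Optional[Scope] = self
        while scope is not None:
            if name in scope.names:
                return scope
            scope = scope.parent
        return None


def rename_is_captured(decl_scope: Scope, old: str, new: str,
                       use_scopes: List[Scope]) -> bool:
    """True if renaming `old` (declared in decl_scope) to `new` would make
    some reference bind to a different declaration than before."""
    if new in decl_scope.names:
        return True  # direct clash in the declaring scope
    for use in use_scopes:
        if use.resolve(old) is not decl_scope:
            raise ValueError("reference does not bind to the renamed declaration")
        # Walk outward from the use site; any scope declaring `new` before
        # we reach decl_scope would capture the renamed reference.
        scope: Optional[Scope] = use
        while scope is not None and scope is not decl_scope:
            if new in scope.names:
                return True
            scope = scope.parent
    return False


if __name__ == "__main__":
    outer = Scope(names={"Y"})                # Y declared in the outer scope
    inner = Scope(names={"X"}, parent=outer)  # local X; Y is referenced here
    print(rename_is_captured(outer, "Y", "X", [inner]))
    print(rename_is_captured(outer, "Y", "Z", [inner]))
```

This sketch only checks one direction (renamed references being captured by a nearer declaration); a real tool also has to check whether existing references to the new name would start binding to the renamed declaration.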

EDIT Feb 2014: Full C++14 parser, control flow analysis, local data flow analysis



Answer 7:

Another possibility is to develop your own plugin for GCC, or to develop a GCC MELT extension to do your task. But extending GCC (or Clang) requires understanding the internal representations of these compilers (Gimple & Tree for GCC), and this requires some work. MELT is a high-level domain-specific language for extending GCC.



Answer 8:

It's completion rather than refactoring, but it might be useful:

  • clang complete : Use of Clang for completing C, C++, Objective-C and Objective-C++