How to parse and do simple analysis of C/C++ code from C#

Posted 2019-04-07 17:08

I need to go through a C/C++ file and extract the list of classes and methods, along with where they are located in the file.

Is libclang the best option? Or is it "too much" for the task?

Would it be better to just look for pairing brackets?

If libclang is the right choice: is there a way to invoke it from C#?

Thanks!

6 answers
贼婆χ
#2 · 2019-04-07 17:20

You could consider ctags, which is available on many platforms. Its output is easy to parse and contains the information you need.

To answer your question I had to look through the many options available, and after a little while I found a working combination. For example:

ctags -N -x --c-kinds=+p crowd.*

produces this output

CrowdSim         class        44 crowd.h          class CrowdSim
CrowdSim         function     47 crowd.h          CrowdSim( const std::string& contentDir ) : _contentDir( contentDir ) {}
Particle         function     35 crowd.h          Particle()
Particle         struct       25 crowd.h          struct Particle
_contentDir      member       56 crowd.h          std::string _contentDir;
_crowd_H_        macro        18 crowd.h          #define _crowd_H_
_particles       member       57 crowd.h          std::vector< Particle > _particles;
animTime         member       32 crowd.h          float animTime;
chooseDestination function     24 crowd.cpp        void CrowdSim::chooseDestination( Particle &p )
chooseDestination prototype    53 crowd.h          void chooseDestination( Particle &p );
dx               member       28 crowd.h          float dx, dz; // Destination position
dz               member       28 crowd.h          float dx, dz; // Destination position
fx               member       29 crowd.h          float fx, fz; // Force on particle
fz               member       29 crowd.h          float fx, fz; // Force on particle
init             function     35 crowd.cpp        void CrowdSim::init()
init             prototype    49 crowd.h          void init();
node             member       31 crowd.h          H3DNode node;
ox               member       30 crowd.h          float ox, oz; // Orientation vector
oz               member       30 crowd.h          float ox, oz; // Orientation vector
px               member       27 crowd.h          float px, pz; // Current postition
pz               member       27 crowd.h          float px, pz; // Current postition
update           function     68 crowd.cpp        void CrowdSim::update( float fps )
update           prototype    50 crowd.h          void update( float fps );

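(note: -x prints a human-readable cross-reference table to standard output instead of writing a tags file; it is only there for easy inspection)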
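A minimal sketch of consuming that output, assuming the five whitespace-separated columns shown above (name, kind, line, file, source text) and file names without spaces. It is written in Python for brevity; the same split-into-five-fields logic should carry over to C# with little change. `Tag` and `parse_ctags_xref` are illustrative names, not part of ctags.

```python
from typing import List, NamedTuple


class Tag(NamedTuple):
    """One row of `ctags -x` output (hypothetical record type)."""
    name: str
    kind: str
    line: int
    file: str
    source: str


def parse_ctags_xref(output: str) -> List[Tag]:
    """Split each row into name, kind, line, file and the trailing source text."""
    tags = []
    for raw in output.splitlines():
        # At most 4 splits, so the source text keeps its internal spaces.
        parts = raw.split(None, 4)
        if len(parts) < 4 or not parts[2].isdigit():
            continue  # skip blank or unexpected rows
        name, kind, line, file = parts[:4]
        source = parts[4].strip() if len(parts) == 5 else ""
        tags.append(Tag(name, kind, int(line), file, source))
    return tags


# A few rows copied from the output above.
SAMPLE = """\
CrowdSim         class        44 crowd.h          class CrowdSim
init             function     35 crowd.cpp        void CrowdSim::init()
init             prototype    49 crowd.h          void init();
"""

if __name__ == "__main__":
    for tag in parse_ctags_xref(SAMPLE):
        if tag.kind in ("class", "struct", "function"):
            print(f"{tag.file}:{tag.line}: {tag.kind} {tag.name}")
```

Filtering on the kind column is enough to separate classes and structs from functions and prototypes; the line column gives the location asked for in the question.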

我想做一个坏孩纸
#3 · 2019-04-07 17:24

To do this well, you really need something that contains a full C++ parser.

Our DMS Software Reengineering Toolkit with its C++ Front End could be used for this. It can provide the precise entity declarations, including types, together with their context (class/namespace/...) and precise file positions. DMS provides access to all this information as a set of ASTs and related symbol tables; you build custom code to navigate them and take what you want.

Depending on your needs, you may find that the information you want is difficult to process using vanilla C#. The type information in its full glory is pretty complex, because C++ is a complex language. If you want to process that information, you'll want to "stay inside" DMS, where all the machinery to do that is present. If all you want is the names and type information as text strings, you can get DMS to prettyprint the data in that form; it has standard libraries supporting such activities. An intermediate answer would be to export the data in XML format. DMS provides direct support for exporting arbitrary AST fragments but only indirect support for writing type information out as XML, although that wouldn't be hard to customize.

EDIT (in response to the OP's comment on another answer): DMS can provide precise information about both the method signature and the method body. It has full AST and type information for both.

ら.Afraid
#4 · 2019-04-07 17:32

I'm not sure which option is best, but you could also take a look at GCC-XML or Mono/CXXI. The latter uses GCC-XML internally, but also provides C# interfaces to the C++ class definitions.

libclang is a C library and should therefore be usable from .NET via P/Invoke, but it might be quite tedious to repeat all the necessary declarations in C#.

Luminary・发光体
#5 · 2019-04-07 17:35

It's better to use a full parser, IMO. You can use ANTLR: C/C++ grammars are available for it, and it can generate parsers in C#.

一纸荒年 Trace。
#6 · 2019-04-07 17:41

Another angle would be to create an extension for Visual Studio.

▲ chillily
#7 · 2019-04-07 17:45

If you want to use Clang, I recommend you take a look at this page. It demonstrates how to get all virtual methods from a file. Once you understand this simple example, you can create more complex so-called matchers.
