Tool to Scan Code Comments and Convert to a Standard Format

Published 2019-07-03 22:45

I'm working on a C project that has seen many different authors and many different documentation styles.

I'm a big fan of doxygen and other documentation generation tools, and I would like to migrate this project to use one of these systems.

Is anybody aware of a tool that can scan source code comments for keywords like "Description", "Author", "File Name" and other sorts of context, and intelligently convert the comments to a standard format? If not, I suppose I could write a crazy script, or convert manually.
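For the "crazy script" option, here is a rough sketch of the shape it might take (in Python; the comment pattern and the field names are illustrative guesses, not taken from any real code base):

    import re

    # Illustrative patterns: match /* ... */ blocks and "Keyword: value" lines inside them.
    BLOCK_RE = re.compile(r"/\*.*?\*/", re.DOTALL)
    FIELD_RE = re.compile(r"^\s*\*?\s*(File Name|Author|Description)\s*:\s*(.*)$", re.IGNORECASE)

    def extract_fields(source):
        """Yield a {field: value} dict for each block comment that contains known fields."""
        for match in BLOCK_RE.finditer(source):
            fields = {}
            for line in match.group(0).splitlines():
                m = FIELD_RE.match(line)
                if m:
                    fields[m.group(1).title()] = m.group(2).strip()
            if fields:
                yield fields

    sample = "/*\n * File Name: foo.c\n * Author: J. Smith\n * Description: Frobnicates widgets.\n */"
    for info in extract_fields(sample):
        print(info)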

Thanks

3 answers
冷血范
Answer 2 · 2019-07-03 23:00

If you have a relatively limited set of styles to parse, it would be fairly simple to write a Visual Studio macro (for use in the IDE) or a standalone application (for processing the source code 'offline') that searches a file for comments and then reformats them into a new style, using certain titles or tags to split them apart.
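As a minimal sketch of that reformatting step, assuming the legacy fields have already been split out into a dictionary (the Doxygen tags below are real, but the mapping from legacy field names to tags is just a guess):

    # Hypothetical mapping from legacy field names to Doxygen tags; adjust to taste.
    def to_doxygen(fields):
        """Build a Doxygen-style block comment from extracted legacy fields."""
        lines = ["/**"]
        if "File Name" in fields:
            lines.append(" * @file " + fields["File Name"])
        if "Description" in fields:
            lines.append(" * @brief " + fields["Description"])
        if "Author" in fields:
            lines.append(" * @author " + fields["Author"])
        lines.append(" */")
        return "\n".join(lines)

    print(to_doxygen({"File Name": "foo.c", "Author": "J. Smith",
                      "Description": "Frobnicates widgets."}))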

A shortcut that may help you is to use my AtomineerUtils Pro Documentation add-in. It can find and convert all the comments in a source file in one pass. Out of the box it parses XML Documentation, Doxygen, JavaDoc and Qt formats (or anything sufficiently close to them) and can then output the comments in any of those formats. It can also be configured to convert incompatible legacy comments. There are several options to aid conversion, but the most powerful calls a Visual Studio macro with the comment text before it parses it, allowing you to apply a bit of string processing to convert legacy comments into a format that AtomineerUtils can subsequently read. An example macro for one of the most commonly used legacy styles is supplied on the website, so it's usually pretty simple to modify it to cope with your legacy format, as long as that format is regular enough for a computer to parse.

The converted text need not be particularly tidy: once AtomineerUtils can extract the documentation entries, it cleans up the comment for you. It optionally applies word wrapping, consistent element ordering, spacing and so on automatically, ensures that the comment accurately describes the code element it documents (its entries match the params, typeparams, exceptions thrown, etc.), and then outputs a replacement comment in its configured format. This saves you a lot of work in the conversion macro to get things tidy, and once you have finished converting you can continue to use the add-in to save time documenting your code and to ensure that all new comments keep to the same style.

Answer 3 · 2019-07-03 23:02

The only one I can think of comes from the O'Reilly book on Lex & Yacc: there is a section in chapter 2 that shows how to parse code for comments, covering both the // and /*..*/ styles, and print them on the command line. There's a link on the book's page to the examples; download the file progs.zip, and the file you're looking for is ch2-09.l. It needs to be built, and it can easily be modified to output the comments. That output can then be fed to a script that filters out 'Name', 'Description', etc.
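As a rough sketch of that filtering step (assuming the modified Lex example prints each comment line to standard output; the keyword list is only illustrative):

    import sys

    # Keep only comment lines whose leading word is one of these (purely illustrative).
    KEYWORDS = ("Name", "Description", "Author")

    for line in sys.stdin:
        stripped = line.lstrip("/* \t")              # drop comment punctuation and indentation
        if stripped.split(":", 1)[0].strip() in KEYWORDS:
            print(stripped.rstrip())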

I can post instructions here on how to do this if you are interested.

Edit: I think I have found what you are looking for: a prebuilt comment documentation extractor here.

Hope this helps. Best regards, Tom.

地球回转人心会变
Answer 4 · 2019-07-03 23:12

I think as tommieb75 suggests, a proper parser is the way to handle this.

I'd suggest looking at ANTLR, since it supports rewriting the token buffer in place, which I think would minimise the work you have to do to preserve whitespace and layout; see chapter 9.7 of The Definitive ANTLR Reference.
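Not ANTLR itself, but as a much simpler sketch of the same idea (rewrite only the comments and leave everything else untouched), a regex substitution with a callback keeps all surrounding code and whitespace intact:

    import re

    BLOCK_RE = re.compile(r"/\*.*?\*/", re.DOTALL)

    def rewrite_comments(source, convert):
        """Replace each /* ... */ block with convert(block); everything else is preserved as-is."""
        return BLOCK_RE.sub(lambda m: convert(m.group(0)), source)

    # Stand-in converter: upper-case the comment (a real one would emit Doxygen instead).
    print(rewrite_comments("int x; /* counter */\nint y;", str.upper))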
