How can I strip multiline C comments from a file u

Posted 2019-01-11 12:07

Question:

Can anyone help me with a regular expression to strip multiline and single-line comments from a file?

e.g.:

The whole "/*...*/" has to be stripped off.

1.   /* comment */
2.   /* comment1 */  code   /* comment2 */ # both /*comment1*/ and /*comment2*/
                                           # have to be stripped off, and the rest
                                           # should remain.
3.   /*.........
       .........
       .........
       ......... */

I would really appreciate any help with this. Thanks in advance.

Answer 1:

Including tests:

use strict;
use warnings;
use Test::More qw(no_plan);
sub strip_comments {
  my $string=shift;
  $string =~ s#/\*.*?\*/##sg; #strip multiline C comments
  return $string;
}
is(strip_comments('a/* comment1 */  code   /* comment2 */b'),'a  code   b');
is(strip_comments('a/* comment1 /* comment2 */b'),'ab');
is(strip_comments("a/* comment1\n\ncomment */ code /* comment2 */b"),'a code b');


Answer 2:

From perlfaq6 "How do I use a regular expression to strip C style comments from a file?":


While this actually can be done, it's much harder than you'd think. For example, this one-liner

perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

will work in many but not all cases. You see, it's too simple-minded for certain kinds of C programs, in particular, those with what appear to be comments in quoted strings. For that, you'd need something like this, created by Jeffrey Friedl and later modified by Fred Curtis.

$/ = undef;
$_ = <>;
s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
print;

This could, of course, be more legibly written with the /x modifier, adding whitespace and comments. Here it is expanded, courtesy of Fred Curtis.

s{
   /\*         ##  Start of /* ... */ comment
   [^*]*\*+    ##  Non-* followed by 1-or-more *'s
   (
     [^/*][^*]*\*+
   )*          ##  0-or-more things which don't start with /
               ##    but do end with '*'
   /           ##  End of /* ... */ comment

 |         ##     OR  various things which aren't comments:

   (
     "           ##  Start of " ... " string
     (
       \\.           ##  Escaped char
     |               ##    OR
       [^"\\]        ##  Non "\
     )*
     "           ##  End of " ... " string

   |         ##     OR

     '           ##  Start of ' ... ' string
     (
       \\.           ##  Escaped char
     |               ##    OR
       [^'\\]        ##  Non '\
     )*
     '           ##  End of ' ... ' string

   |         ##     OR

     .           ##  Any other char
     [^/"'\\]*   ##  Chars which don't start a comment, string or escape
   )
 }{defined $2 ? $2 : ""}gxse;

A slight modification also removes C++ comments, possibly spanning multiple lines using a continuation character:

 s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse;
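To see what the string-aware pattern buys you, here is a minimal sketch that wraps the C-only substitution from the FAQ in a function. The name strip_c_comments and the sample input are made up for illustration. The inner groups are written as non-capturing here, so the kept text ends up in $1 rather than $2 as in the FAQ version.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perlfaq6): applies the FAQ's C-only
# pattern to a string and returns the result.
sub strip_c_comments {
    my ($src) = @_;
    $src =~ s{
        /\*[^*]*\*+(?:[^/*][^*]*\*+)*/   # a complete C block comment
      |
        (                                # group 1: text that must be kept
            "(?:\\.|[^"\\])*"            # double-quoted string literal
          | '(?:\\.|[^'\\])*'            # single-quoted character literal
          | .[^/"'\\]*                   # any other run of characters
        )
    }{defined $1 ? $1 : ""}gxse;
    return $src;
}

# Illustrative input: comment-like text inside a string, then two real comments.
my $code = qq{printf("/* not a comment */"); /* real comment */\nint x; /* multi\nline */\n};
print strip_c_comments($code);
```

With this input the string literal should come through untouched while both real comments disappear, including the one that spans two lines.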


Answer 3:

As is often the case in Perl, you can reach for CPAN: Regexp::Common::Comment should help you. The one language I found that uses the comments you described is Nickle, but maybe PHP comments would be OK (// can also start a single-line comment).

Note that, in any case, using regexps to strip out comments is dangerous; a full parser for the language is much less risky. A regexp-based parser, for example, is likely to get confused by something like print "/*";.
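To make that failure concrete, here is a small sketch using the simple non-greedy pattern quoted in the other answers. The function name strip_naive and the sample line are only illustrative.

```perl
use strict;
use warnings;

# Illustrative wrapper around the simple non-greedy pattern.
sub strip_naive {
    my ($src) = @_;
    $src =~ s{/\*.*?\*/}{}gs;
    return $src;
}

# The first "/*" sits inside a string literal, but the regex cannot know that.
my $code = qq{print "/*"; x = 1; /* note */ y = 2;\n};
print strip_naive($code);
```

The match starts at the /* inside the string and runs to the end of the real comment, so the closing quote and the x = 1; statement are swallowed along with it, leaving print " y = 2;.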



Answer 4:

This is a FAQ:

perldoc -q comment

Found in perlfaq6:

How do I use a regular expression to strip C style comments from a file?

While this actually can be done, it's much harder than you'd think. For example, this one-liner ...



Answer 5:

There is also a non-perl answer: use the program stripcmt:

StripCmt is a simple utility written in C to remove comments from C, C++, and Java source files. In the grand tradition of Unix text processing programs, it can function either as a FIFO (First In - First Out) filter or accept arguments on the commandline.



Answer 6:

Remove /* */ comments (including multi-line)

s/\/\*.*?\*\///gs

I post this because it is simple; however, I believe it will trip up on nested comments like

/* sdafsdfsdf /*sda asd*/ asdsdf */

But as they are fairly uncommon, I prefer the simple regex.
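For what it's worth, here is a quick sketch of what the simple regex does with that nested example. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $text = '/* sdafsdfsdf /*sda asd*/ asdsdf */';

# Same substitution as above, applied to a copy of the string.
(my $out = $text) =~ s/\/\*.*?\*\///gs;

# Brackets make the leading space in the leftover text visible.
print "[$out]\n";
```

The non-greedy match stops at the first */, so the tail " asdsdf */" is left behind. Standard C does not nest comments either, so a C compiler would also end the comment at that first */.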



Tags: c perl comments