Regexp to find C comments

Posted 2019-07-25 13:18

I need a regexp in Ruby to find the comment before a C instruction.

For example, I have this file, example.c:

/*
 * COMMENT NUMBER 1
 */
x = rb_define_class_under (foo, "MyClassName1", bar);

/*
 * COMMENT NUMBER 2
 */
y = rb_define_class_under (foo, "MyClassName2", bar);

/*
 * COMMENT NUMBER 3
 */
z = rb_define_class_under (foo, "MyClassName3", bar);

Then I have my parser in Ruby, parser.rb, like this:

content = File.read('example.c')

if content =~ /((?>\/\*.*?\*\/))([\w\.\s]+\s=\s)?rb_define_class_under.*?"(MyClassName1)"/m
  puts "Comment number 1 is:"
  puts $1
end

if content =~ /((?>\/\*.*?\*\/))([\w\.\s]+\s=\s)?rb_define_class_under.*?"(MyClassName2)"/m
  puts "Comment number 2 is:"
  puts $1
end

if content =~ /((?>\/\*.*?\*\/))([\w\.\s]+\s=\s)?rb_define_class_under.*?"(MyClassName3)"/m
  puts "Comment number 3 is:"
  puts $1
end

Now the output I expect is this:

Comment number 1 is:
/*
 * COMMENT NUMBER 1
 */
Comment number 2 is:
/*
 * COMMENT NUMBER 2
 */
Comment number 3 is:
/*
 * COMMENT NUMBER 3
 */

But I get:

Comment number 1 is:
/*
 * COMMENT NUMBER 1
 */
Comment number 2 is:
/*
 * COMMENT NUMBER 1
 */
Comment number 3 is:
/*
 * COMMENT NUMBER 1
 */

Any idea? What is the right regexp to get the expected output?

1 answer
三岁会撩人
#2 · 2019-07-25 14:14

Try adding .* to the beginning of the regex.

As written, your regex starts matching at the first comment in the file. Because of the m flag, the lazy .*? after rb_define_class_under can cross line boundaries, so it simply expands until it reaches the class name you are looking for. The match therefore succeeds from the first comment every time, and $1 is always comment number 1.

A greedy .* at the beginning consumes as much of the input as it can and then backtracks only as far as it has to, so the capture group starts at the last /* before the class name you want.

Example: http://www.rubular.com/r/Orja089zAI

Note that the overall match still starts at the beginning of the string; only the first capture group ($1) is narrowed to the correct comment.
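
Here is a minimal sketch of that change applied to the patterns from the question. The file contents are inlined as a heredoc so the snippet runs without example.c, and comment_before is a hypothetical helper name introduced only for illustration. Regexp.escape is an extra precaution that the original patterns did not use.

```ruby
# Inline copy of example.c from the question (assumption: same contents).
content = <<~SRC
  /*
   * COMMENT NUMBER 1
   */
  x = rb_define_class_under (foo, "MyClassName1", bar);

  /*
   * COMMENT NUMBER 2
   */
  y = rb_define_class_under (foo, "MyClassName2", bar);

  /*
   * COMMENT NUMBER 3
   */
  z = rb_define_class_under (foo, "MyClassName3", bar);
SRC

# Hypothetical helper: returns the comment directly before the
# rb_define_class_under call that names class_name, or nil if none matches.
def comment_before(content, class_name)
  # The leading greedy .* is the only change to the original pattern.
  pattern = /.*((?>\/\*.*?\*\/))([\w\.\s]+\s=\s)?rb_define_class_under.*?"(#{Regexp.escape(class_name)})"/m
  match = pattern.match(content)
  match && match[1]
end

%w[MyClassName1 MyClassName2 MyClassName3].each_with_index do |name, index|
  puts "Comment number #{index + 1} is:"
  puts comment_before(content, name)
end
```

With the sample input above, each call should return the comment sitting immediately above the matching line, which is the output the question asks for.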
