Removing comments in JavaScript using Ruby

Posted 2019-08-01 08:00

I need a regex (or some other approach) to remove this kind of comment:

/*!
 * Foo Bar
 */

I tried /(\/*!.**\/)/m, but it fails. Any suggestions?

7 answers
霸刀☆藐视天下
Reply #2 · 2019-08-01 08:46

To do it accurately and efficiently, there is a better regex:

regexp = /\/\*![^*]*\*+(?:[^*\/][^*]*\*+)*\//
result = subject.gsub(regexp, '')

Jeffrey Friedl covers this specific problem at length (using C-comments as an example) in his classic work: Mastering Regular Expressions (3rd Edition). Here is a breakdown of the regex which illustrates the "Unrolling-the-Loop" efficiency technique.

regexp_long = / # Match she-bang style C-comment
    \/\*!       # Opening delimiter.
    [^*]*\*+    # {normal*} Zero or more non-*, one or more *
    (?:         # Begin {(special normal*)*} construct.
      [^*\/]    # {special} a non-*, non-\/ following star.
      [^*]*\*+  # More {normal*}
    )*          # Finish "Unrolling-the-Loop"
    \/          # Closing delimiter.
    /x
result = subject.gsub(regexp_long, '')

Note that this regex does not need Ruby's 'm' dot-matches-all modifier because it does not use the dot!
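As a quick check, here is a minimal usage sketch. The sample script in `source` and the constant name `BANG_COMMENT` are made up for illustration; only the regex itself comes from the answer above.

```ruby
# Regex from the answer: matches /*! ... */ comments without using the dot.
BANG_COMMENT = /\/\*![^*]*\*+(?:[^*\/][^*]*\*+)*\//

# Hypothetical sample input (not from the original question).
source = <<~JS
  /*!
   * Foo Bar
   */
  var x = 1; /* ordinary comment stays */
  /*! one-liner */var y = 2;
JS

# Remove every /*! ... */ comment; plain /* ... */ comments are left alone
# because the pattern requires the "!" right after the opening "/*".
stripped = source.gsub(BANG_COMMENT, '')
puts stripped
```

Both the multi-line and the one-line `/*!` comments are removed, while the ordinary `/* ... */` comment survives, and no `m` modifier is involved.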

Additional: So how much more efficient is this regex than the simpler /\/\*!.*?\*\//m expression? Using the RegexBuddy debugger, I measured how many steps each regex took to match a comment. The test comment and the results for both the matching and non-matching cases follow (for the non-matching case I simply removed the last / from the comment):

/*!
 * This is the example comment
 * Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar
 * Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar
 * Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar
 */

REGEX                                   STEPS TO MATCH   STEPS TO NON-MATCH
/\/\*!.*?\*\//m                                    488                  491
/\/\*![^*]*\*+(?:[^*\/][^*]*\*+)*\//                23                   29

As you can see, the lazy-dot solution (which must backtrack once for each and every character in the comment) is much less efficient. Note also that the efficiency difference becomes even more pronounced as comments get longer.
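The step counts above come from RegexBuddy. A rough way to compare the two expressions in Ruby itself is to time them on a long comment. This is only a sketch: the input size, the iteration count, and the `seconds` helper are assumptions chosen for illustration, and the timings will vary with the Ruby version and its regex engine, so no particular result is claimed here.

```ruby
LAZY_DOT = /\/\*!.*?\*\//m
UNROLLED = /\/\*![^*]*\*+(?:[^*\/][^*]*\*+)*\//

# Hypothetical test input: one long /*! ... */ comment followed by code.
body    = " * Foo Bar Foo Bar Foo Bar Foo Bar\n" * 2_000
subject = "/*!\n#{body} */\nvar x = 1;\n"

# Illustrative helper: time a block using the monotonic clock.
def seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

lazy_time     = seconds { 100.times { subject.gsub(LAZY_DOT, '') } }
unrolled_time = seconds { 100.times { subject.gsub(UNROLLED, '') } }

printf("lazy dot: %.4fs\nunrolled: %.4fs\n", lazy_time, unrolled_time)
```

Wall-clock time is not the same measure as RegexBuddy's step count, so treat the numbers as a sanity check rather than a reproduction of the table.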

CAVEAT: Note that this regex will fail if the opening delimiter occurs inside a string literal, e.g. "This string has a /*! in it!". To do this correctly with 100% accuracy, you will need to fully parse the script.
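The failure described in the caveat can be reproduced, and partly mitigated, by matching quoted strings before comments and putting the strings back unchanged. This is a swapped-in workaround, not part of the original answer, and it is still not a parser: it ignores JavaScript regex literals, template literals, and ordinary comments. The names `JS_STRING`, `STRING_OR_COMMENT`, and `strip_bang_comments` are hypothetical.

```ruby
BANG_COMMENT = /\/\*![^*]*\*+(?:[^*\/][^*]*\*+)*\//

# Double- or single-quoted JavaScript string, allowing backslash escapes.
JS_STRING = /"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*'/

# Try a string literal first (captured in group 1), then a bang comment.
STRING_OR_COMMENT = /(#{JS_STRING})|#{BANG_COMMENT}/

# Hypothetical helper: keep strings untouched, delete bang comments.
def strip_bang_comments(js)
  # Group 1 is nil when the comment branch matched, so to_s yields "".
  js.gsub(STRING_OR_COMMENT) { Regexp.last_match(1).to_s }
end

js = %q{var s = "has a /*! in it!"; /*! real comment */ var t = 'x */';}

broken = js.gsub(BANG_COMMENT, '')   # starts matching inside the string
safer  = strip_bang_comments(js)     # strings are skipped over intact

puts broken
puts safer
```

With the plain regex, the match begins at the `/*!` inside the string and runs to the end of the real comment, cutting the string in half. The alternation consumes each string literal as a unit first, so only the real comment is removed.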
