Embed comments within JavaScript regex like in Perl

Posted 2019-04-08 08:58

Question:

Is there any way to embed a comment in a JavaScript regex, like you can do in Perl? I'm guessing there is not, but my searching didn't find anything stating you can or can't.

Answer 1:

You can't embed a comment in a regex literal.

You can, however, put comments in a string concatenation that you pass to the RegExp constructor:

var r = new RegExp(
    "\\b"   + // word boundary
    "A="    + // A=
    "(\\d+)"+ // what is captured : some digits
    "\\b"     // word boundary again
, 'i');       // case insensitive

But a regex literal is so much more convenient (notice how I had to escape the \) that I'd rather separate the regex from the comments: just put the comments before your regex, not inside it.
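
To make the escaping point concrete, here is a small sketch comparing the forms. The variable names are only illustrative, and the String.raw variant is an addition that was not part of the original answer; it is a standard built-in that keeps backslashes as typed.

```javascript
// Comment placed before the literal instead of inside it:
// word boundary, "A=", one or more captured digits, word boundary.
const fromLiteral = /\bA=(\d+)\b/i;

// Same pattern built from an ordinary string: every backslash is doubled.
const fromString = new RegExp("\\bA=(\\d+)\\b", "i");

// String.raw keeps backslashes as typed, so no doubling is needed.
const fromRaw = new RegExp(String.raw`\bA=(\d+)\b`, "i");

console.log(fromLiteral.source === fromString.source); // true
console.log(fromLiteral.source === fromRaw.source);    // true
console.log("x a=42 y".match(fromLiteral)[1]);         // 42
```

All three produce the same pattern; only the amount of escaping in the source code differs.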

EDIT 2018: This question and answer are very old. ECMAScript now offers new ways to handle this, most notably template strings.

For example, I now use this simple utility in Node:

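// Tag function: builds a RegExp from a commented, multi-line template.
// Assumes the template contains no ${...} interpolations.
module.exports = function(tmpl){
    const [, source, flags] = tmpl.raw.toString()
        .replace(/\s*(\/\/.*)?$\s*/gm, "") // remove comments and spaces at both ends of lines
        .match(/^\/?(.*?)(?:\/(\w+))?$/); // extract source and flags
    return new RegExp(source, flags);
};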

which, once imported as rex, lets me do things like this:

const regex = rex`
    ^         // start of string
    [a-z]+    // some letters
    bla(\d+)
    $         // end
    /ig`;

console.log(regex); // /^[a-z]+bla(\d+)$/gi
console.log("Totobla58".match(regex)); // [ 'Totobla58' ]
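
To see what each of the two regexes in the utility contributes, the sketch below splits the same logic into steps and exposes the intermediate values. The name rexSteps and the returned object shape are hypothetical, chosen only for inspection; the two regexes are the ones from the utility above.

```javascript
// Hypothetical step-by-step variant of the tag, for inspection only.
function rexSteps(tmpl) {
    // Raw text of the template, with backslashes left untouched.
    const raw = tmpl.raw.toString();
    // Step 1: strip trailing "// ..." comments and whitespace around line ends.
    const stripped = raw.replace(/\s*(\/\/.*)?$\s*/gm, "");
    // Step 2: split "source/flags" (the leading "/" and the flags are optional).
    const [, source, flags] = stripped.match(/^\/?(.*?)(?:\/(\w+))?$/);
    return { stripped, source, flags, regex: new RegExp(source, flags) };
}

const steps = rexSteps`
    ^         // start of string
    [a-z]+    // some letters
    bla(\d+)
    $         // end
    /ig`;

console.log(steps.stripped);    // ^[a-z]+bla(\d+)$/ig
console.log(steps.source);      // ^[a-z]+bla(\d+)$
console.log(steps.flags);       // ig
console.log(steps.regex.flags); // gi
```

The flags are written as ig in the template, but a RegExp always reports them in its own canonical order, which is why the last line prints gi.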


Answer 2:

Now, with template literals (the backtick-delimited strings), you can do inline comments with a little finagling. Note that the example below makes some assumptions about what won't appear in the pattern itself, especially regarding whitespace and the # character. But I think you can often make intentional assumptions like that, if you write the process() function carefully. If not, there are probably creative ways to define the little "mini-language extension" to regexes in such a way as to make it work.

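function process(strings) {
  // Capture whatever precedes each "# comment", trimming surrounding whitespace.
  var regex = new RegExp("\\s*([^#]*?)\\s*#.*$", "mg");
  var output = "";
  var result;
  // strings.raw[0] is the raw template text, so backslashes such as \d survive.
  while ((result = regex.exec(strings.raw[0])) !== null) {
    output += result[1];
  }
  return output;
}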
var a = new RegExp(process `
    ^f    # matches the first letter f
    .*   # matches stuff in the middle
    h    # matches the letter 'h'
`);
console.log(a);              // /^f.*h/
console.log(a.test("fish")); // true
console.log(a.test("frog")); // false
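
As one possible take on the "mini-language extension" idea, here is a sketch of a tag that also lets a pattern contain a literal # or space by escaping them as \# and "\ ". The name stripVerbose and these escaping rules are assumptions made for illustration, not something from the answer above, and the sketch does not handle an escaped backslash directly followed by # or a space.

```javascript
// Hypothetical variant of process(): removes "# comment" tails and all
// unescaped whitespace, but keeps "\#" and "\ " as literal characters.
function stripVerbose(strings) {
  return strings.raw[0]
    .split("\n")
    .map((line) =>
      line
        // Drop the first unescaped "#" and everything after it.
        .replace(/(^|[^\\])#.*$/, "$1")
        // Turn "\#" and "\ " into "#" and " "; delete all other whitespace.
        .replace(/\\([# ])|\s+/g, (match, ch) => ch || "")
    )
    .join("");
}

const verbose = new RegExp(stripVerbose`
    ^f      # starts with f
    .*      # anything in the middle
    \#\ h   # a literal hash, a space, then h
    $
`);

console.log(verbose.source);        // ^f.*# h$
console.log(verbose.test("fi# h")); // true
console.log(verbose.test("fish"));  // false
```

Lines without a comment (like the final $ here) are handled the same way as commented ones, since each line is processed on its own.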

Also, to the OP, just because I feel a need to say this: this is neato, but if your resulting code turns out just as verbose as the string concatenation, or if it takes you 6 hours to figure out the right regexes and you are the only one on your team who will bother to use it, maybe there are better uses of your time...

I hope you know that I am only this blunt with you because I value our friendship.