Here's a short piece of code:
var utility = {
    escapeQuotes: function(string) {
        return string.replace(new RegExp('"', 'g'),'\\"');
    },
    unescapeQuotes: function(string) {
        return string.replace(new RegExp('\\"', 'g'),'"');
    }
};
var a = 'hi "';
var b = utility.escapeQuotes(a);
var c = utility.unescapeQuotes(b);
console.log(b + ' | ' + c);
I would expect this code to work, but the result I get is:
hi \" | hi \"
If I change the first argument of the RegExp constructor in the unescapeQuotes method to use four backslashes, everything works as it should:
string.replace(new RegExp('\\\\"', 'g'),'"');
The result:
hi \" | hi "
Why are four backslashes needed as the first parameter of the new RegExp constructor? Why doesn't it work with only 2 of them?
The problem is that you're using the RegExp constructor, which accepts a string, rather than using a regular expression literal. So in this line in your unescape:
return string.replace(new RegExp('\\"', 'g'),'"');
...the \\ is interpreted by the JavaScript parser as part of handling the string literal, resulting in a single backslash being handed to the regular expression parser. So the expression the regular expression parser sees is \". The backslash is an escape character in regex, too, but \" doesn't mean anything special and just ends up being ". To have an actual backslash in a regex, you have to have two of them; to do that in a string literal, you have to have four (so they survive both layers of interpretation).
Unless you have a very good reason to use the RegExp constructor (e.g., you have to use some varying input), always use the literal form. It's a lot less confusing.
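As a rough sketch of both points: the first part uses the standard `source` property to show what the regular expression parser actually receives from each string, and the second part rewrites the question's `utility` object with regex literals. The names `twoBackslashes` and `fourBackslashes` are made up here for illustration.

```javascript
// Layer 1 is the string literal: '\\"' holds just two characters, \ and ".
var twoBackslashes = new RegExp('\\"', 'g');
var fourBackslashes = new RegExp('\\\\"', 'g');

console.log('\\"'.length);            // 2
console.log(twoBackslashes.source);   // \"   (an escaped quote, matches only ")
console.log(fourBackslashes.source);  // \\"  (a literal backslash followed by ")

// Literal form: only the regex layer of escaping applies to the pattern.
var utility = {
    escapeQuotes: function (string) {
        // The replacement is still a string literal, so '\\"' yields \"
        return string.replace(/"/g, '\\"');
    },
    unescapeQuotes: function (string) {
        // \\ in a regex literal matches one actual backslash
        return string.replace(/\\"/g, '"');
    }
};

var a = 'hi "';
var b = utility.escapeQuotes(a);
var c = utility.unescapeQuotes(b);
console.log(b + ' | ' + c);           // hi \" | hi "
```

With the literal form there is only one layer of escaping to reason about in the pattern, which is why two backslashes are enough there.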