Why do JavaScript sub-matches stop working when the g modifier is set?
var text = 'test test test test';
var result = text.match(/t(e)(s)t/);
// Result: ["test", "e", "s"]
The above works fine: result[1] is "e" and result[2] is "s".
var result = text.match(/t(e)(s)t/g);
// Result: ["test", "test", "test", "test"]
The above ignores my capturing groups. Is the following the only valid solution?
var result = text.match(/test/g);
for (var i in result) {
console.log(result[i].match(/t(e)(s)t/));
}
/* Result:
["test", "e", "s"]
["test", "e", "s"]
["test", "e", "s"]
["test", "e", "s"]
*/
Using String's match() function won't return captured groups if the global modifier is set, as you found out.
In this case, you would want to use a RegExp object and call its exec() function. String's match() is almost identical to RegExp's exec() function, except in cases like these. If the global modifier is set, the normal match() function won't return captured groups, while RegExp's exec() function will. (Noted here, among other places.)
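To make the difference concrete, here is a minimal sketch that runs the same pattern three ways. The variable names (first, all, viaExec) are only illustrative, and slice() is used to copy each result into a plain array for printing.

```javascript
var text = 'test test test test';

// Without g: match() returns the first match plus its capture groups.
var first = text.match(/t(e)(s)t/);

// With g: match() returns only the full matches and drops the groups.
var all = text.match(/t(e)(s)t/g);

// exec() on a g-flagged pattern still returns the groups, one match per call.
var pattern = /t(e)(s)t/g;
var viaExec = pattern.exec(text);

console.log(first.slice());   // ["test", "e", "s"]
console.log(all);             // ["test", "test", "test", "test"]
console.log(viaExec.slice()); // ["test", "e", "s"]
```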
Another catch to remember is that exec() doesn't return the matches in one big array: it returns one match per call until it runs out, at which point it returns null.
So, for example, you could do something like this:
var pattern = /t(e)(s)t/g; // Alternatively, "new RegExp('t(e)(s)t', 'g');"
var match;
while (match = pattern.exec(text)) {
// Do something with the match (["test", "e", "s"]) here...
}
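If you do want everything in one array, the loop above can be wrapped in a small helper. This is only a sketch: collectMatches is a hypothetical name, not a built-in, and it assumes the pattern carries the g flag (without it, exec() would keep returning the first match and the loop would never end, so the helper throws instead).

```javascript
// Hypothetical helper: gathers every match of a g-flagged pattern,
// keeping the capture groups that match() with g would drop.
function collectMatches(pattern, text) {
  if (!pattern.global) {
    throw new TypeError('collectMatches expects a pattern with the g flag');
  }
  var results = [];
  var match;
  pattern.lastIndex = 0; // start from the beginning regardless of earlier use
  while ((match = pattern.exec(text)) !== null) {
    // Guard against zero-length matches, which would otherwise loop forever.
    if (match[0] === '') {
      pattern.lastIndex++;
    }
    results.push(match.slice()); // plain array: [fullMatch, group1, group2, ...]
  }
  return results;
}

var groups = collectMatches(/t(e)(s)t/g, 'test test test test');
console.log(groups.length); // 4
console.log(groups[0]);     // ["test", "e", "s"]
```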
Another thing to note is that RegExp.prototype.exec() and RegExp.prototype.test() execute the regular expression on the provided string and return the first result. When the pattern has the g flag, each subsequent call steps through the result set, updating the pattern's lastIndex property to the position in the string just after the latest match.
Here's an example:
// Remember there are 4 matches in the example text, and pattern.lastIndex starts at 0.
pattern.test(text); // pattern.lastIndex = 4
pattern.test(text); // pattern.lastIndex = 9
pattern.exec(text); // pattern.lastIndex = 14
pattern.exec(text); // pattern.lastIndex = 19
// If we call pattern.exec(text) again, it returns null and resets pattern.lastIndex to 0.
var match;
while (match = pattern.exec(text)) {
    // Never runs, because we already traversed the string.
    console.log(match);
}
pattern.test(text); // pattern.lastIndex = 4
pattern.test(text); // pattern.lastIndex = 9
// However, we can reset lastIndex, which lets us traverse the string again
// from the start or from any specific position in the string.
pattern.lastIndex = 0;
while (match = pattern.exec(text)) {
    // Outputs all matches.
    console.log(match);
}
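The same lastIndex bookkeeping explains a common surprise when a g-flagged pattern is reused for simple yes/no checks. The sketch below uses an illustrative one-word input (single) and shows test() alternating between true and false on the same string until lastIndex is reset by hand.

```javascript
var pattern = /t(e)(s)t/g;
var single = 'test';

var firstCall = pattern.test(single);  // match found at index 0, lastIndex becomes 4
var secondCall = pattern.test(single); // search starts at index 4, so nothing is found
var indexAfterMiss = pattern.lastIndex; // a failed search resets lastIndex to 0

// Resetting lastIndex before each check makes the checks independent.
pattern.lastIndex = 0;
var afterReset = pattern.test(single);

console.log(firstCall, secondCall, indexAfterMiss, afterReset); // true false 0 true
```

If the pattern is only ever used for yes/no checks, leaving off the g flag avoids the issue, since lastIndex is then ignored.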
You can find information on how to use RegExp objects on MDN (specifically, here's the documentation for the exec() function).