I am studying for the Java OCP and at the moment I am stuck on understanding the "Capturing groups" section. The description is far too abstract. Could you please (if you have time) give me some real examples using capturing groups?
Is anybody able to provide me with a concrete example of the following statement?
Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d" "o" and "g". The portion of the input string that matches the capturing group will be saved in memory for later recall via backreferences (as discussed below in the section, Backreferences).
I am pretty sure I'll get it as soon as I see a concrete example.
Thanks in advance.
Among other things, regex lets you obtain the portions of the input that were matched by various parts of the regular expression. Sometimes you need the entire match, but often you need only a part of it. For example, take a regular expression that matches "Page X of Y" strings, with one \d+ for each of the two numbers. If you pass it a string such as "Page 14 of 203", you will match the entire string. Now let's say that you want only 14 and 203. No problem: the regex library lets you enclose the two \d+ in parentheses, and then retrieve only the "14" and "203" strings from the match.
That expression creates two capturing groups. The Matcher object obtained by matching the pattern lets you retrieve the content of these groups individually: group 1 gives 14 and group 2 gives 203. Demo on ideone.
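The original code for this answer did not survive, so here is a minimal sketch of what it describes. The exact pattern `Page (\d+) of (\d+)`, the class name `PageGroups` and the helper `pageNumbers` are assumptions made for illustration, not the answerer's own code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PageGroups {
    // Assumed pattern: each \d+ is wrapped in parentheses, giving groups 1 and 2.
    private static final Pattern PAGE = Pattern.compile("Page (\\d+) of (\\d+)");

    // Hypothetical helper: returns {X, Y} for "Page X of Y", or null if the input does not match.
    static String[] pageNumbers(String input) {
        Matcher m = PAGE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        // group(0) would be the whole match; groups 1 and 2 are the two numbers.
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = pageNumbers("Page 14 of 203");
        System.out.println(parts[0]); // 14
        System.out.println(parts[1]); // 203
    }
}
```

Groups are numbered by the position of their opening parenthesis, counting from 1; `group(0)` always refers to the entire match.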
It's for when you want to keep track of parts of the match. For example, if you have the regex /^(http|ftp).*/ and you get a match, you can query the match for the group and tell whether it was http or ftp.
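In Java this could look roughly like the sketch below. The surrounding slashes are delimiters from other regex flavours and are dropped in a Java string; the class name `SchemeGroup`, the helper `scheme` and the sample URLs are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SchemeGroup {
    // Group 1 captures whichever alternative matched at the start of the input.
    private static final Pattern URL_START = Pattern.compile("^(http|ftp).*");

    // Hypothetical helper: returns "http" or "ftp", or null if the input starts with neither.
    static String scheme(String url) {
        Matcher m = URL_START.matcher(url);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(scheme("ftp://example.org/file.txt")); // ftp
        System.out.println(scheme("http://example.org/"));        // http
        System.out.println(scheme("mailto:someone@example.org")); // null
    }
}
```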
Here you see some code examples you can easily understand.
Basically, whatever you have within () is remembered after the match, and you can retrieve the string that matched that group. Remember that if you do a second match, these values are replaced by those of the second match, so if you need them, save them in variables of your own immediately after the match.
Capturing groups allow you to query the Matcher to find out which part of the string matched a particular part of the regular expression. For example, with one group each for the year, month and day of a date, the variables year, month and day contain the values of groups 1, 2 and 3, respectively.
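The example this answer referred to is missing, so the following is only a sketch of the idea. It assumes a date written as yyyy-MM-dd; that format, the class name `DateGroups` and the helper `parse` are assumptions, not taken from the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateGroups {
    // Assumed format: four digits, two digits, two digits, separated by hyphens.
    private static final Pattern DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Hypothetical helper: returns {year, month, day}, or null if the input does not match.
    static String[] parse(String date) {
        Matcher m = DATE.matcher(date);
        if (!m.matches()) {
            return null;
        }
        // Copy the groups into our own variables right away; a later match
        // on the same Matcher would replace what group(n) returns.
        String year = m.group(1);
        String month = m.group(2);
        String day = m.group(3);
        return new String[] { year, month, day };
    }

    public static void main(String[] args) {
        String[] parts = parse("2013-06-25");
        System.out.println(parts[0] + " / " + parts[1] + " / " + parts[2]); // 2013 / 06 / 25
    }
}
```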
For example, take a regex such as cat (dog )?bus. This will match both of the strings "cat dog bus" and "cat bus". That's because the entire "dog " part is optional because of the ?. If you did not wrap it in parentheses, then only the last space would be optional.
In the same way, a quantified group such as (had)+ will match one or more repetitions of the entire had string, rather than repeating only its last character.
You can also use alternation and back references with capture groups (something you haven't quite gotten to yet).
In a regex such as (dog|cat) is a \1, the \1 is a reference to whatever was captured in the first capture group. This will match "dog is a dog" and "cat is a cat", but not "dog is a cat" or vice versa.
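The three behaviours described in this answer (an optional group, a repeated group, and a backreference) can be checked with a short sketch like the one below. The patterns are reconstructed from the prose, since the original snippets are missing; the `(had)+` pattern in particular, as well as the class and method names, are assumptions.

```java
import java.util.regex.Pattern;

class GroupQuantifiers {
    // The ? applies to the whole group "dog ", so the word and its space are optional together.
    static boolean optionalGroup(String s) {
        return Pattern.matches("cat (dog )?bus", s);
    }

    // Without parentheses the ? applies only to the space before "bus".
    static boolean optionalSpaceOnly(String s) {
        return Pattern.matches("cat dog ?bus", s);
    }

    // The + applies to the whole group, so "had" must appear one or more times.
    static boolean repeatedGroup(String s) {
        return Pattern.matches("(had)+", s);
    }

    // \1 must repeat exactly the text that group 1 captured.
    static boolean backreference(String s) {
        return Pattern.matches("(dog|cat) is a \\1", s);
    }

    public static void main(String[] args) {
        System.out.println(optionalGroup("cat bus"));         // true
        System.out.println(optionalSpaceOnly("cat bus"));     // false
        System.out.println(repeatedGroup("hadhadhad"));       // true
        System.out.println(backreference("dog is a dog"));    // true
        System.out.println(backreference("dog is a cat"));    // false
    }
}
```

`Pattern.matches` requires the whole input to match, which keeps the comparison between the grouped and ungrouped patterns unambiguous.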