The following regular expression matches "Saturday" or "Sunday": (?:(Sat)ur|(Sun))day
But for "Saturday" backreference 1 is filled while backreference 2 is empty, and for "Sunday" it is the other way around.
PHP (PCRE) provides a nice operator "?|" that circumvents this problem. The previous regex would become (?|(Sat)ur|(Sun))day, so there will not be empty backreferences.
Is there an equivalent in C#, or some workaround?
.NET doesn't support the branch-reset operator, but it does support named groups, and it lets you reuse group names without restriction (something no other flavor does, AFAIK). So you could use this:
(?:(?<abbr>Sat)ur|(?<abbr>Sun))day
...and the abbreviated name will be stored in Match.Groups["abbr"].
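A minimal sketch of the named-group approach in C#, assuming only the standard System.Text.RegularExpressions API. The class name NamedGroupDemo and the helper Abbreviate are hypothetical names chosen for illustration; they are not from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

class NamedGroupDemo
{
    // Both alternatives capture into the same named group "abbr",
    // which .NET allows.
    static readonly Regex Pattern =
        new Regex(@"(?:(?<abbr>Sat)ur|(?<abbr>Sun))day");

    // Hypothetical helper: returns the captured abbreviation,
    // or null when the input does not match at all.
    public static string Abbreviate(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Groups["abbr"].Value : null;
    }

    static void Main()
    {
        Console.WriteLine(Abbreviate("Saturday")); // prints "Sat"
        Console.WriteLine(Abbreviate("Sunday"));   // prints "Sun"
    }
}
```

Whichever alternative matched, the caller reads a single group and never has to check which branch was taken.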
It should be possible to concatenate backreference 1 and backreference 2.
Since one of the two is always empty, and concatenating a string with an empty string leaves it unchanged, the result is always the captured abbreviation.
With your regex (?:(Sat)ur|(Sun))day and the replacement $1$2, you get Sat for Saturday and Sun for Sunday.
regex: (?:(Sat)ur|(Sun))day
input    | backref1 ($1) | backref2 ($2) | concat ($1$2)
---------|---------------|---------------|------------------
Saturday | 'Sat'         | ''            | 'Sat' + '' = Sat
Sunday   | ''            | 'Sun'         | '' + 'Sun' = Sun
Instead of reading either backreference 1 or backreference 2, read both and concatenate the results.
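The concatenation workaround could look roughly like this in C#. This is only a sketch: the class name ConcatDemo and the helper Abbreviate are hypothetical, and it relies on .NET returning an empty string as the Value of a group that did not participate in the match.

```csharp
using System;
using System.Text.RegularExpressions;

class ConcatDemo
{
    // The original pattern with two numbered groups.
    static readonly Regex Pattern = new Regex(@"(?:(Sat)ur|(Sun))day");

    // Hypothetical helper: joins both groups; the one that did not
    // participate contributes an empty string.
    public static string Abbreviate(string input)
    {
        Match m = Pattern.Match(input);
        return m.Groups[1].Value + m.Groups[2].Value;
    }

    static void Main()
    {
        Console.WriteLine(Abbreviate("Saturday")); // prints "Sat"
        Console.WriteLine(Abbreviate("Sunday"));   // prints "Sun"

        // The same idea expressed as a replacement string.
        Console.WriteLine(Pattern.Replace("Saturday", "$1$2")); // prints "Sat"
    }
}
```

Note that this helper returns an empty string when the input does not match, so a caller that needs to tell "no match" apart would check Match.Success first.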
In flavors that support it (such as PCRE and Perl 5.10+, but not .NET), you can use the branch-reset operator:
(?|foo(bar)|still(life)|(like)so)
That sets only group 1, no matter which branch matches.