C# regex for negated character class unless chars

Posted 2019-08-16 04:47

Question:

I need to match the characters between the innermost set of parentheses in a string, while allowing empty parentheses such as '()'. As best I can tell, some kind of negative lookahead is needed here (and this is completely different from the question it was marked as a duplicate of).

An initial version, which does not handle '()' properly, is:

var re = new Regex(@"\(([^()]+)\)");

Some test examples:

x (a) y          -> a
x (a b) y        -> a b
x (a b c) y      -> a b c
x (a b() c) y    -> a b() c
x (a() b() c) y  -> a() b() c
x (a b() c()) y  -> a b() c()
x (a b(a) c) y   -> a
x (a (b() c)) y  -> b() c
x () y           -> empty

And a C# test method (adapt for your assertion library):

var re = new Regex(@"\(([^()]+)\)");

// Flat list of pairs: input string, then the expected capture.
string[] tests = {
    "x (a) y", "a",
    "x (a b) y", "a b",
    "x (a b c) y", "a b c",
    "x (a b() c) y", "a b() c",
    "x (a() b() c) y", "a() b() c",
    "x (a b() c()) y", "a b() c()",
    "x (a b(a) c) y", "a",
    "x (a (b() c)) y", "b() c",
    "x () y", ""
};

// Step by two so that tests[i] is the input and tests[i + 1] the expected value.
for (int i = 0; i < tests.Length; i += 2)
{
    var match = re.Match(tests[i]);
    var result = match.Groups[1].Value;
    Assert.That(result, Is.EqualTo(tests[i + 1]));
}

Answer 1:

You could use something like this (no lookarounds needed):

\(((?:[^()]|\(\))+)\)

The adjustments made to your regex:

Wrapped [^()] in a non-capturing group together with the alternative \(\), so that each repetition matches either a single character other than ( and ) or an empty pair of parentheses ().
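
To see how this pattern behaves on a few of the inputs from the question, here is a minimal sketch. The class name InnermostParens and the helper Extract are hypothetical names introduced only for this illustration; the pattern itself is the one given above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnermostParens
{
    // Pattern from the answer: a "(", then one or more repetitions of either
    // a single non-parenthesis character or a literal "()", then a ")".
    private static readonly Regex Re = new Regex(@"\(((?:[^()]|\(\))+)\)");

    // Hypothetical helper: returns capture group 1, or "" when nothing matches.
    public static string Extract(string input)
    {
        Match match = Re.Match(input);
        return match.Success ? match.Groups[1].Value : "";
    }

    public static void Main()
    {
        string[] inputs =
        {
            "x (a b() c) y",
            "x (a b(a) c) y",
            "x (a (b() c)) y",
            "x () y"
        };

        foreach (string input in inputs)
        {
            Console.WriteLine($"{input} -> '{Extract(input)}'");
        }
        // Prints:
        // x (a b() c) y -> 'a b() c'
        // x (a b(a) c) y -> 'a'
        // x (a (b() c)) y -> 'b() c'
        // x () y -> ''
    }
}
```

One detail worth knowing: because of the + quantifier, the pattern finds no match at all in "x () y". The empty result comes from the failed match (Groups[1].Value is an empty string in that case), which still satisfies the test in the question. If you need "()" to count as a successful match with an empty capture, changing + to * should do it.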

Alternatively, if you don't need the capturing group, you can use lookarounds so that the full match is exactly the text you want:

(?<=\()(?:[^()]|\(\))+(?=\))

That way, you can access your matches directly using match.Value instead of match.Groups[1].Value.
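
As a rough illustration of the lookaround variant, the sketch below reads match.Value directly. The class name InnermostParensLookaround and the helper Extract are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnermostParensLookaround
{
    // The lookbehind (?<=\() requires a "(" just before the match and the
    // lookahead (?=\)) requires a ")" just after it; neither is consumed,
    // so the parentheses are not part of match.Value.
    private static readonly Regex Re = new Regex(@"(?<=\()(?:[^()]|\(\))+(?=\))");

    // Hypothetical helper: returns the full match, or "" when nothing matches
    // (Match.Value is an empty string for a failed match).
    public static string Extract(string input)
    {
        return Re.Match(input).Value;
    }

    public static void Main()
    {
        Console.WriteLine(Extract("x (a b() c) y"));    // a b() c
        Console.WriteLine(Extract("x (a b(a) c) y"));   // a
        Console.WriteLine(Extract("x (a (b() c)) y"));  // b() c
    }
}
```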

Please let me know if anything is unclear.