Regex expression to match a whole word?

Posted 2019-01-12 13:18

Question:

In reference to my earlier question, "Regex expression to match whole word with special characters not working?",

I got an answer that suggested:

@"(?<=^|\s)" + pattern + @"(?=\s|$)"

This works in every case except one: it fails when the pattern has a leading or trailing space.

Assume the string is "Hi this is stackoverflow" and the pattern is "this " (with a trailing space). It then reports no matches. This happens because the pattern itself consumes the space after "this", so the lookahead (?=\s|$) sees "i" instead of whitespace or the end of the string.

How can we handle this? Ideally, it should report one match.
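The failure described above can be reproduced with a short sketch. It assumes the pattern is a literal that is escaped with Regex.Escape before being inserted; the class and method names are illustrative only and do not come from the earlier answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OriginalBoundaryDemo
{
    // Counts matches using the boundary from the earlier answer:
    // start-or-whitespace before the pattern, whitespace-or-end after it.
    public static int CountMatches(string input, string literal)
    {
        string regex = @"(?<=^|\s)" + Regex.Escape(literal) + @"(?=\s|$)";
        return Regex.Matches(input, regex).Count;
    }

    public static void Main()
    {
        string input = "Hi this is stackoverflow";
        Console.WriteLine(CountMatches(input, "this"));  // prints 1
        Console.WriteLine(CountMatches(input, "this ")); // prints 0
    }
}
```

The second call finds nothing because the character after the matched "this " is "i", which satisfies neither \s nor $.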

Answer 1:

Try this:

(?:(?<=^|\s)(?=\S)|(?<=\S|^)(?=\s))this (?:(?<=\S)(?=\s|$)|(?<=\s)(?=\S|$))


This also works for a pattern that starts with whitespace.

Basically, I am defining a custom "word" boundary. Unlike \b, it does not match on a \W=>\w or \w=>\W change; it matches on a \S=>\s or \s=>\S change, or at the start or end of the string.

Here is an example in C#:

using System;
using System.Text.RegularExpressions;

string str = "Hi this is stackoverflow";
// Escape the literal so that special characters and the trailing space match verbatim.
string pattern = Regex.Escape("this ");
MatchCollection result = Regex.Matches(str, @"(?:(?<=^|\s)(?=\S)|(?<=\S|^)(?=\s))" + pattern + @"(?:(?<=\S)(?=\s|$)|(?<=\s)(?=\S|$))", RegexOptions.IgnoreCase);

Console.WriteLine("Number of matches: " + result.Count);
foreach (Match m in result)
{
    // Print the text of the current match.
    Console.WriteLine("Matched: " + m.Value);
}
Console.ReadLine();
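To make the difference from \b concrete, here is a rough sketch comparing the two boundaries on a literal that ends in a non-word character. The input string, the literal "c#", and the class and method names are assumptions chosen for illustration; only the two boundary expressions are taken from the regex above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryComparison
{
    // Leading and trailing boundaries as given in the regex above.
    const string Leading = @"(?:(?<=^|\s)(?=\S)|(?<=\S|^)(?=\s))";
    const string Trailing = @"(?:(?<=\S)(?=\s|$)|(?<=\s)(?=\S|$))";

    // Standard word boundary: needs a \w/\W change on each side.
    public static int CountWithWordBoundary(string input, string literal) =>
        Regex.Matches(input, @"\b" + Regex.Escape(literal) + @"\b").Count;

    // Whitespace boundary: needs a \s/\S change (or string start/end) on each side.
    public static int CountWithWhitespaceBoundary(string input, string literal) =>
        Regex.Matches(input, Leading + Regex.Escape(literal) + Trailing).Count;

    public static void Main()
    {
        string input = "I like c# a lot";
        Console.WriteLine(CountWithWordBoundary(input, "c#"));       // prints 0
        Console.WriteLine(CountWithWhitespaceBoundary(input, "c#")); // prints 1
    }
}
```

With \b, the position between "#" and the following space is not a boundary because both characters are non-word characters, so the match fails. The whitespace boundary only looks at whether each side is whitespace.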

Update:

This "whitespace" boundary can be made more general, so that the same expression is used on each side of the pattern:

(?:(?<=^|\s)(?=\S|$)|(?<=^|\S)(?=\s|$))

In C#:

MatchCollection result = Regex.Matches(str, @"(?:(?<=^|\s)(?=\S|$)|(?<=^|\S)(?=\s|$))" + pattern + @"(?:(?<=^|\s)(?=\S|$)|(?<=^|\S)(?=\s|$))", RegexOptions.IgnoreCase);
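Since the same expression now sits on both sides, it can be wrapped in a small helper. The following is only a sketch: the WhitespaceBoundary class and its Build method are hypothetical names, and it assumes the pattern is always a literal that should be escaped.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceBoundary
{
    // Symmetric boundary from the update: true at the start or end of the
    // input and wherever whitespace and non-whitespace meet.
    const string Boundary = @"(?:(?<=^|\s)(?=\S|$)|(?<=^|\S)(?=\s|$))";

    // Hypothetical helper: wraps an escaped literal in the boundary on both sides.
    public static Regex Build(string literal) =>
        new Regex(Boundary + Regex.Escape(literal) + Boundary, RegexOptions.IgnoreCase);

    public static void Main()
    {
        string input = "Hi this is stackoverflow";
        foreach (string literal in new[] { "this", "this ", " this", "his" })
        {
            // Prints one match for the first three literals and none for "his".
            Console.WriteLine($"'{literal}': {Build(literal).Matches(input).Count}");
        }
    }
}
```

The literal "his" yields no match because the position between "t" and "h" has non-whitespace on both sides, so neither alternative of the boundary holds there.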


Tags: .net regex match