I've spent some time on this but still have no solution. I need a regular expression that can match a word with symbols in it (like c++) in a string.
I've used /\bword\b/ for "usual" words, and it works OK. But as soon as I try /\bC\+\+\b/ it just does not work. It somehow goes wrong with the plus signs in it.
I need a regex to detect whether the input string contains the word c++. Input like:
"c++ developer"
"using c++ language"
etc.
P.S. I'm using C# and the .NET Regex.Match function.
Thanks for the help!

As the others said, your problem isn't the + sign, which you've escaped correctly, but the \b, a zero-width assertion that matches a word boundary, i.e. the position between a word character (\w) and a non-word character (\W).
There is also another mistake in your regex: you are trying to match an uppercase C against a lowercase c++. To fix this, change your regex to /\bc\+\+/ or use the i modifier to match case-insensitively: /\bc\+\+/i
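
Since the question uses .NET, where patterns are plain strings without /.../ delimiters, here is a minimal sketch of the same idea in C#. It assumes RegexOptions.IgnoreCase as the stand-in for the i modifier; the helper name ContainsCpp is hypothetical, and the top-level statements need C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// No trailing \b here: the position after the last '+' is usually not a word boundary.
var pattern = @"\bc\+\+";

// Hypothetical helper; RegexOptions.IgnoreCase plays the role of the /i modifier.
bool ContainsCpp(string input) => Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);

Console.WriteLine(ContainsCpp("c++ developer"));      // True
Console.WriteLine(ContainsCpp("using C++ language")); // True
Console.WriteLine(ContainsCpp("abc++"));              // False: no boundary between 'b' and 'c'
```

The leading \b still matters: it keeps the pattern from matching the tail of a longer word such as abc++.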

The problem isn't with the plus character, which you've escaped correctly, but with the \b sequence. It indicates a word boundary, which is a point between a word character (a letter, digit or underscore) and something else. Plus isn't a word character, so for the trailing \b to match, there would need to be a word character directly after the last plus sign.
\bC\+\+\b matches "Test C++Test" but not "Test C++ Test", for example. Try something like \bC\+\+\s if you expect whitespace after the last plus sign.

+ is a special character, so you need to escape it. Note that we can't use \b because + is not a word character.

The plus sign has a special meaning, so you will have to escape it with a backslash (\). The same rule applies to these characters: \, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space.
UPDATE: the problem was with the \b sequence.

If you want to match a c++ between non-word chars (chars other than letters, digits and underscores), you may use \bc\+\+\B. Here, \b is a word boundary and \B matches all positions that are not word boundary positions. C# syntax:
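
The following is a minimal sketch, not the answer's original snippet, that contrasts the pattern from the question with the \b...\B variant. The helper name Has and the sample strings are assumptions made for illustration; top-level statements require C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: case-insensitive "does the pattern match anywhere in the input".
bool Has(string input, string pattern) =>
    Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);

var trailingBoundary = @"\bc\+\+\b";    // pattern from the question
var trailingNonBoundary = @"\bc\+\+\B"; // \B after the last '+'

// '+' followed by a space: two non-word chars, so the trailing \b cannot match.
Console.WriteLine(Has("c++ developer", trailingBoundary));    // False
// '+' followed by 'T': a word boundary exists, so \b matches.
Console.WriteLine(Has("Test C++Test", trailingBoundary));     // True

Console.WriteLine(Has("c++ developer", trailingNonBoundary)); // True
// End of string after a non-word char is also a \B position.
Console.WriteLine(Has("using c++", trailingNonBoundary));     // True
Console.WriteLine(Has("Test C++Test", trailingNonBoundary));  // False
```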
You must remember that \b / \B are context dependent: \b matches between the start/end of the string and an adjoining word char, or between a word char and a non-word char, while \B matches between the start/end of the string and an adjoining non-word char, or between two word chars or two non-word chars.
If you build the pattern dynamically, it is hard to rely on the \b word boundary pattern. Use the (?<!\w) and (?!\w) lookarounds instead; they will always match a word that is not immediately preceded/followed by a word char.
If the word boundaries you want to match are whitespace boundaries (i.e. the match is expected only between whitespaces), use
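
As a minimal sketch of the lookaround approach for dynamically built patterns: the helper names WholeWord and WhitespaceDelimited are hypothetical, and the whitespace-boundary variant assumes the (?<!\S) and (?!\S) lookarounds, which the text above does not spell out. Regex.Escape takes care of the plus signs and any other metacharacters in the keyword. Top-level statements require C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: the keyword must not be adjacent to a word char on either side.
Regex WholeWord(string keyword) =>
    new Regex(@"(?<!\w)" + Regex.Escape(keyword) + @"(?!\w)", RegexOptions.IgnoreCase);

// Assumed whitespace-boundary variant: the keyword must be surrounded by
// whitespace or by the start/end of the string.
Regex WhitespaceDelimited(string keyword) =>
    new Regex(@"(?<!\S)" + Regex.Escape(keyword) + @"(?!\S)", RegexOptions.IgnoreCase);

Console.WriteLine(WholeWord("c++").IsMatch("using c++ language"));      // True
Console.WriteLine(WholeWord("c++").IsMatch("(C++)"));                   // True
Console.WriteLine(WholeWord("c++").IsMatch("abc++"));                   // False: 'b' precedes 'c'
Console.WriteLine(WhitespaceDelimited("c++").IsMatch("c++ developer")); // True
Console.WriteLine(WhitespaceDelimited("c++").IsMatch("(C++)"));         // False: '(' is not whitespace
```

Unlike \b and \B, these lookarounds behave the same whether the keyword starts or ends with a word character or with a symbol, which is why they suit keywords that are only known at run time.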