Regex:
\b< low="" number="" low="">\b
Example string:
<b22>Aquí se muestran algunos síntomas < low="" number="" low=""> tienen el siguiente aspecto.</b22>
I'm not sure why the word boundary between síntomas and < is not being found. The same problem exists on the other side, between > and tienen.
Suggestions on how I might more properly match this boundary?
When I give it the following input, the Regex matches as expected:
Aquí se muestran algunos síntomas< low="" number="" low="">tienen el siguiente aspecto.
Removing the edge conditions (the \b on either side, as in \bPHRASE\b) is not an option, because the pattern must not match parts of words.
Update
This did the trick (thanks to Igor, Mosty, DK and NickC):
new Regex(String.Format(@"(?<=[\s\.\?\!]){0}(?=[\s\.\?\!])", innerStringToMatch));
I needed to widen my boundary matching to [\s\.\?\!] and make these edge matches a positive lookbehind and a positive lookahead.
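For anyone who wants to try the lookaround approach outside .NET, here is a minimal sketch in Python, whose `re` module supports the same lookbehind and lookahead syntax. The helper name `build_pattern` and the two test strings are only illustrative, and the call to `re.escape` is an assumption that the inner phrase should be matched literally; the C# line above does not escape it.

```python
import re

def build_pattern(inner):
    # Hypothetical helper: wraps a literal phrase in the two lookarounds
    # from the update. re.escape is an addition made for this sketch.
    return re.compile(r"(?<=[\s.?!])" + re.escape(inner) + r"(?=[\s.?!])")

phrase = '< low="" number="" low="">'
spaced = 'Aquí se muestran algunos síntomas < low="" number="" low=""> tienen el siguiente aspecto.'
unspaced = 'Aquí se muestran algunos síntomas< low="" number="" low="">tienen el siguiente aspecto.'

pattern = build_pattern(phrase)
found_spaced = pattern.search(spaced)      # whitespace on both sides of the phrase
found_unspaced = pattern.search(unspaced)  # letters directly on both sides
```

With these inputs the spaced string should match and the unspaced one should not. One caveat: both lookarounds require an actual neighbouring character, so a phrase sitting at the very start or end of the input would not match with this pattern.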
\b is a zero-length match that can occur between two characters in the string, where one is a word character and the other is not. A word character is defined as [A-Za-z0-9_] (*). In your example string, < is preceded by a space and > is followed by a space. None of these is a word character, so there is no boundary on either side and \b doesn't match.
You can try the following regex instead ((?: ) is a non-capturing group):
(*) Actually, this is not correct for all regex engines. To be precise, \b matches between \w and \W, where \w matches any word character. As Tim Pietzcker pointed out in a comment on this answer, the meaning of "word character" differs between implementations, but I don't know of any where \w matches < or >.
I think you're trying to do the following: