I'm having issues using word boundaries (\b) in my regular expression. I'm using R, but the issue also occurs when I try http://regexr.com. The pattern I'm using is \bs\.l\.\b and, while I expected lines 1 and 3 below to match this pattern, only line 2 matches:
aaa s.l. bbb
aaa s.l.bbb
aaa s.l., bbb
See http://regexr.com/3f154 as well.
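A . is not a word character, so there is no word boundary between the final . and the space or comma that follows it: \b only matches where a word character sits on one side and a non-word character (or the start or end of the string) sits on the other. The word boundaries match at the positions marked with | below:
|aaa| |s|.|l|. |bbb|
|aaa| |s|.|l|.|bbb|
|aaa| |s|.|l|., |bbb|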
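The boundary behaviour described above can be checked directly in R. This is a minimal sketch, assuming the three test lines from the question are stored in a vector; the variable names x, after_dot and before_s are only illustrative.

```r
# The three test lines from the question.
x <- c("aaa s.l. bbb", "aaa s.l.bbb", "aaa s.l., bbb")

# Is there a word boundary right after the final "l."?
# \b needs a word character on exactly one side of the position,
# so only the line where "l." is followed by "b" qualifies.
after_dot <- grepl("l\\.\\b", x, perl = TRUE)
print(after_dot)

# Is there a word boundary right before the "s"?
# Yes in every line: a space precedes "s", and "s" is a word character.
before_s <- grepl("\\bs\\.", x, perl = TRUE)
print(before_s)
```

Only the second element of after_dot is TRUE, which is why the original pattern matches line 2 alone.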
Now, you want to match s.l. that is preceded by a word boundary and not followed by a word character. You need to replace the trailing \b with a (?!\w) negative lookahead:
\bs\.l\.(?!\w)
See the regex demo.
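To compare the two patterns side by side, here is a small sketch run against the three lines from the question. It assumes the PCRE engine (perl = TRUE), which is needed for the lookahead; the names original, fixed, orig_hits and fixed_hits are illustrative only.

```r
# The three test lines from the question.
x <- c("aaa s.l. bbb", "aaa s.l.bbb", "aaa s.l., bbb")

original <- "\\bs\\.l\\.\\b"      # trailing \b requires a word char after the dot
fixed    <- "\\bs\\.l\\.(?!\\w)"  # lookahead: no word char may follow the dot

# perl = TRUE selects the PCRE engine, which supports lookaheads.
orig_hits  <- grepl(original, x, perl = TRUE)
fixed_hits <- grepl(fixed, x, perl = TRUE)

# One row per test line, one column per pattern.
print(data.frame(line = x, original = orig_hits, fixed = fixed_hits))
```

With the lookahead, lines 1 and 3 match and line 2 does not, which is the result the question expected.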
Use perl=TRUE if you are using base R functions, because the default regex engine does not support lookarounds. The pattern will work as is in stringr functions, which are powered by the ICU regex library.
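As a rough illustration of both routes, the sketch below replaces the match with base R's sub() and, if the package is available, detects it with stringr::str_detect(). The replacement text "<SL>" and the variable names are made up for the example, and the stringr part is guarded so that the script still runs when stringr is not installed.

```r
# The three test lines from the question.
x <- c("aaa s.l. bbb", "aaa s.l.bbb", "aaa s.l., bbb")
pattern <- "\\bs\\.l\\.(?!\\w)"

# Base R: the default engine does not support lookaheads, so pass perl = TRUE.
# "<SL>" is an arbitrary placeholder used only to make the match visible.
base_sub <- sub(pattern, "<SL>", x, perl = TRUE)
print(base_sub)

# stringr (ICU) accepts the same pattern without any extra flag.
if (requireNamespace("stringr", quietly = TRUE)) {
  icu_hits <- stringr::str_detect(x, pattern)
  print(icu_hits)
}
```

Leaving out perl = TRUE in the sub() call would make base R reject the pattern, since the default engine cannot parse (?!\w).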