I have a regular expression that I'm using to find all the words in a given block of content, case insensitive, that are contained in a glossary stored in a database. Here's my pattern:
/($word)/i
The problem is, if I use /(Foo)/i then words like Food get matched. There needs to be whitespace or a word boundary on both sides of the word.
How can I modify my expression to match only the word Foo when it is a word at the beginning, middle, or end of a sentence?
To match any whole word you would use the pattern
(\w+)
Assuming you are using PCRE or something similar, see this live example: http://regex101.com/r/cU5lC2
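As a rough illustration of what (\w+) captures, here is a minimal sketch that uses Python's re module instead of PCRE. The sample sentence is made up for this example, and \w behaves much the same way in both engines for plain ASCII text.

```python
import re

# Hypothetical sample text, not taken from the original answer.
text = "Foo likes food, and Foo-bar too."

# \w+ matches each maximal run of word characters
# (letters, digits and underscore), so punctuation splits words.
words = re.findall(r"(\w+)", text)
print(words)
```

Note that the hyphen in Foo-bar is not a word character, so it yields two separate matches.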
Matching any whole word on the commandline with
(\w+)
I'll be using the phpsh interactive shell on Ubuntu 12.10 to demonstrate the PCRE regex engine through the preg_match function.
Start phpsh, put some content into a variable, match on word.
The preg_match method used the PCRE engine within the PHP language to analyze variables: $content1, $content2 and $content3 with the (\w)+ pattern. $content1 and $content2 contain at least one word; $content3 does not.
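The same check can be sketched in Python, as an analogue of the phpsh session rather than a copy of it. The three strings below are hypothetical stand-ins for $content1, $content2 and $content3, and re.search plays the role that preg_match plays in PHP.

```python
import re

# Hypothetical stand-ins for $content1, $content2 and $content3.
contents = {
    "content1": "badger",
    "content2": "1",
    "content3": "$%^&",
}

# re.search returns a match object if the pattern occurs anywhere in the
# string and None otherwise, so "is not None" gives a yes/no answer.
has_word = {
    name: re.search(r"(\w)+", value) is not None
    for name, value in contents.items()
}
print(has_word)
```

Only the third string fails, because it contains no letters, digits or underscores.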
Match a number of literal words on the commandline with
(dart|fart)
The variables gun1 and gun2 contain the string dart or fart; gun4 does not. However, it may be a problem that looking for the word fart also matches farty. To fix this, enforce word boundaries in the regex.
Match literal words on the commandline with word boundaries.
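To make the difference concrete, here is a small Python sketch that runs the alternation with and without \b. The sample strings are invented for illustration and are not the original gun variables.

```python
import re

# Hypothetical sample strings; the original variables are not reproduced here.
samples = ["dart gun", "fart gun", "farty gun", "unicorn gun"]

plain = re.compile(r"(dart|fart)")        # matches anywhere, even inside words
bounded = re.compile(r"\b(dart|fart)\b")  # requires a word boundary on both sides

plain_hits = [s for s in samples if plain.search(s)]
bounded_hits = [s for s in samples if bounded.search(s)]

print("without \\b:", plain_hits)
print("with \\b:   ", bounded_hits)
```

The unbounded pattern also reports "farty gun", while the bounded one does not, because the y after fart is a word character and so no boundary exists there.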
So it's the same as the previous example, except that the word fart with a \b word boundary does not exist in the content: farty.
Use word boundaries:
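Applied to the glossary problem from the question, the idea might look like the following Python sketch. The function name find_glossary_terms and the glossary entries are hypothetical. The call to re.escape is an added precaution for terms that contain regex metacharacters, not something the original pattern did.

```python
import re

def find_glossary_terms(content, glossary):
    """Return the glossary terms that occur in content as whole words."""
    found = []
    for word in glossary:
        # \b on both sides requires a word boundary around the term;
        # re.IGNORECASE mirrors the /i flag in the question.
        pattern = re.compile(r"\b(" + re.escape(word) + r")\b", re.IGNORECASE)
        if pattern.search(content):
            found.append(word)
    return found

glossary = ["Foo", "Bar"]  # hypothetical glossary entries
print(find_glossary_terms("Good food. Is foo here?", glossary))
print(find_glossary_terms("Food and barley", glossary))
```

In the first call only the standalone foo counts as a hit. In the second call neither Food nor barley matches, since each continues with a word character after the term.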
Or if you're searching for "S.P.E.C.T.R.E." like in Sinan Ünür's example:
Using \b can yield surprising results. You would be better off figuring out what separates a word from its definition and incorporating that information into your pattern.
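One such surprise can be reproduced with a short Python sketch. The sample sentence is invented, and the whitespace lookarounds shown as an alternative are only one possible choice of separator, assumed here for illustration rather than taken from the original answer.

```python
import re

term = "S.P.E.C.T.R.E."
text = "Bond fights S.P.E.C.T.R.E. agents."  # hypothetical sample sentence

# \b needs a word character on exactly one side. The term ends in "."
# and is followed by a space, so the trailing \b can never succeed.
with_b = re.search(r"\b" + re.escape(term) + r"\b", text)

# Alternative (an assumption about the separator): the term must not be
# touching non-whitespace on either side; string start and end also qualify.
with_ws = re.search(r"(?<!\S)" + re.escape(term) + r"(?!\S)", text)

print(with_b)
print(with_ws.group(0) if with_ws else None)
```

The \b version finds nothing, while the whitespace version finds the term. The whitespace version would in turn miss a term directly followed by a comma, which is why the right separator depends on how the glossary text is actually laid out.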
If you are doing it in Notepad++
Would give you the entire word, and you can add parentheses to get it as a group. Example:
conv1 = Conv2D(64, (3, 3), activation=LeakyReLU(alpha=a), padding='valid', kernel_initializer='he_normal')(inputs)
I would like to move LeakyReLU onto its own line as a comment, and replace the current activation. In Notepad++ this can be done using the following find command:
and the replace command becomes:
The spaces are there to keep the right formatting in my code. :)
Use word boundaries (\b).
The following (using four escapes) works in my environment: Mac, Safari Version 10.0.3 (12602.4.8)