I was using the standard \b word boundary. However, it doesn't quite deal with the dot (.) character the way I want it to.
So the following regex:
\b(\w+)\b
will match cats and dogs in cats.dogs if I have a string that says: cats and dogs don't make cats.dogs
I need a word boundary alternative that will match a whole word only if:
- it does not contain the dot (.) character
- it is surrounded by at least one space ( ) character on each side
Any ideas?!
P.S. I need this for PHP
You could try using (?<=\s) before and (?=\s) after in place of the \b to ensure that there is a space before and after the word. However, you might also want to allow for the word being at the start or end of the string, with (?<=\s|^) and (?=\s|$).
This will automatically exclude "words" with a . in them, but it would also exclude a word at the end of a sentence, since there is no space between it and the full stop.
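As a rough sketch of how the lookaround version behaves, here it is applied to the sample string from the question with preg_match_all(). The variable names are only illustrative, and the pattern keeps the question's \w+ for the word itself, which means don't is also skipped because the apostrophe is not a word character.

```php
<?php
// Sample string taken from the question.
$str = "cats and dogs don't make cats.dogs";

// (?<=\s|^) : the word must be preceded by whitespace or the start of the string.
// (?=\s|$)  : the word must be followed by whitespace or the end of the string.
preg_match_all('/(?<=\s|^)(\w+)(?=\s|$)/', $str, $matches);

// $matches[1] holds the captured words: cats, and, dogs, make
print_r($matches[1]);
```

Neither half of cats.dogs is matched, because one side of each half touches the dot rather than whitespace.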
What you are trying to match can be done easily with array and string functions.
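// Split on single spaces; consecutive spaces produce empty strings.
$parts = explode(' ', $str);
// Keep only the non-empty pieces that do not contain a dot.
$res = array_filter($parts, function ($e) {
    return $e !== "" && strpos($e, ".") === false;
});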
I recommend this method because it saves time; spending a few hours hunting for a good regex solution is quite unproductive.
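For comparison, here is a minimal run of the explode() approach on the question's sample string. The array_values() call is an addition, not part of the answer above: array_filter() preserves the original keys, so reindexing is useful if you want a plain list. Note that this version splits only on the space character (not tabs or newlines) and, unlike \w+, it keeps don't, since the only test is for a dot.

```php
<?php
// Sample string taken from the question.
$str = "cats and dogs don't make cats.dogs";

// Split on single spaces, then drop empty pieces and pieces containing a dot.
$parts = explode(' ', $str);
$res = array_filter($parts, function ($e) {
    return $e !== "" && strpos($e, ".") === false;
});

// array_filter() keeps the original keys; reindex to get a clean 0-based list.
$words = array_values($res);

// $words: cats, and, dogs, don't, make
print_r($words);
```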