Since I can't use preg_match (UTF-8 support is somehow broken; it works locally but breaks in production), I want to find another way to match a word against a blacklist. The problem is that I want to search a string for an exact match only, not the first occurrence of the string.
This is how I do it with preg_match:
preg_match('/\b(badword)\b/', strtolower($string));
Example string:
$string = "This is a string containing badwords and one badword";
I want to only match the "badword" (at the end) and not "badwords".
strpos($string, 'badword') matches the first one.
Any ideas?
Assuming you can do some pre-processing, you could replace all your punctuation marks with white space, put everything in lowercase, and then use strpos with something like strpos($string, ' badword ') in a while loop to keep iterating through your entire document.
If you were trying this approach, it would be something like so (untested pseudo code):
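A minimal sketch of that loop, assuming the punctuation is ASCII and words are separated by single spaces. The helper name findWholeWord and the punctuation list are hypothetical, and strtolower only lowercases ASCII letters on PHP 8+ (mb_strtolower may be preferable if the mbstring extension is available):

```php
<?php
// Hypothetical helper (illustration only): returns the byte offsets, within
// the padded and normalized text, of every whole-word match of $word.
function findWholeWord(string $text, string $word): array
{
    // Replace common ASCII punctuation with spaces so "badword." still matches.
    $punctuation = str_split('.,;:!?"\'()[]{}');
    $normalized = str_replace($punctuation, ' ', strtolower($text));

    // Pad both ends so a word at the very start or end is also delimited.
    $haystack = ' ' . $normalized . ' ';
    $needle = ' ' . strtolower($word) . ' ';

    $offsets = [];
    $pos = 0;
    while (($pos = strpos($haystack, $needle, $pos)) !== false) {
        $offsets[] = $pos;
        // Advance by one byte only, so the trailing space of this match can
        // act as the leading space of the next one ("badword badword").
        $pos++;
    }
    return $offsets;
}

$string = "This is a string containing badwords and one badword";
$matches = findWholeWord($string, 'badword');
echo count($matches) . " whole-word match(es) found\n";
```

Because only spaces act as delimiters here, tabs and newlines would need to be added to the replacement list before this could be used on multi-line text.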
EDIT: As per @jonhopkins' suggestion, adding a white space at the end should cater for the scenario where the wanted word is at the end of the document and is not followed by a punctuation mark.
If you want to mimic the \b modifier of regex, you can try something like this:
A simple way to use word boundaries with Unicode properties:
In fact, it's much more complicated; have a look here.
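One way this idea could be sketched is with lookarounds on Unicode letter and number properties in place of \b. The helper name containsWholeWord is hypothetical, and the /u modifier assumes a PCRE build with UTF-8 and Unicode property support, which may be exactly what is missing on the production server described in the question:

```php
<?php
// Hypothetical helper (illustration only): true if $word occurs in $text with
// no Unicode letter, digit, or underscore directly before or after it.
function containsWholeWord(string $text, string $word): bool
{
    $boundaryClass = '[\p{L}\p{N}_]';
    $pattern = '/(?<!' . $boundaryClass . ')'
        . preg_quote($word, '/')
        . '(?!' . $boundaryClass . ')/iu';

    // preg_match() returns 1 on a match, 0 on no match, and false on error
    // (for example, invalid UTF-8 input), so compare strictly against 1.
    return preg_match($pattern, $text) === 1;
}

$string = "This is a string containing badwords and one badword";
var_dump(containsWholeWord($string, 'badword'));
```

Unlike a plain \b without /u, the negative lookarounds treat accented letters as part of a word, so a blacklisted word glued to a non-ASCII letter is not reported as a standalone match.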
You can use strrpos() instead of strpos; it returns the position of the last occurrence rather than the first.
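As a rough illustration with the example string from the question (the variable names are placeholders), strrpos() could be used like this. Note that it only finds the last occurrence; it does not check word boundaries, so a string ending in "badwords" would still match:

```php
<?php
$string = "This is a string containing badwords and one badword";
$word = 'badword';

// strrpos() takes the haystack first and returns the byte offset of the
// LAST occurrence of the needle, or false when the needle is absent.
$pos = strrpos(strtolower($string), $word);

if ($pos !== false) {
    echo "Last occurrence of '$word' starts at byte offset $pos\n";
} else {
    echo "'$word' not found\n";
}
```

Comparing against false with !== matters here, because a match at offset 0 would otherwise be treated as "not found".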