I've got this regular expression which removes common words ($commonWords) from a string ($input), and I would like to tweak it so that it ignores hyphenated words, as these sometimes contain common words.
return preg_replace('/\b('.implode('|',$commonWords).')\b/i','',$input);
thanks
Try
return preg_replace('/(?<!-)\b('.implode('|',$commonWords).')\b(?!-)/i','',$input);
This adds negative lookaround expressions to the start and end of the regex so that a match is only allowed if there is no dash before or after the match.
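For illustration, here is a minimal self-contained sketch of the lookaround approach. The function name removeCommonWords is hypothetical, and the preg_quote step is an addition not in the original one-liner; it is only there so that words containing regex metacharacters are matched literally.

```php
<?php
// Hypothetical helper; $input and $commonWords follow the names used in the question.
function removeCommonWords(string $input, array $commonWords): string
{
    // Assumption: escape each word so characters such as "." or "/" are matched literally.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $commonWords);

    // (?<!-) rejects a match preceded by a hyphen; (?!-) rejects one followed by a hyphen.
    $pattern = '/(?<!-)\b(' . implode('|', $escaped) . ')\b(?!-)/i';

    return preg_replace($pattern, '', $input);
}

// "the" and "and" are removed when they stand alone but kept inside "cat-and-mouse".
echo removeCommonWords('the cat-and-mouse game and the dog', ['the', 'and']);
```

Note that only the words themselves are removed, so the surrounding spaces stay in the result and may need collapsing afterwards.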
preg_replace('/\b(\w+(?:-\w+)+|'.implode('|',$commonWords).')\b/i','',$input);
\w Any word character (letter, number, underscore)
This removes all the common words, and also all the words that contain a hyphen.
return preg_replace('/(?<![-\'"])\b('.implode('|',$commonWords).')\b(?![-\'"])/i','',$input);
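The above also works when more symbols than the hyphen should protect a word from removal: add them to the character class in both the lookbehind and the lookahead.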
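As a rough sketch of that idea, the protecting characters can be passed in as a string and turned into a character class. The function name removeUnattachedWords and the $adjacent parameter are hypothetical names chosen for this example, and the use of preg_quote is an assumption layered on top of the original one-liner.

```php
<?php
// Hypothetical helper; $adjacent lists the characters that protect a neighbouring word.
function removeUnattachedWords(string $input, array $commonWords, string $adjacent = '-\'"'): string
{
    // Assumption: escape each word so regex metacharacters are matched literally.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $commonWords);

    // preg_quote also escapes "-", "]" and "^", so the characters are safe inside [...].
    $class = '[' . preg_quote($adjacent, '/') . ']';

    // A word is removed only if no listed character sits directly before or after it.
    $pattern = '/(?<!' . $class . ')\b(' . implode('|', $escaped) . ')\b(?!' . $class . ')/i';

    return preg_replace($pattern, '', $input);
}

// The quoted "the" and the "and" inside rock-and-roll are kept; the bare ones are removed.
echo removeUnattachedWords('say "the" word and the rock-and-roll', ['the', 'and']);
```

Building the class from a string keeps the lookbehind and lookahead in sync, which is easy to get wrong when the two classes are edited by hand.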