Regexp to ignore hyphenated words during common wo

Posted 2019-09-03 23:32

I've got this regular expression, which removes common words ($commonWords) from a string ($input), and I would like to tweak it so that it ignores hyphenated words, as these sometimes contain common words.

return preg_replace('/\b('.implode('|',$commonWords).')\b/i','',$input);

thanks

3 answers
Deceive 欺骗
Reply #2 · 2019-09-03 23:52
preg_replace('/\b('.implode('|',$commonWords).'|\w-\w)\b/i','',$input);

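\w matches any word character (letter, digit, or underscore). This removes all the common words and also removes hyphenated words, rather than leaving them alone. Note that \w-\w as written only matches a single word character on each side of the hyphen; matching whole hyphenated words would need \w+-\w+.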

姐就是有狂的资本
Reply #3 · 2019-09-03 23:54

Try

return preg_replace('/(?<!-)\b('.implode('|',$commonWords).')\b(?!-)/i','',$input);

This adds negative lookaround expressions to the start and end of the regex so that a match is only allowed if there is no dash before or after the match.
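As a rough sketch of how the lookaround version behaves, the pattern can be wrapped in a small function and run on a sample string. The function name removeCommonWords and the sample data are made up for illustration, and the preg_quote call is an addition not present in the answer (it only guards against words that contain regex metacharacters).

```php
<?php
// Hypothetical helper; $commonWords and $input follow the names used in the question.
function removeCommonWords(string $input, array $commonWords): string
{
    // Assumption: escape each word so characters such as "." or "/" are matched literally.
    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote($word, '/');
    }, $commonWords));

    // (?<!-) rejects a match directly preceded by a hyphen,
    // (?!-) rejects a match directly followed by a hyphen.
    $pattern = '/(?<!-)\b(' . $alternatives . ')\b(?!-)/i';

    return preg_replace($pattern, '', $input);
}

// Sample data invented for this example.
echo removeCommonWords('state-of-the-art is the goal', ['the', 'of', 'is']), "\n";
```

Here "of" and "the" inside "state-of-the-art" are left alone because each touches a hyphen, while the standalone "is" and "the" are removed. The spaces around the removed words stay in place, so the result may need whitespace cleanup afterwards.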

Anthone
Reply #4 · 2019-09-03 23:57
return preg_replace('/(?<![-\'"])\b('.implode('|',$commonWords).')\b(?![-\'"])/i','',$input);

The above also works when more symbols need to be excluded: add them to the character classes in the lookbehind and the lookahead.
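One possible way to make the set of excluded symbols configurable is to build the character class from a string. This is only a sketch: the function name removeCommonWordsExcept and the $blocked parameter are hypothetical, and preg_quote is used here as an assumption to keep the class and the words safe inside the pattern.

```php
<?php
// Hypothetical helper: $blocked lists the characters that must not touch a match.
function removeCommonWordsExcept(string $input, array $commonWords, string $blocked = "-'\""): string
{
    // Build a character class such as [\-'"]; preg_quote escapes "-" and the "/" delimiter.
    $class = '[' . preg_quote($blocked, '/') . ']';

    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote($word, '/');
    }, $commonWords));

    // Skip a word if a blocked character sits directly before or after it.
    $pattern = '/(?<!' . $class . ')\b(' . $alternatives . ')\b(?!' . $class . ')/i';

    return preg_replace($pattern, '', $input);
}

// Sample data invented for this example.
echo removeCommonWordsExcept("don't stop the rock-and-roll", ['t', 'and', 'the']), "\n";
```

With the default $blocked value, the "t" in "don't" and the "and" in "rock-and-roll" are kept, and only the standalone "the" is removed.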
