php preg_replace - keep specified characters + for

Posted 2020-04-27 05:01

I need a function that removes every character not listed in the pattern from a string, but keeps foreign-language letters. I know preg_replace supports \p patterns, but I can't get it to work for some reason.

I use this call to remove all the crap from the string:

$main_content=preg_replace("/[^a-zA-Z0-9`~!@#\$%\^&\*\(\)-_=\+\\|\,<\.>\/\?;:'\"\[\]\s]/", "", $main_content); //remove all symbols that do NOT match these

Put simply, the function should keep all the standard letters, digits, and standard symbols like +-!@#$ and so on, and remove all the crap like © and ™. If there is a better way to write this preg_replace, please let me know.

Now, I want the function to keep letters in foreign languages, so I modified it to

$main_content=preg_replace("/[^\p{L}a-zA-Z0-9`~!@#\$%\^&\*\(\)-_=\+\\|\,<\.>\/\?;:'\"\[\]\s]/", "", $main_content); //remove all symbols that do NOT match these

(You will notice \p{L} added.) Unfortunately, it didn't work as expected. When I echo the text, I see that the foreign-language letters were not removed (that's good), but they were converted into � (that's bad).

How do I fix it?

1 answer
干净又极端
#2 · 2020-04-27 05:33

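\p{L} only matches whole letters in UTF-8 text when the u modifier is set. Without it, preg_replace processes the string byte by byte, which splits multibyte characters and produces the � you see: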

$main_content=preg_replace("/[^\p{L}]/u", "", $main_content);

Notice the u added after the closing /.
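
The one-liner above keeps letters only. To also keep the digits, punctuation, and whitespace from the question, the same u modifier can be applied to the full character class. The following is a minimal sketch, not a definitive implementation: the helper name keep_allowed_chars is hypothetical, the character list is copied from the question, and the hyphen is moved to the end of the class so it is a literal hyphen rather than an accidental )-_ range. It assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper (name is an assumption, not from the original post).
// Keeps Unicode letters, ASCII digits, the ASCII symbols listed in the
// question, and whitespace; removes everything else (for example © and ™).
function keep_allowed_chars(string $text): string
{
    // Single-quoted string: only \\ and \' are PHP escapes, the rest
    // reaches PCRE unchanged. The trailing "u" enables UTF-8 mode.
    // The "-" sits last in the class so it is literal, not a range.
    $pattern = '/[^\p{L}0-9`~!@#$%^&*()_=+\\\\|,<.>\/?;:\'"\[\]\s-]/u';

    $result = preg_replace($pattern, '', $text);

    if ($result === null) {
        // With /u, preg_replace() returns null on error, e.g. if $text
        // is not valid UTF-8.
        throw new InvalidArgumentException(
            'preg_replace failed, error code ' . preg_last_error()
        );
    }

    return $result;
}

echo keep_allowed_chars("café© Привет™!"), "\n"; // prints: café Привет!
```

One caveat: \p{L} covers letters but not combining marks, so text stored in decomposed form (a base letter followed by a separate accent) would lose its accents. Adding \p{M} to the class should keep them if that matters for your data.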
