Strip down everything, except alphanumeric and Eur

Posted 2020-07-17 15:46

I am working on validating my commenting script, and I need to strip down all non-alphanumeric chars except those used in Western Europe.

My plan is to regex out all non-alphanumeric characters with:

preg_replace("/[^A-Za-z0-9 ]/", '', $string);

But so far that also strips out all accented European characters and the £ sign, so "Café Rouge" becomes "Caf Rouge".

How can I add an array of European characters to the above regex?

The array is:

£, €, 
á, à, â, ä, æ, ã, å,
è, é, ê, ë,
î, ï, í, ì,
ô, ö, ò, ó, ø, õ,
û, ü, ù, ú,
ÿ,
ñ,
ß

I use UTF-8

SOLUTION:

$comment = preg_replace('/[^\p{Latin}\d\s\p{P}]/u', '', $comment);

and

$name = preg_replace('/[^\p{Latin}]/u', '', $name);

The $name pattern also removes digits, punctuation marks, and spaces.
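
A quick sanity check of the two patterns above, as a minimal sketch. The sample strings and the $cleanComment / $cleanName variable names are made up for illustration and are not from the original post. One thing worth noting: £ and € are in the Unicode category Sc (currency symbol), not P (punctuation), so the $comment pattern as written strips them. Adding \p{Sc} or the literal characters to the class would presumably keep them.

```php
<?php
// Hypothetical sample input; this file must be saved as UTF-8.
$comment = "Café Rouge costs £5, déjà vu!";
$name    = "Anne-Marie Müller 2";

// Keep Latin-script letters, digits, whitespace, and punctuation.
// The currency sign is neither, so it is removed.
$cleanComment = preg_replace('/[^\p{Latin}\d\s\p{P}]/u', '', $comment);

// Keep Latin-script letters only: hyphen, spaces, and the digit are removed.
$cleanName = preg_replace('/[^\p{Latin}]/u', '', $name);

echo $cleanComment, "\n"; // Café Rouge costs 5, déjà vu!
echo $cleanName, "\n";    // AnneMarieMüller
```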

Thanks for the quick replies.

2 answers
戒情不戒烟
#2 · 2020-07-17 16:13
echo preg_replace('/[^A-Z0-9 £€áàâä...]/ui', '', $string);

The important part is the /u flag. Make sure your source code and $string are UTF-8 encoded.
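
To illustrate why the /u flag matters, here is a rough sketch with a deliberately shortened allow-list (only é added) and a made-up input string. Without /u, the pattern and the subject are treated as raw bytes, so the two bytes of é end up in the class individually and other accented letters can be cut in half. The preg_match('//u', ...) call is used here only as a simple UTF-8 validity check.

```php
<?php
// Hypothetical input; this file must be saved as UTF-8.
$input = "Café, caffè";

// Byte mode: "é" contributes the bytes 0xC3 and 0xA9 to the class.
// "è" is 0xC3 0xA8, so only its second byte is stripped.
$withoutU = preg_replace('/[^A-Za-z0-9 é]/', '', $input);

// UTF-8 mode: "é" is one character, and "è" is removed as a whole.
$withU = preg_replace('/[^A-Za-z0-9 é]/u', '', $input);

// preg_match with /u returns false when the subject is not valid UTF-8.
echo preg_match('//u', $withoutU) === 1 ? "valid" : "invalid", "\n"; // invalid
echo $withU, "\n"; // Café caff
```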

I still think it's the wrong approach, because it severely limits what your users can enter and it will annoy some, but whatever floats your boat... BTW, your list contains no punctuation characters.

可以哭但决不认输i
#3 · 2020-07-17 16:15
preg_replace('/[^\p{Latin}\d ]/u', '', $str);
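
For comparison, a small sketch of how this script-based pattern differs from an explicit allow-list built from the array in the question. The variable names and the test string are made up for illustration. \p{Latin} accepts every Latin-script letter, including ones outside the asker's Western European list, and it drops £ and €, while the explicit list does the opposite on both counts.

```php
<?php
// The characters listed in the question; this file must be saved as UTF-8.
$extra = ['£', '€', 'á', 'à', 'â', 'ä', 'æ', 'ã', 'å', 'è', 'é', 'ê', 'ë',
          'î', 'ï', 'í', 'ì', 'ô', 'ö', 'ò', 'ó', 'ø', 'õ',
          'û', 'ü', 'ù', 'ú', 'ÿ', 'ñ', 'ß'];

// Explicit allow-list: escape the characters and drop them into the class.
$explicit = '/[^a-z0-9 ' . preg_quote(implode('', $extra), '/') . ']/ui';

// Script-based pattern from this answer.
$byScript = '/[^\p{Latin}\d ]/u';

$input = "Łódź café £5"; // hypothetical test string

$fromList   = preg_replace($explicit, '', $input);
$fromScript = preg_replace($byScript, '', $input);

echo $fromList, "\n";   // ód café £5
echo $fromScript, "\n"; // Łódź café 5
```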