Regular expressions for a range of Unicode code points

Published 2019-01-14 23:11

Question:

I'm trying to strip all characters from a string except:
1. Alphanumeric characters (a-z, A-Z, 0-9)
2. The dollar sign ($)
3. The underscore (_)
4. Unicode characters between code points U+0080 and U+FFFF

I've got the first three conditions by doing this:

preg_replace('/[^a-zA-Z\d$_]+/', '', $foo);

How do I go about matching the fourth condition? I looked at using \X but there has to be a better way than listing out 65000+ characters.

Answer:

You can use:

$foo = preg_replace('/[^\w$\x{0080}-\x{FFFF}]+/u', '', $foo);

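\w - equivalent to [a-zA-Z0-9_] for ASCII input; note that with /u, PHP also enables Unicode character properties, so \w can additionally match letters and digits outside ASCII
\x{0080}-\x{FFFF} - matches characters between code points U+0080 and U+FFFF
/u - enables UTF-8 mode, which is required for \x{...} escapes above \x{FF} and makes the subject be treated as UTF-8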
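
To see the pattern in action, here is a minimal sketch that wraps it in a small helper and runs it on a few sample strings. The function name keep_allowed_chars and the sample inputs are made up for illustration and are not part of the original answer. The sketch spells out the ASCII set as a-zA-Z0-9_ instead of using \w, which is an assumption made here to keep the kept set strictly limited to the four conditions. The \u{...} string escapes assume PHP 7 or later.

```php
<?php
// Hypothetical helper, named here for illustration only.
// Returns the input with every character removed except ASCII letters,
// digits, underscore, dollar sign, and code points U+0080..U+FFFF.
// preg_replace() returns null if the subject is not valid UTF-8 under /u.
function keep_allowed_chars(string $input): ?string
{
    // Single quotes keep \x{0080} intact so PCRE, not PHP, interprets it.
    return preg_replace('/[^a-zA-Z0-9_$\x{0080}-\x{FFFF}]+/u', '', $input);
}

$samples = [
    'price: $5_off!',   // spaces, colon and "!" are outside the kept set
    "caf\u{00E9} #1",   // U+00E9 lies inside U+0080..U+FFFF and is kept
    "ok\u{1F600}ok",    // U+1F600 lies above U+FFFF and is stripped
];

foreach ($samples as $sample) {
    var_dump(keep_allowed_chars($sample));
}
```

Because a null result signals a failed match (for example, malformed UTF-8 input), it is worth checking the return value before using it.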

Tags: php regex unicode preg-replace
