How can I match all the “special” chars (like +_*&^%$#@!~) except the char - in PHP?
I know that \W will match all the “special” chars including the -.
Any suggestions in consideration of Unicode letters?
[^-] matches everything except the one character you want to leave out.
[\W] matches all the special characters, as you know.
[^\w] matches all the special characters as well. Sounds fair?
So [^\w-] is the combination of both: all "special" characters, but without -.
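A minimal PHP sketch of that combined class, assuming preg_match_all() and a made-up sample string ($input and $specials are illustrative names, not from the question). Keep in mind that \w includes the underscore, so _ is not reported as special here.

```php
<?php
// Hypothetical sample input containing letters, "-", "+", "_" and "!".
$input = 'foo-bar+baz_qux!';

// [^\w-] : any character that is neither a word character nor "-".
// Word characters are letters, digits and the underscore.
preg_match_all('/[^\w-]/', $input, $matches);

// $matches[0] holds every full-pattern match, in order of appearance.
$specials = $matches[0];

print_r($specials); // "+" and "!" only; "-" and "_" are skipped
```

If the underscore should count as special too, one option is to list the allowed characters explicitly instead of relying on \w.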
\pL matches any character with the Unicode Letter character property, which is a major general category group; that is, it matches [\p{Ll}\p{Lt}\p{Lu}\p{Lm}\p{Lo}].
\pN matches any character with the Unicode Number character property, which is a major general category group; that is, it matches [\p{Nd}\p{Nl}\p{No}].
The Alphabetic character property also includes certain combining marks such as U+0345 ◌ͅ ᴄᴏᴍʙɪɴɪɴɢ ɢʀᴇᴇᴋ ʏᴘᴏɢᴇɢʀᴀᴍᴍᴇɴɪ. I suggest that you also include \pM, which matches any character with the Unicode Mark character property, which is a major general category group; that is, it matches [\p{Mn}\p{Me}\p{Mc}].
The - you’re referring to is not the only dash: other characters also have the Dash character property, including such common characters as U+2010 ʜʏᴘʜᴇɴ, U+2013 ᴇɴ ᴅᴀꜱʜ, U+2014 ᴇᴍ ᴅᴀꜱʜ, and U+2212 ᴍɪɴᴜꜱ ꜱɪɢɴ. Whether you actually want to include or exclude those, I have no idea.
Given all that, it is not unlikely that you want something like:
[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]
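In PHP this class needs the u pattern modifier, so that PCRE treats the pattern and the subject as UTF-8 and accepts \x{...} escapes above 0xFF. The sketch below assumes PHP 7 or later (for the "\u{...}" string escapes) and uses a made-up sample string; $input, $pattern and $specials are illustrative names.

```php
<?php
// Hypothetical sample: "naïve", an EN DASH (U+2013), "42", "+", "x", "!".
// The "\u{...}" escapes (PHP 7+) keep this file pure ASCII.
$input = "na\u{00EF}ve\u{2013}42+x!";

// Anything that is not a letter, number, mark, ASCII hyphen,
// U+2010..U+2015, or U+2212. The trailing "u" enables UTF-8 mode.
$pattern = '/[^\pL\pN\pM\x2D\x{2010}-\x{2015}\x{2212}]/u';

preg_match_all($pattern, $input, $matches);

// Full-pattern matches in order of appearance.
$specials = $matches[0];

print_r($specials); // "+" and "!"; the accented letter and the dash are skipped
```

Because the class is negated, it also matches whitespace; adding \s inside the brackets is one way to leave spaces out if that is not wanted.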
You can try this pattern:
([^a-zA-Z-])
This matches every character that is neither an ASCII letter (a-z, A-Z) nor -. Note that it also matches digits and non-ASCII letters.
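A quick way to see what this ASCII-only class catches, sketched in PHP with a made-up sample string ($input and $specials are illustrative names):

```php
<?php
// Hypothetical sample: ASCII letters, "-", "_", digits and "!".
$input = 'abc-XYZ_123!';

// [^a-zA-Z-] : any character that is not an ASCII letter and not "-".
preg_match_all('/[^a-zA-Z-]/', $input, $matches);

// Full-pattern matches in order of appearance.
$specials = $matches[0];

print_r($specials); // "_", "1", "2", "3", "!" ; digits count as matches here
```

Since digits and the underscore are matched, this pattern suits input where only ASCII letters and the hyphen should pass through.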