How can I match all the “special” chars (like +_*&^%$#@!~) except the char - in PHP?
I know that \W will match all the “special” chars including the -.
Any suggestions in consideration of Unicode letters?
You can try this pattern:
([^a-zA-Z-])
This should match every character that is not an ASCII letter (a-z, A-Z) and not the -
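To see what that character class actually catches, here is a minimal sketch. The sample string is an assumption chosen for illustration, and the "\u{...}" escape needs PHP 7 or later. Because the class only lists ASCII letters, digits, the underscore, and accented letters all count as matches.

```php
<?php
// Sample input (illustrative): ASCII letters, a hyphen, a digit,
// an underscore, punctuation, and the non-ASCII letter U+00E9.
$input = "ab-c+1_\u{00E9}!";

// /u makes PCRE read the subject as UTF-8, so U+00E9 is one character
// instead of two separate bytes.
preg_match_all('/[^a-zA-Z-]/u', $input, $matches);

// $matches[0] holds every character the class accepted.
print_r($matches[0]);
```

With this input the digit, the underscore, and the accented letter are all reported alongside + and !, which may or may not be what you mean by "special".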
[^-] means: not the one special character you want to leave out.
[\W] are all special characters, as you know.
[^\w] are all special characters as well. Sounds fair?
So therefore [^\w-] is the combination of both: all "special" characters but without -.
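A short sketch of the combined class in PHP, under two assumptions: the sample string is made up, and the /u modifier is used so that \w also covers non-ASCII letters (PHP 7+ for the "\u{...}" escape). Note that \w includes the underscore, so _ is not reported as special here, while whitespace is.

```php
<?php
// Sample input (illustrative): a word with U+00EF, a hyphen,
// an underscore, a digit, a space, and some punctuation.
$input = "na\u{00EF}ve-text_1 +*&!";

// [^\w-] = neither a word character nor the hyphen.
// With /u the subject is UTF-8 and \w also matches Unicode letters and digits.
preg_match_all('/[^\w-]/u', $input, $matches);

// Expect only the space and the punctuation; letters, the digit,
// the underscore, and the hyphen are skipped.
print_r($matches[0]);
```

If the underscore should count as special too, it has to be handled separately, since both \W and [^\w-] leave it out.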
\pL matches any character with the Unicode Letter character property, which is a major general category group; that is, it matches [\p{Ll}\p{Lt}\p{Lu}\p{Lm}\p{Lo}].
\pN matches any character with the Unicode Number character property, which is a major general category group; that is, it matches [\p{Nd}\p{Nl}\p{No}].
The Alphabetic character property also includes certain combining marks such as U+0345 ◌ͅ ᴄᴏᴍʙɪɴɪɴɢ ɢʀᴇᴇᴋ ʏᴘᴏɢᴇɢʀᴀᴍᴍᴇɴɪ. I suggest that you also include \pM, which matches any character with the Unicode Mark character property, which is a major general category group; that is, it matches [\p{Mn}\p{Me}\p{Mc}].
It is also unclear which - you’re referring to. Many characters have the Dash character property, including such common characters as U+2010 ʜʏᴘʜᴇɴ, U+2013 ᴇɴ ᴅᴀꜱʜ, U+2014 ᴇᴍ ᴅᴀꜱʜ, and U+2212 ᴍɪɴᴜꜱ ꜱɪɢɴ. Whether you actually want to include or exclude those, I have no idea.
Given all that, it is not unlikely that you want something like:
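To show how the \pL, \pN, and \pM properties described above behave in PHP, here is a minimal sketch. The pattern, the helper name findSpecialChars, and the sample string are all assumptions made for illustration, not the pattern this answer goes on to give. It excludes only the ASCII hyphen-minus, so other dashes such as U+2014 still match.

```php
<?php
// Hypothetical helper: returns every character that is not a Letter,
// Number, or Mark, and is not the ASCII hyphen-minus.
function findSpecialChars(string $text): array
{
    // /u is required so PCRE reads UTF-8 and \p properties apply per character.
    // The trailing - inside the class is a literal hyphen.
    preg_match_all('/[^\pL\pN\pM-]/u', $text, $matches);
    return $matches[0];
}

// Sample input (illustrative): "e" + combining acute (a Mark), an accented
// letter, a hyphen, digits, an em dash (U+2014), and ASCII punctuation.
$input = "e\u{0301}t\u{00E9}-42 \u{2014} +_!";

print_r(findSpecialChars($input));
```

Unlike the \w-based class, this one reports the underscore, because _ is punctuation rather than a letter, number, or mark. Spaces are reported as well.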