This question already has an answer here: how to replace special characters with the ones they're based on in PHP? (8 answers)
I am trying to parse a string by splitting it on anything that is not a letter or a number:
$parse_query_arguments = preg_split("/[^a-z0-9]+/i", 'København');
and then construct a MySQL query from the pieces.
Even if I skip the preg_split and enter the string directly, it gets broken into two different strings, 'K' and 'benhavn'.
How can I deal with these issues?
If you're using a literal range like a-z, it won't match accented characters. You might want to use the POSIX character classes instead to do more generic matching:
/[[:alpha:][:digit:]]/
The [:alpha:] set is much broader in scope than a-z. Remember that character matching is done by character code: a-z literally means the characters whose codes lie between those of a and z. Characters like ø lie outside this range even if they'd fall between a and z alphabetically.
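To make the character-code point concrete, here is a small sketch that prints the code points involved. It assumes PHP 7.2 or later with the mbstring extension (for mb_ord) and a source file saved as UTF-8; neither is stated in the question.

```php
<?php
// Print the Unicode code point of each character.
// Assumes PHP >= 7.2 with mbstring and a UTF-8 encoded source file.
foreach (['a', 'z', 'ø'] as $char) {
    printf("%s => %d\n", $char, mb_ord($char, 'UTF-8'));
}

// 'a' is 97 and 'z' is 122, while 'ø' is 248 (U+00F8),
// so 'ø' is not inside the range a-z.
$code = mb_ord('ø', 'UTF-8');
$inRange = $code >= mb_ord('a', 'UTF-8') && $code <= mb_ord('z', 'UTF-8');
var_dump($inRange); // bool(false)
```

So a range in a character class is about numeric position, not about where a letter would be filed in a dictionary.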
Computers work in ASCII-abetical (UNICODEical?) order.
This might help explain what is going on in your regex: Regex and Unicode.
You could try something like \p{L}, as explained in this question.
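As a rough sketch of how \p{L} could be applied to the split from the question: the pattern below negates "any Unicode letter or number" and adds the u modifier so that PCRE treats the pattern and the subject as UTF-8. The sample string, the variable names, and the PREG_SPLIT_NO_EMPTY flag are illustrative choices of mine, and the code assumes the input really is valid UTF-8.

```php
<?php
// Hypothetical sample input; assumes a UTF-8 encoded source file.
$input = 'København, Århus 2024';

// ASCII-only class without the u modifier: the bytes of ø and Å
// are treated as separators, so the words get cut apart.
$ascii = preg_split('/[^a-z0-9]+/i', $input, -1, PREG_SPLIT_NO_EMPTY);
print_r($ascii); // K, benhavn, rhus, 2024

// Unicode-aware class: \p{L} is any letter, \p{N} any number,
// and the u modifier makes PCRE read the string as UTF-8.
$unicode = preg_split('/[^\p{L}\p{N}]+/u', $input, -1, PREG_SPLIT_NO_EMPTY);
print_r($unicode); // København, Århus, 2024
```

If the pieces then go into a MySQL query, they should still be passed through prepared statements or proper escaping; the split only decides where the words end.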