I am making a Swedish website, and the Swedish alphabet includes the letters å, ä, and ö.
I need to make a string entered by a user URL-safe with PHP.
Basically, I need to convert all characters to underscores, EXCEPT these:
A-Z, a-z, 1-9
The Swedish letters should be converted like this:
'å' to 'a', 'ä' to 'a', and 'ö' to 'o' (just remove the dots above).
Everything else should become an underscore, as I said.
I'm not good at regular expressions, so I would appreciate the help, guys!
Thanks
NOTE: NOT URLENCODE. I need to store it in a database etc., so urlencode won't work for me.
If you're just interested in making things URL safe, then you want urlencode().
If you really want to strip all non A-Z, a-z, 1-9 (what's wrong with 0, by the way?), then you want:
As simple as
assuming you use the same encoding for your data and your code.
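A minimal sketch of what such a replacement could look like, assuming both the input and the source file are UTF-8. The function name make_url_safe is made up for illustration, and the sketch keeps 0 as well, which the question's 1-9 range leaves out:

```php
<?php
// Hypothetical helper: map the Swedish letters to ASCII, then replace
// everything that is not an ASCII letter or digit with an underscore.
function make_url_safe(string $text): string
{
    // strtr() with an array replaces whole substrings, so the multi-byte
    // UTF-8 letters are matched as complete sequences.
    $text = strtr($text, [
        'å' => 'a', 'ä' => 'a', 'ö' => 'o',
        'Å' => 'A', 'Ä' => 'A', 'Ö' => 'O',
    ]);

    // The /u flag treats the subject as UTF-8, so one multi-byte character
    // becomes one underscore. preg_replace() returns null on invalid UTF-8,
    // in which case this sketch returns an empty string.
    return preg_replace('/[^A-Za-z0-9]/u', '_', $text) ?? '';
}

echo make_url_safe('Hej på dig, Örjan!'), "\n";
```

Runs of punctuation and spaces produce runs of underscores here; collapsing them would only need a `+` after the character class.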
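Use normalizer_normalize() with Normalizer::FORM_D to decompose accented letters into a base letter plus combining marks, then strip those marks to get rid of the diacritics.
Use preg_replace() with a pattern of [\W] (in other words: any character which isn't a letter, digit or underscore) to replace the remaining characters with underscores.
Final result should look like: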
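A minimal sketch of these two steps, assuming the intl extension is loaded and the input is UTF-8. The function name slugify_normalized is hypothetical. The sketch swaps [\W] for an explicit ASCII class, because in PHP the meaning of \W changes once the /u flag is added (it then treats non-ASCII letters as word characters):

```php
<?php
// Hypothetical helper; requires the intl extension for normalizer_normalize().
function slugify_normalized(string $input): string
{
    // Step 1a: canonical decomposition, e.g. "å" becomes "a" + U+030A.
    $decomposed = normalizer_normalize($input, Normalizer::FORM_D);
    if ($decomposed === false) {
        return ''; // input was not valid UTF-8
    }

    // Step 1b: drop the combining marks (Unicode category Mn).
    $base = (string) preg_replace('/\p{Mn}+/u', '', $decomposed);

    // Step 2: anything that is not an ASCII letter, digit or underscore
    // becomes an underscore.
    return (string) preg_replace('/[^A-Za-z0-9_]/u', '_', $base);
}

echo slugify_normalized('Smörgås bord!'), "\n";
```

Letters that have no canonical decomposition, such as 'æ' or 'ø', are not changed by the first step and therefore end up as underscores.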
One simple solution is to use str_replace function with search and replace letter arrays.
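As a sketch of that idea, assuming UTF-8 input in precomposed form (each of å, ä, ö stored as a single code point) and a UTF-8 source file; the function name replace_swedish is invented for the example:

```php
<?php
// Hypothetical helper built on str_replace() with parallel arrays.
function replace_swedish(string $input): string
{
    // The n-th entry of $search is replaced by the n-th entry of $replace.
    $search  = ['å', 'ä', 'ö', 'Å', 'Ä', 'Ö'];
    $replace = ['a', 'a', 'o', 'A', 'A', 'O'];
    $ascii   = str_replace($search, $replace, $input);

    // Everything that is still not an ASCII letter or digit becomes "_".
    // preg_replace() returns null on invalid UTF-8; fall back to ''.
    return preg_replace('/[^A-Za-z0-9]/u', '_', $ascii) ?? '';
}

echo replace_swedish('Räksmörgås är gott'), "\n";
```

The arrays only cover the letters listed, so any other accented letter (é, ü, ...) falls through to the underscore rule.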
This should be useful; it handles almost all cases.
If the intl PHP extension is enabled, you can use Transliterator like this:
To remove other special characters as well (not only diacritics, e.g. 'æ'):
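Both variants could be sketched as follows, assuming the intl extension is loaded and the input is UTF-8. The function names are hypothetical; the transliterator IDs are standard ICU ones, though their availability depends on the ICU version PHP was built against:

```php
<?php
// Hypothetical helper: remove diacritics only ("å" -> "a"), leave "æ" alone.
function strip_diacritics(string $input): string
{
    $result = transliterator_transliterate(
        'NFD; [:Nonspacing Mark:] Remove; NFC',
        $input
    );
    return $result === false ? $input : $result;
}

// Hypothetical helper: also fold other special letters to ASCII ("æ" -> "ae").
function to_ascii(string $input): string
{
    $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $input);
    return $result === false ? $input : $result;
}

// Fold to ASCII first, then turn whatever is left into underscores.
function transliterated_slug(string $input): string
{
    return preg_replace('/[^A-Za-z0-9]/u', '_', to_ascii($input)) ?? '';
}

echo strip_diacritics('åäö'), "\n";
echo transliterated_slug('smörgåsbord æ'), "\n";
```

The first ID removes combining marks after decomposition, so it leaves letters such as 'æ' untouched; the second ID is the one that folds them to plain ASCII.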