I need to replace all local characters (including uppercase ones) with lowercase ASCII characters in a URL string.
$str = "č-ć-đ-š-ž-Č-Ć-Đ-Š-Ž";
echo str_ireplace(array('č', 'ć', 'đ', 'š', 'ž'), array('c', 'c', 'd', 's', 'z'), $str);
result - c-c-d-s-z-Č-Ć-Đ-Š-Ž
I expected - c-c-d-s-z-c-c-d-s-z
How can I get the expected result using the str_ireplace() function?
Most of the PHP string functions handle strings as sequences of bytes, i.e. single-byte characters (ASCII).
You want to replace characters in a string that contains multi-byte characters.
str_replace() (kind of) works because it does not try to interpret the strings as characters. It replaces one sequence of bytes with another sequence of bytes, and that's all. Most of the time it will not break anything when working with ASCII or even UTF-8 encoded strings (because of the way UTF-8 was designed). However, it can produce unexpected results with other encodings.
When asked to handle characters outside the ASCII range, [str_ireplace()](http://php.net/manual/en/function.str-ireplace.php) works the same as str_replace(). Its "case-insensitive" functionality requires splitting the strings into characters and recognizing the lowercase-uppercase pairs. But since it doesn't handle multi-byte characters, it cannot recognize any character whose code is greater than 127.
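To make the byte-level view concrete, here is a small sketch that prints the UTF-8 bytes of č and Č. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded for mb_strlen(); the variable names are illustrative only.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$lower = 'č'; // U+010D
$upper = 'Č'; // U+010C

// Each character is stored as two bytes, and the two cases differ in the second byte.
echo bin2hex($lower), "\n";            // c48d
echo bin2hex($upper), "\n";            // c48c

// Byte-oriented functions see two bytes; mbstring sees one character.
echo strlen($lower), "\n";             // 2
echo mb_strlen($lower, 'UTF-8'), "\n"; // 1
```

Neither byte falls in the ASCII range, so a byte-wise case fold has nothing it can pair up between the two sequences.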
For multi-byte character strings you should use the functions provided by the Multibyte String PHP extension.
The only function it provides for string replacement is mb_ereg_replace() (with the case-insensitive version mb_eregi_replace()), but they don't help you very much because they don't work with arrays.
If the list of characters you want to replace is fixed, my suggestion is to use str_replace() with a list of characters that includes both cases:
$str = "č-ć-đ-š-ž-Č-Ć-Đ-Š-Ž";
echo str_replace(
array('č', 'ć', 'đ', 'š', 'ž', 'Č', 'Ć', 'Đ', 'Š', 'Ž'),
array('c', 'c', 'd', 's', 'z', 'c', 'c', 'd', 's', 'z'),
$str
);
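As a possible variation on the same idea, the fixed list can be kept as a single map and applied with strtr(), which keeps each character next to its replacement. This is only a sketch: the function name replace_local_chars() is hypothetical, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper; the name and the map are illustrative only.
function replace_local_chars(string $str): string
{
    // Keys are UTF-8 byte sequences; both cases are listed explicitly
    // because the replacement is byte-based and not case-aware.
    $map = [
        'č' => 'c', 'ć' => 'c', 'đ' => 'd', 'š' => 's', 'ž' => 'z',
        'Č' => 'c', 'Ć' => 'c', 'Đ' => 'd', 'Š' => 's', 'Ž' => 'z',
    ];

    return strtr($str, $map);
}

echo replace_local_chars("č-ć-đ-š-ž-Č-Ć-Đ-Š-Ž"), "\n"; // c-c-d-s-z-c-c-d-s-z
```

With an array argument, strtr() does the same byte-sequence substitution as str_replace(), so it has the same limitation with non-UTF-8 encodings.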
You are mixing up uppercase and lowercase characters. Č has a different Unicode code point than č, so they are not the same character.
Try the following:
<?php
$str = "č-ć-đ-š-ž-Č-Ć-Đ-Š-Ž";
echo str_ireplace(array('č', 'ć', 'đ', 'š', 'ž', 'Č', 'Ć', 'Đ', 'Š', 'Ž'), array('c', 'c', 'd', 's', 'z', 'c', 'c', 'd', 's', 'z' ), $str);
?>
You can convert the string to lowercase first:
$str = "č-ć-đ-š-ž-Č-Ć-Đ-Š-Ž";
echo str_ireplace(array('č', 'ć', 'đ', 'š', 'ž'), array('c', 'c', 'd', 's', 'z'), mb_strtolower($str, "UTF-8"));
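One thing to keep in mind with this approach is that mb_strtolower() lowercases every letter, including plain ASCII ones, which is usually fine for a URL. A minimal sketch, assuming the mbstring extension is loaded and the file is saved as UTF-8 (the function name lowercase_then_replace() is hypothetical):

```php
<?php
// Hypothetical helper; assumes mbstring is available.
function lowercase_then_replace(string $str): string
{
    // Fold case with a multi-byte aware function first...
    $lower = mb_strtolower($str, 'UTF-8');

    // ...so a lowercase-only list is enough for the byte-based replacement.
    return str_replace(
        ['č', 'ć', 'đ', 'š', 'ž'],
        ['c', 'c', 'd', 's', 'z'],
        $lower
    );
}

echo lowercase_then_replace("Žuta-Kuća"), "\n"; // zuta-kuca
```

Since the case folding has already been done, plain str_replace() is sufficient here and str_ireplace() is not needed.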