I have these two strings:
$str1 = 'Ö';
$str2 = 'Ö';
$e1 = mb_detect_encoding($str1);
$e2 = mb_detect_encoding($str2);
var_dump($str1);
var_dump($str2);
echo 'e1: '.$e1.', e2: '.$e2;
The result is:
string(3) "Ö"
string(2) "Ö"
e1: UTF-8, e2: UTF-8
It seems that they are not only German characters, but each of them is also encoded differently, so converting them to ASCII the way described in "PHP: Replace umlauts with closest 7-bit ASCII equivalent in an UTF-8 string" doesn't produce equal results. Is there a way to convert both of these strings to one of these ASCII forms: BNOE or BNO?
I know that I could copy the Ö from both strings and include them in a strtr search-and-replace array, but I don't know how to reproduce all the other characters encoded the same way as the first Ö.
These are two different ways to express the same letter in Unicode: one is an O followed by a combining diaeresis, the other is the precomposed letter Ö. Unicode allows either variant to express "Ö".
To normalize the strings into your preferred variant, use Normalizer::normalize. You most likely want Form C, which converges on "Ö" (the single-letter form). If you prefer "O" + combining diaeresis, use Form D instead.
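As a minimal sketch of that normalization, assuming the intl extension is loaded (it provides the Normalizer class) and PHP 7 or later for the `\u{...}` escapes. The variable names are only illustrative; the two strings are written with escapes so that the two encodings stay visibly different:

```php
<?php
// Requires the intl extension, which provides the Normalizer class.

$decomposed  = "O\u{0308}"; // "O" + U+0308 COMBINING DIAERESIS (3 bytes in UTF-8)
$precomposed = "\u{00D6}";  // U+00D6, the single letter "Ö" (2 bytes in UTF-8)

var_dump($decomposed === $precomposed); // bool(false): different byte sequences

// Form C composes: both inputs end up as the single code point U+00D6.
$nfc1 = Normalizer::normalize($decomposed, Normalizer::FORM_C);
$nfc2 = Normalizer::normalize($precomposed, Normalizer::FORM_C);
var_dump($nfc1 === $nfc2); // bool(true)
var_dump(strlen($nfc1));   // int(2)

// Form D decomposes: the precomposed letter becomes "O" + U+0308.
$nfd = Normalizer::normalize($precomposed, Normalizer::FORM_D);
var_dump(strlen($nfd));    // int(3)
```

Once both strings are in Form C, an existing umlaut-to-ASCII replacement only has to handle the precomposed letters.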
Extending Andreas's answer: these characters are a letter followed by a combining diaeresis (U+0308). I was able to search for them and replace them with standard umlauts, then replace those with whatever is needed. This is the function I've used to replace them:
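The function itself is not reproduced here. The following is only a sketch of the same two-step idea, not the original code: the name `umlauts_to_ascii` and the replacement table are assumptions chosen for illustration, and it needs PHP 7 or later for the `\u{...}` escapes.

```php
<?php
// Hypothetical helper: the name and both replacement tables are illustrative.
function umlauts_to_ascii(string $s): string
{
    // Step 1: fold "letter + U+0308 (combining diaeresis)" into the
    // precomposed umlaut, so that both encodings look the same afterwards.
    $composed = strtr($s, [
        "A\u{0308}" => "\u{00C4}", // Ä
        "O\u{0308}" => "\u{00D6}", // Ö
        "U\u{0308}" => "\u{00DC}", // Ü
        "a\u{0308}" => "\u{00E4}", // ä
        "o\u{0308}" => "\u{00F6}", // ö
        "u\u{0308}" => "\u{00FC}", // ü
    ]);

    // Step 2: replace the precomposed umlauts with an ASCII spelling.
    return strtr($composed, [
        "\u{00C4}" => 'AE',
        "\u{00D6}" => 'OE',
        "\u{00DC}" => 'UE',
        "\u{00E4}" => 'ae',
        "\u{00F6}" => 'oe',
        "\u{00FC}" => 'ue',
        "\u{00DF}" => 'ss', // ß
    ]);
}

var_dump(umlauts_to_ascii("O\u{0308}")); // string(2) "OE"
var_dump(umlauts_to_ascii("\u{00D6}"));  // string(2) "OE"
```

To get "O" instead of "OE", change the values in the second table. A table like this only covers the letters it lists; Normalizer::normalize handles every combining sequence.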
You could first convert your input to UTF-8 using iconv and then apply your conversion to ASCII. To detect the current encoding, you can use mb_detect_encoding. Please note that you might have to add additional encodings to the encoding list of mb_detect_encoding.
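A rough sketch of that detect-then-convert step, assuming the mbstring and iconv extensions are available. The helper name `to_utf8` and the candidate list are assumptions; extend the list with whatever encodings your input may actually use:

```php
<?php
// Hypothetical helper: detect the encoding from a candidate list,
// then convert to UTF-8 with iconv.
function to_utf8(string $s, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    // Strict mode (third argument) rejects encodings in which $s is invalid.
    $enc = mb_detect_encoding($s, $candidates, true);
    if ($enc === false || $enc === 'UTF-8') {
        return $s; // unknown or already UTF-8: leave the input untouched
    }

    $converted = iconv($enc, 'UTF-8', $s);
    return $converted === false ? $s : $converted;
}

$latin1 = "\xD6"; // "Ö" encoded as ISO-8859-1 (a single byte)
var_dump(bin2hex(to_utf8($latin1))); // string(4) "c396"
```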