iconv returns strange results

Posted 2019-07-21 02:33

I'm working on handling special characters in an automated PHP script that creates accounts. Special characters are unwanted in email addresses and a few other places, so I'm trying to get rid of them there, but I can't strip them before feeding the data to the script because the user's name has to be displayed properly to other users.

Example: Jörgen Götz should get the email address jorgen.gotz@domain.com, but in the user database his first name should still be Jörgen and his last name Götz. I hope I'm not too unclear about what I want to achieve.

I've been experimenting with iconv() but I'm having some trouble with it. See code below.

$utf8_sentence = 'Weiß, Goldmann, Göbel, Weiss, Göthe, Goethe und Götz';

setlocale(LC_ALL, 'en_GB');

echo $trans_sentence = iconv('UTF-8', 'ASCII//TRANSLIT', $utf8_sentence);

The code above should return

Weiss, Goldmann, Gobel, Weiss, Gothe, Goethe und Gotz

but instead it gives me

Weiss, Goldmann, G"obel, Weiss, G"othe, Goethe und G"otz

I can't understand what the quotations are doing there.

Both Chrome and IE give me the same result, and the page uses charset="utf-8".

Before using iconv() I tried strtr() together with an array of "unwanted" characters, but I don't like the solution of having to define an array of special characters every time I need to convert strings back and forth.

Can anyone offer an explanation or a solution?

2 Answers
老娘就宠你
#2 · 2019-07-21 03:04

TRANSLIT tries to find characters that look similar to the requested character. Since the letter ö is not in ASCII, it is replaced by a pair of characters: one standing in for the umlaut (here the double quote) followed by the base letter o.
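If the stand-in characters are the only problem, one possible workaround is to strip them after transliteration. The sketch below is not from the original answer: `strip_translit_marks` is a hypothetical helper, and the list of marks is an assumption based on the output shown in the question. It works on the already transliterated string, so it does not depend on which iconv implementation produced it.

```php
<?php
// Hypothetical helper (not part of the original answer).
// Removes characters that some iconv builds emit in place of diacritics,
// such as the double quote in G"obel. The mark list is an assumption.
// Caveat: the apostrophe is stripped too, so O'Brien would become OBrien.
function strip_translit_marks(string $ascii): string
{
    return str_replace(['"', "'", '`', '^', '~'], '', $ascii);
}

// Input taken from the output reported in the question.
$from_iconv = 'Weiss, Goldmann, G"obel, Weiss, G"othe, Goethe und G"otz';

echo strip_translit_marks($from_iconv), "\n";
// prints: Weiss, Goldmann, Gobel, Weiss, Gothe, Goethe und Gotz
```

This keeps the display name untouched in the database and only cleans the copy used for the email address.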

做个烂人
#3 · 2019-07-21 03:07

Try adding this to your system (terminal in Ubuntu):

sudo locale-gen de_DE.UTF-8

Then change the locale in your PHP script:

setlocale(LC_ALL, 'de_DE.UTF-8');
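Because setlocale() returns false when the requested locale is not available, it can help to check the return value before relying on it. The following is a minimal sketch that combines this check with the code from the question. The second spelling `de_DE.utf8` is an assumption about how some systems name the same locale. The transliterated result depends on the iconv implementation and the installed locale data, so no expected output is shown.

```php
<?php
// Locale names differ between systems, so two common spellings are tried.
// setlocale() returns the name it applied, or false if none is available.
$locale = setlocale(LC_ALL, 'de_DE.UTF-8', 'de_DE.utf8');

if ($locale === false) {
    echo "de_DE.UTF-8 is not installed; iconv will use the current locale.\n";
} else {
    echo "Locale in use: {$locale}\n";
}

$utf8_sentence = 'Weiß, Goldmann, Göbel, Weiss, Göthe, Goethe und Götz';

// iconv() returns false if the conversion is not possible on this platform.
$trans_sentence = iconv('UTF-8', 'ASCII//TRANSLIT', $utf8_sentence);

echo $trans_sentence === false
    ? "Transliteration failed\n"
    : $trans_sentence . "\n";
```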

Edit (Windows setup)

On Windows Server, you have to install the German language pack and change the call above to:

setlocale(LC_ALL, 'germany');