PHP function iconv character encoding from iso-885

I'm trying to convert a string from iso-8859-1 to utf-8. But when the string contains either of these two characters, € and •, the function returns a character that is displayed as a square with two numbers inside.

How can I solve this issue?

Tags: php utf-8 character-encoding iconv

4 answers

The star\"

#2 · 2019-05-25 00:16

I think the encoding you are looking for is Windows code page 1252 (Western European). It is not the same as ISO-8859-1 (or 8859-15 for that matter); the characters in the range 0xA0-0xFF match 8859-1, but cp1252 adds an assortment of extra characters in the range 0x80-0x9F where ISO-8859-1 assigns little-used control codes.

The confusion comes about because when you serve a page as text/html;charset=iso-8859-1, for historical reasons, browsers actually use cp1252 (and will hence submit forms in cp1252 too).

iconv('cp1252', 'utf-8', "\x80 and \x95")
-> "\xe2\x82\xac and \xe2\x80\xa2"
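To make the difference concrete, here is a minimal sketch that decodes the same two bytes under both charsets and prints the resulting UTF-8 as hex. The helper name utf8_hex_from is hypothetical (it is not from this thread), and the sketch assumes an iconv build that knows the CP1252 charset name, as glibc does. Decoding as ISO-8859-1 yields the C1 control characters U+0080 and U+0095, which is most likely what shows up as a box with two hex digits; decoding as CP1252 yields € and •.

```php
<?php
// Hypothetical helper (not from the thread): decode raw bytes using the given
// source charset and return the UTF-8 result as a hex string for inspection.
function utf8_hex_from(string $bytes, string $charset): string
{
    $utf8 = iconv($charset, 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException("iconv could not convert from $charset");
    }
    return bin2hex($utf8);
}

// The two bytes browsers send for € and • on a page declared as iso-8859-1.
$raw = "\x80 and \x95";

echo utf8_hex_from($raw, 'ISO-8859-1'), "\n"; // c28020616e6420c295     (U+0080, U+0095)
echo utf8_hex_from($raw, 'CP1252'), "\n";     // e282ac20616e6420e280a2 (€, •)
```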


贪生不怕死

#3 · 2019-05-25 00:19

Those 2 characters are illegal in iso-8859-1 (did you mean iso-8859-15?)

$ php -r 'echo iconv("utf-8","iso-8859-1//TRANSLIT","ter € and • the");'
ter EUR and o the


叼着烟拽天下

#4 · 2019-05-25 00:20

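iso-8859-1 doesn't contain the € sign, so your string cannot be interpreted as iso-8859-1 if it contains it. Use iso-8859-15 instead, which has € at 0xA4. Note that iso-8859-15 has no • character either, so this only helps with the € sign.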
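As a quick check of where the € sign sits in iso-8859-15, the sketch below decodes the single byte 0xA4 under both charsets. The helper name latin_to_utf8_hex is made up for illustration. This only applies if the input really is iso-8859-15; if the € arrives as byte 0x80, the data is more likely cp1252.

```php
<?php
// Hypothetical helper: decode bytes from a single-byte charset and return
// the UTF-8 result as hex so the code point is easy to see.
function latin_to_utf8_hex(string $bytes, string $charset): string
{
    $utf8 = iconv($charset, 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException("iconv could not convert from $charset");
    }
    return bin2hex($utf8);
}

echo latin_to_utf8_hex("\xA4", 'ISO-8859-15'), "\n"; // e282ac (€, U+20AC)
echo latin_to_utf8_hex("\xA4", 'ISO-8859-1'), "\n";  // c2a4   (¤, U+00A4)
```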


#5 · 2019-05-25 00:24

Always check your encoding first! You should never blindly trust your encoding (even if it is from your own website!):

function convert_cp1252_to_utf8($input, $default = '') {
    if ($input === null || $input == '') {
        return $default;
    }

    // https://en.wikipedia.org/wiki/UTF-8
    // https://en.wikipedia.org/wiki/ISO/IEC_8859-1
    // https://en.wikipedia.org/wiki/Windows-1252
    // http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
    $encoding = mb_detect_encoding($input, array('Windows-1252', 'ISO-8859-1'), true);
    if ($encoding == 'ISO-8859-1' || $encoding == 'Windows-1252') {
        /*
         * ISO-8859-1 and CP1252 differ only in the range 0x80 through 0x9F,
         * where ISO-8859-1 has control characters and CP1252 has printable
         * ones, so always convert from Windows-1252 to UTF-8.
         */
        $input = iconv('Windows-1252', 'UTF-8//IGNORE', $input);
    }
    return $input;
}
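One way to "check first" is to test whether the input is already valid UTF-8 before converting, so that text which is already UTF-8 does not get encoded a second time. The sketch below is an assumption-laden variant, not the function above: to_utf8_if_needed is a hypothetical name, it requires the mbstring extension for mb_check_encoding, and it assumes that anything which is not valid UTF-8 is CP1252.

```php
<?php
// Hypothetical helper: leave valid UTF-8 untouched, otherwise treat the
// bytes as CP1252 (an assumption about the data source) and convert.
function to_utf8_if_needed(string $input): string
{
    // Empty strings and valid UTF-8 (which includes plain ASCII) pass through.
    if ($input === '' || mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }

    $converted = iconv('CP1252', 'UTF-8', $input);
    if ($converted === false) {
        // Bytes such as 0x81 are undefined in CP1252 and make iconv fail.
        throw new RuntimeException('Input is neither valid UTF-8 nor valid CP1252');
    }
    return $converted;
}

// CP1252 bytes are converted; text that is already UTF-8 is returned as-is.
var_dump(to_utf8_if_needed("\x80 and \x95") === "\u{20AC} and \u{2022}"); // bool(true)
var_dump(to_utf8_if_needed("\u{20AC}") === "\u{20AC}");                   // bool(true)
```

The validity check is a heuristic: a CP1252 string whose bytes happen to form valid UTF-8 would be passed through unchanged, although that is rare in real text.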
