Converting an Ajax response from ISO-8859-1 to UTF-8

Posted 2019-09-20 11:09

I use an Ajax call to receive a response in Hebrew. The results come from a different site and are ISO-8859-1 encoded. My page is UTF-8. The response looks like Cyrillic:

îéãò ìî÷áì  áæ÷ äçáøä äéùøàìéú  àéï 

When I try to use this header on the Ajax page:

header('Content-Type: text/html; charset=ISO-8859-1');

I get this result:

îéãò ìî÷áì  áæ÷ äçáøä äéùøà ìéú à éï  

utf8_encode on the response did not seem to help.

What should I do to decode it correctly?

Thanks!

Edit:

I just noticed that the page that displays the data is encoded as ISO-8859-1, but the header of the specific response carrying the data sets the charset to windows-1255.

What I did now is set the header to:

header('Content-Type: text/html; charset=windows-1255');

and on the PHP side I added iconv and simply echo the result:
echo iconv("WINDOWS-1255","UTF-8",$response);
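A minimal sketch of that setup, with one assumption worth flagging: once the body has been converted, the bytes being sent are UTF-8, so the header probably ought to declare UTF-8 rather than windows-1255. Here $response is a hard-coded stand-in for the body fetched from the other site, not the real data.

```php
<?php
// $response stands in for the raw body fetched from the remote site (assumption).
// These are the Windows-1255 bytes for the word "בזק", written as hex escapes
// so the example does not depend on how this file itself is saved.
$response = "\xE1\xE6\xF7";

// Convert once, on the server, from the charset the remote header declares.
$utf8 = iconv('WINDOWS-1255', 'UTF-8', $response);
if ($utf8 === false) {
    // iconv() returns false when the input has bytes that are invalid in the source charset.
    $utf8 = '';
}

// Declare the encoding of what is actually sent: the converted text is UTF-8.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
echo $utf8;
```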

4 answers
狗以群分
#2 · 2019-09-20 11:39

The string you give in your question

îéãò ìî÷áì  áæ÷ äçáøä äéùøàìéú  àéï 

looks like the Windows 1252 (Latin 1) rendering of bytes that are actually encoded in the Windows 1255 (Hebrew) code page:

EE Windows 1252 î Windows 1255 מ - HEBREW LETTER MEM
E9 Windows 1252 é Windows 1255 י - HEBREW LETTER YOD
E3 Windows 1252 ã Windows 1255 ד - HEBREW LETTER DALET
F2 Windows 1252 ò Windows 1255 ע - HEBREW LETTER AYIN
20 Windows 1252   Windows 1255   - SPACE
EC Windows 1252 ì Windows 1255 ל - HEBREW LETTER LAMED
EE Windows 1252 î Windows 1255 מ - HEBREW LETTER MEM
F7 Windows 1252 ÷ Windows 1255 ק - HEBREW LETTER QOF
E1 Windows 1252 á Windows 1255 ב - HEBREW LETTER BET
EC Windows 1252 ì Windows 1255 ל - HEBREW LETTER LAMED
20 Windows 1252   Windows 1255   - SPACE
20 Windows 1252   Windows 1255   - SPACE
E1 Windows 1252 á Windows 1255 ב - HEBREW LETTER BET
E6 Windows 1252 æ Windows 1255 ז - HEBREW LETTER ZAYIN
F7 Windows 1252 ÷ Windows 1255 ק - HEBREW LETTER QOF
20 Windows 1252   Windows 1255   - SPACE
E4 Windows 1252 ä Windows 1255 ה - HEBREW LETTER HE
E7 Windows 1252 ç Windows 1255 ח - HEBREW LETTER HET
E1 Windows 1252 á Windows 1255 ב - HEBREW LETTER BET
F8 Windows 1252 ø Windows 1255 ר - HEBREW LETTER RESH
E4 Windows 1252 ä Windows 1255 ה - HEBREW LETTER HE
20 Windows 1252   Windows 1255   - SPACE
E4 Windows 1252 ä Windows 1255 ה - HEBREW LETTER HE
E9 Windows 1252 é Windows 1255 י - HEBREW LETTER YOD
F9 Windows 1252 ù Windows 1255 ש - HEBREW LETTER SHIN
F8 Windows 1252 ø Windows 1255 ר - HEBREW LETTER RESH
E0 Windows 1252 à Windows 1255 א - HEBREW LETTER ALEF
EC Windows 1252 ì Windows 1255 ל - HEBREW LETTER LAMED
E9 Windows 1252 é Windows 1255 י - HEBREW LETTER YOD
FA Windows 1252 ú Windows 1255 ת - HEBREW LETTER TAV
20 Windows 1252   Windows 1255   - SPACE
20 Windows 1252   Windows 1255   - SPACE
E0 Windows 1252 à Windows 1255 א - HEBREW LETTER ALEF
E9 Windows 1252 é Windows 1255 י - HEBREW LETTER YOD
EF Windows 1252 ï Windows 1255 ן - HEBREW LETTER FINAL NUN

To convert that character set to UTF-8 you need to use a library that does this (e.g. iconv or mb_convert_encoding) or do it by yourself.
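For the "do it by yourself" route, the table above suggests a shortcut: bytes E0 through FA map, in order, to the 27 Hebrew letters U+05D0 through U+05EA, which are the UTF-8 sequences D7 90 through D7 AA. A rough sketch under that observation follows. The function name is made up for illustration, and it deliberately ignores everything in the code page except ASCII and those letters (vowel points, punctuation and so on), so iconv is likely the safer choice for real data.

```php
<?php
// Hypothetical helper: hand-rolled Windows-1255 to UTF-8 conversion
// covering only ASCII and the 27 Hebrew letters (bytes 0xE0..0xFA).
function cp1255_letters_to_utf8(string $bytes): string
{
    $out = '';
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($bytes[$i]);
        if ($b < 0x80) {
            // ASCII is identical in Windows-1255 and UTF-8.
            $out .= $bytes[$i];
        } elseif ($b >= 0xE0 && $b <= 0xFA) {
            // 0xE0..0xFA -> U+05D0..U+05EA, i.e. UTF-8 bytes D7 90..D7 AA.
            $out .= "\xD7" . chr(0x90 + $b - 0xE0);
        } else {
            // Anything else is outside the scope of this sketch.
            $out .= '?';
        }
    }
    return $out;
}

// First two words of the string from the question, as raw bytes.
echo cp1255_letters_to_utf8("\xEE\xE9\xE3\xF2 \xEC\xEE\xF7\xE1\xEC"), "\n";
```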

何必那么认真
#3 · 2019-09-20 11:48

The response is not ISO-8859-1 encoded but probably windows-1255 encoded; interpreted that way, the bytes are מידע למקבל בזק החברה הישראלית אין. So try converting from windows-1255 to utf-8.
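One way to check this is to decode the same raw bytes under both assumptions and compare. The snippet below is only an illustration: $bytes is a hand-typed sample (the last word of the string in the question), not the actual response.

```php
<?php
// Raw bytes of the last word in the question, as hex escapes (sample data).
$bytes = "\xE0\xE9\xEF";

// Decoded as ISO-8859-1, the bytes become accented Latin letters.
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $bytes);

// Decoded as Windows-1255, the same bytes become Hebrew letters.
$asHebrew = iconv('WINDOWS-1255', 'UTF-8', $bytes);

echo $asLatin1, "\n"; // the garbled form seen in the question
echo $asHebrew, "\n"; // the intended Hebrew text
```

This is also a plausible reason utf8_encode did not help: it assumes the input is ISO-8859-1, so it produces the first result.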

姐就是有狂的资本
#4 · 2019-09-20 11:54

After banging my head against the wall for a while, I decided to be straightforward about it and created a character mapping, which worked very easily. I couldn't find another solution.

Here's the code:

$lat = array('à','á','â','ã','ä','å','æ','ç','è','é','ê','ë','ì','í','î','ï','ð','ñ','ò','ó','ô','õ','ö','÷','ø','ù','ú');
$heb = array('א','ב','ג','ד','ה','ו','ז','ח','ט','י','ך','כ','ל','ם','מ','ן','נ','ס','ע','ף','פ','ץ','צ','ק','ר','ש','ת');
echo str_replace($lat, $heb, $response);

I also found these resources valuable:

http://orwell.ru/test/CP/_?cp1252
http://orwell.ru/test/CP/_?cp1255

and this one too:

http://kanjidict.stc.cx/recode.php
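Note that the lookup table only matches if $response already holds the accented characters as UTF-8 and the script file is itself saved as UTF-8. Under that same assumption, a round trip through iconv might undo the wrong decoding for most bytes rather than only the 27 letters. This is a sketch, not a tested replacement; $garbled is a made-up sample and the Windows-1252 step is a guess at how the text was mis-decoded.

```php
<?php
// Sample of already mis-decoded text: "àéï" stored as UTF-8 (hex escapes).
$garbled = "\xC3\xA0\xC3\xA9\xC3\xAF";

// Step 1: go back to the original single bytes by encoding as Windows-1252.
$raw = iconv('UTF-8', 'WINDOWS-1252', $garbled);

// Step 2: decode those bytes with the charset they were really written in.
$fixed = ($raw === false) ? false : iconv('WINDOWS-1255', 'UTF-8', $raw);

if ($fixed === false) {
    // Either step can fail on characters that do not exist in the target charset.
    $fixed = $garbled;
}
echo $fixed, "\n";
```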
来,给爷笑一个
#5 · 2019-09-20 12:02

Compile PHP with '--enable-zend-multibyte', then try this:

mb_convert_encoding("FOO","UTF-8","ISO-8859-1");

If you only need to convert.
