CURL import character encoding problem

I'm using cURL to import some code. However, in French, all the accented and special characters come out garbled. For example: BonjourÂ ...

I don't have access to change anything in the imported code. Is there anything I can do on my side to fix this?

Thanks

Tags: php encoding curl

5 answers


#2 · 2019-01-12 04:56

I had a similar problem. I tried to loop through all combinations of input and output charsets. Nothing helped! :(

However, I was able to access the code that actually fetched the data, and that is where the culprit lay. The data was fetched via cURL. Adding

 curl_setopt($ch,CURLOPT_BINARYTRANSFER,true);

fixed it.

A handy snippet to try out every combination from a list of charsets:

// $text_2_convert must already hold the garbled text you want to test.
$charsets = array(
    "UTF-8",
    "ASCII",
    "Windows-1252",
    "ISO-8859-15",
    "ISO-8859-1",
    "ISO-8859-6",
    "CP1256"
);

foreach ($charsets as $ch1) {
    foreach ($charsets as $ch2) {
        // @ hides the notices iconv() raises for bytes that are invalid in $ch1.
        echo "<h1>Combination $ch1 to $ch2 produces: </h1>" . @iconv($ch1, $ch2, $text_2_convert);
    }
}


【Aperson】

#3 · 2019-01-12 05:01

You could replace your

$data = curl_exec($ch);

with

$data = utf8_decode(curl_exec($ch));

I had this same issue and it worked well for me.
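Note that utf8_decode() is deprecated as of PHP 8.2. As a minimal sketch of the same conversion with mb_convert_encoding(), assuming the mbstring extension is available, the remote content is UTF-8, and your own page is served as ISO-8859-1 (the sample string only stands in for the cURL response):

```php
<?php
// Stand-in for the body returned by curl_exec(): "Bonjour" followed by a
// UTF-8 non-breaking space (bytes C2 A0) and an "é" (bytes C3 A9).
$data = "Bonjour\xC2\xA0\xC3\xA9";

// Convert from UTF-8 to ISO-8859-1, which is what utf8_decode() does.
$latin1 = mb_convert_encoding($data, 'ISO-8859-1', 'UTF-8');

// Each two-byte UTF-8 sequence is now a single ISO-8859-1 byte.
echo bin2hex($latin1), "\n";
```

This only helps if the page that displays the text really is ISO-8859-1; if it is UTF-8, the data should be left as it is.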


萌系小妹纸

#4 · 2019-01-12 05:06

PHP seems to use UTF-8 by default, so I found that the following works:

$text = iconv("UTF-8","Windows-1252",$text);


叼着烟拽天下

#5 · 2019-01-12 05:07

As Jon Skeet pointed out, it's difficult to understand your situation. However, if you only have access to the final text, you can try using iconv to change the text encoding.

For example:

$text = iconv("Windows-1252","UTF-8",$text);

I had a similar issue some time ago (with Italian and its special characters) and solved it this way.

Try different combinations (UTF-8, ISO-8859-1, Windows-1252).
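To illustrate why the direction of the conversion matters, here is a small sketch that reproduces the "BonjourÂ " symptom from the question. It assumes the remote text is UTF-8 and contains a non-breaking space, which is only a guess about the original data:

```php
<?php
// "Bonjour" followed by a non-breaking space encoded in UTF-8 (bytes C2 A0).
$utf8 = "Bonjour\xC2\xA0";

// Treating those UTF-8 bytes as Windows-1252 and re-encoding them as UTF-8
// reproduces the garbling: byte C2 becomes "Â", byte A0 stays a no-break space.
$garbled = iconv('Windows-1252', 'UTF-8', $utf8);

// Converting in the opposite direction undoes the damage.
$repaired = iconv('UTF-8', 'Windows-1252', $garbled);

var_dump($repaired === $utf8);
```

If the text you receive looks like $garbled, the UTF-8 to Windows-1252 direction is the one to try; if it is plain Windows-1252 and your page is UTF-8, use the direction shown in the answer above.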


Animai°情兽

#6 · 2019-01-12 05:17

I'm currently dealing with a similar problem: I'm trying to write a simple HTML <title> importer via cURL. Here is what I've done so far:

1. Retrieve the HTML via cURL.
2. Check the response headers for any hint of the encoding via curl_getinfo() and match it with a regex.
3. Parse the HTML to look at the content-type meta tag and the <title> tag (yes, I know the consequences).
4. Compare the content type from the header with the one from the meta tag and choose the meta one if they differ, because we know no one cares about their httpd configuration and there are a lot of dirty workarounds relying on it.
5. iconv() the string.
6. Wish every day that when someone does not follow the standards, $DEITY punishes him/her until the end of days, because it would save me the meta parsing.
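The steps above can be sketched roughly as follows. This is not my actual importer: the three helper functions are hypothetical names made up for illustration, the regexes are deliberately naive, and falling back to UTF-8 when nothing is declared is an assumption. In a real request the header value would come from curl_getinfo($ch, CURLINFO_CONTENT_TYPE); here a canned response is used instead.

```php
<?php
// Hypothetical helper: extract the charset from a Content-Type value such as
// "text/html; charset=ISO-8859-1". Returns null when none is declared.
function charset_from_content_type(?string $contentType): ?string
{
    if ($contentType !== null
        && preg_match('/charset\s*=\s*["\']?([\w.:-]+)/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: look for a charset declared inside a <meta> tag.
function charset_from_meta(string $html): ?string
{
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: return the page <title> converted to UTF-8, or null.
function title_as_utf8(string $html, ?string $contentType): ?string
{
    // Prefer the meta declaration over the HTTP header (step 4);
    // fall back to UTF-8 when neither is present (an assumption).
    $charset = charset_from_meta($html)
        ?? charset_from_content_type($contentType)
        ?? 'UTF-8';

    if (!preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return null;
    }
    $title = trim($m[1]);
    if ($charset === 'UTF-8') {
        return $title;
    }
    $converted = iconv($charset, 'UTF-8', $title);
    return $converted === false ? null : $converted;
}

// Canned response: the header claims UTF-8 but the meta tag says ISO-8859-1.
$html = "<html><head><meta http-equiv=\"Content-Type\" "
      . "content=\"text/html; charset=ISO-8859-1\">"
      . "<title>Caf\xE9</title></head></html>";
echo title_as_utf8($html, 'text/html; charset=UTF-8'), "\n";
```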
