chinese chars - PHP encoding

Posted 2019-05-22 15:20

Question:

I am trying to extract Chinese words from a website.

I am using simple cURL code:

$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($curl);

echo $response;

The expected result for one of the words is:

网络频率

However, I get this:

ÍøÂçƵÂÊ

Also, if I URL-encode the word, the result is different.

I have been having encoding problems lately. Are Chinese characters UTF-8, or something else? Could anyone help me get the characters to display normally with echo, so that URL-encoding them gives the same result as copying them from the website?

Cheers

Answer 1:

Chinese is usually UTF-8, yes. The problem you're having is probably not that the data isn't received correctly (cURL knows what it's doing), but that you're not sending it correctly to the browser.

Try this on top of your page:

header('Content-Type: text/html; charset=utf-8');

This will tell the browser that you are sending UTF-8 information.

Update: if this doesn't work, it could be that PHP itself isn't handling the characters properly. Try experimenting with utf8_encode and utf8_decode (which convert between ISO-8859-1 and UTF-8) in your echo. If that doesn't work, the response is probably not UTF-8 to begin with, which means you'll have to look for the charset in the Content-Type header of the response and convert the body accordingly.
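As a rough illustration of converting according to the response's charset, here is a minimal sketch. It assumes the site serves GB2312/GBK, which is a guess based on the garbled output in the question: those characters are exactly the GBK bytes of 网络频率 shown as Latin-1. The helper name to_utf8 and its GBK fallback are hypothetical and not part of the original code. It also assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (not from the original post): convert a fetched body
// to UTF-8 using the charset named in a Content-Type value, if any.
function to_utf8(string $body, ?string $contentType, string $fallback = 'GBK'): string
{
    $charset = $fallback; // assumption: fall back to GBK when nothing is declared
    if ($contentType !== null && preg_match('/charset=([\w-]+)/i', $contentType, $m)) {
        $charset = $m[1];
    }
    // GB2312 is covered by GBK, so decode both as GBK.
    if (strcasecmp($charset, 'GB2312') === 0) {
        $charset = 'GBK';
    }
    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $body; // already UTF-8, nothing to convert
    }
    return mb_convert_encoding($body, 'UTF-8', $charset);
}

// The GBK bytes behind the garbled output in the question.
$raw  = "\xCD\xF8\xC2\xE7\xC6\xB5\xC2\xCA";
$text = to_utf8($raw, 'text/html; charset=gb2312');

echo $text, "\n";            // 网络频率
echo urlencode($raw), "\n";  // %CD%F8%C2%E7%C6%B5%C2%CA
echo urlencode($text), "\n"; // %E7%BD%91%E7%BB%9C%E9%A2%91%E7%8E%87
```

With cURL, the Content-Type value could come from curl_getinfo($curl, CURLINFO_CONTENT_TYPE) after curl_exec. Many pages declare their charset only in a meta tag, so the fallback may need adjusting per site. The two urlencode lines show why the encoded word differs: URL encoding works on bytes, so the result depends on which encoding the string is in.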



Answer 2:

Try this:

1) Create a new document and make sure it is saved in a UTF-8 compatible encoding.

2) Use a meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

3) I wouldn't recommend forcing the header to UTF-8; simply use ini_set instead:

ini_set('default_charset', 'UTF-8');

If you are calling the cURL function from a different page, make sure that page can handle UTF-8 characters and passes them on to a UTF-8 compatible page.
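One way to check that a string really is UTF-8 before handing it to a UTF-8 page is sketched below. The is_utf8 helper is a hypothetical name introduced for illustration, and the sketch assumes the mbstring extension is available.

```php
<?php
// Set PHP's default output charset, as suggested in step 3.
ini_set('default_charset', 'UTF-8');

// Hypothetical guard: true only if the bytes form valid UTF-8.
function is_utf8(string $s): bool
{
    return mb_check_encoding($s, 'UTF-8');
}

$utf8 = "\xE7\xBD\x91\xE7\xBB\x9C"; // 网络 encoded as UTF-8
$gbk  = "\xCD\xF8\xC2\xE7";         // 网络 encoded as GBK

var_dump(is_utf8($utf8)); // bool(true)
var_dump(is_utf8($gbk));  // bool(false)
```

A string that fails this check would need converting before it is echoed, otherwise the browser is told UTF-8 but receives bytes in another encoding.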



Tags: php encoding cjk