I store a JSON string that contains some (Chinese?) characters in a MySQL database.
Example of what's in the database:
normal.text.\u8bf1\u60d1.rest.of.text
On my PHP page I just call json_decode on what I receive from MySQL, but it doesn't display correctly; it shows things like "½±è§�"
I've tried executing the "SET NAMES 'utf8'" query at the beginning of my file, but it didn't change anything.
I already have the following header on my webpage:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
And of course all my PHP files are encoded in UTF-8.
Do you have any idea how to display these "\uXXXX" characters nicely?
Unicode is not UTF-8!
$ echo -en '\x8b\xf1\x60\xd1\x00\n' | iconv -f unicodebig -t utf-8
诱惑
This is a strange "encoding" you have. I guess each character of the normal text is one byte long (US-ASCII)? Then you have to extract the \u.... sequences, convert each sequence into a two-byte character, and convert that character to a UTF-8 character with iconv("unicodebig", "utf-8", $character)
(see iconv in the PHP documentation). This worked on my side:
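$in = 'normal.text.\u8bf1\u60d1.rest.of.text';

// Convert one matched \uXXXX escape to its UTF-8 byte sequence.
function ewchar_to_utf8($matches) {
    $ewchar = $matches[1];            // the four hex digits
    $binwchar = hexdec($ewchar);      // the 16-bit value as an integer
    // Pack the value as two bytes, high byte first (big-endian).
    $wchar = chr(($binwchar >> 8) & 0xFF) . chr($binwchar & 0xFF);
    return iconv("unicodebig", "utf-8", $wchar);
}

// Replace every \uXXXX escape in $str with the UTF-8 character it names.
function special_unicode_to_utf8($str) {
    return preg_replace_callback('/\\\\u([[:xdigit:]]{4})/i', "ewchar_to_utf8", $str);
}

echo special_unicode_to_utf8($in);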
Otherwise we need more information on how your string in the database is encoded.
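As an alternative to the regex approach above, here is a minimal sketch that lets json_decode handle the \uXXXX escapes by wrapping the fragment in double quotes so it parses as a JSON string. The helper name decode_json_escapes is hypothetical, and the sketch assumes the fragment contains no raw double quotes or control characters; if decoding fails, it returns the input unchanged.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
// Assumes $fragment contains no raw double quotes or control characters,
// otherwise the wrapped text is not a valid JSON string.
function decode_json_escapes($fragment) {
    $decoded = json_decode('"' . $fragment . '"');
    // json_decode returns null on invalid input; fall back to the original.
    return is_string($decoded) ? $decoded : $fragment;
}

// Single quotes keep the backslashes literal, as they would be in the database.
$in = 'normal.text.\u8bf1\u60d1.rest.of.text';
echo decode_json_escapes($in), "\n";
```

If the stored value is already a complete JSON document, calling json_decode on it directly should be enough and this wrapper is not needed.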
This seems to work fine for me, with PHP 5.3.5 on Ubuntu 11.04:
<?php
header('Content-Type: text/plain; charset="UTF-8"');
$json = '[ "normal.text.\u8bf1\u60d1.rest.of.text" ]';
$decoded = json_decode($json, true);
var_dump($decoded);
Outputs this:
array(1) {
  [0]=>
  string(31) "normal.text.诱惑.rest.of.text"
}
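To check whether the decoding step or the display step is at fault, it can help to look at the raw bytes json_decode produces. The sketch below is only an illustration: the helper name decoded_bytes_hex is hypothetical, and it assumes the JSON is an array whose first element is the string of interest.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
// Returns the hex dump of the first decoded string, or null if decoding fails.
function decoded_bytes_hex($json) {
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded[0])) {
        return null;
    }
    return bin2hex($decoded[0]);
}

// U+8BF1 and U+60D1 encode to three UTF-8 bytes each.
echo decoded_bytes_hex('[ "\u8bf1\u60d1" ]'), "\n"; // e8afb1e68391
```

If the hex dump shows these UTF-8 bytes but the page still shows garbage, the decoding is fine and the problem most likely lies in how the output charset is declared to the browser.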
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
That's a red herring. If you serve your page over HTTP and the response contains a Content-Type header that names a charset, that charset takes precedence over the meta tag. By default, PHP sends such a header if you don't set one explicitly, and depending on the PHP version and the default_charset setting, it may declare ISO-8859-1 rather than UTF-8.
Try adding this line before any output is sent:
<?php
header("Content-Type: text/html; charset=UTF-8");