Getting special characters out of a MySQL database

Posted 2020-02-11 04:52

Question:

I have a table that includes special characters such as ™.

This character can be entered and viewed using phpMyAdmin and other software, but when I use a SELECT statement in PHP to output to a browser, I get the diamond with question mark in it.

The table type is MyISAM. The encoding is UTF-8 Unicode. The collation is utf8_unicode_ci.

The first line of the HTML head is:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I tried using the htmlentities() function on the string before outputting it. No luck.

I also tried adding this to the PHP script before any output (no difference):

header('Content-type: text/html; charset=utf-8');

Lastly I tried adding this right below the initial mysql connection (this resulted in additional odd characters being displayed):

$db_charset = mysql_set_charset('utf8',$db);

What have I missed?

Answer 1:

The code below works for me.

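// Set the connection charset before running the query.
mysql_set_charset('utf8');
$sql = "SELECT * FROM chartest";
$rs = mysql_query($sql);
// Send the charset header before any output is echoed.
header('Content-Type: text/html; charset=utf-8');
while ($row = mysql_fetch_array($rs)) {
    echo $row['name'];
}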


Answer 2:

There are a couple of things that might help. First, even though you're setting the charset to UTF-8 in the header, that might not be enough; I've seen browsers ignore it before. Try forcing it by adding this to the head of your HTML:

<meta charset='utf-8'>

Next, as mentioned here, try doing this:

mysql_query("SET character_set_client = 'utf8'");
mysql_query("SET character_set_results = 'utf8'");
mysql_query("SET collation_connection = 'utf8_general_ci'");

EDIT

So I've just done some reading up and playing around a bit. First, despite what I mentioned in the comments, utf8_encode() and utf8_decode() will not help you here: they only convert between ISO-8859-1 and UTF-8, and ISO-8859-1 has no '™' character. It helps to actually understand UTF-8 encoding; I found the Wikipedia page on UTF-8 very helpful. Assuming the value you get back from the database is in fact already UTF-8 encoded, and you simply dump it out right after getting it, it should be fine.

If you do anything with the database result (especially manipulating the string in any way) and you don't use the Unicode-aware functions from the PHP mbstring library, you will probably mess it up, since the standard PHP string functions work on bytes and are not Unicode-aware.
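To make the byte vs. character difference concrete, here is a minimal sketch. It assumes the mbstring extension is loaded; the sample string and variable names are made up for illustration, and the '™' is written as escaped bytes so the result does not depend on how the file is saved.

```php
<?php
// Hypothetical sample value: "Acme" followed by the UTF-8 bytes of "™" (U+2122).
$name = "Acme\xE2\x84\xA2";

// strlen() counts bytes: 4 ASCII bytes + 3 bytes for "™".
$byteLength = strlen($name);
// mb_strlen() counts characters when told the encoding.
$charLength = mb_strlen($name, 'UTF-8');

// substr() cuts at a byte offset, so it can split "™" in the middle
// and leave an invalid sequence behind.
$byteCut = substr($name, 0, 5);
// mb_substr() cuts on character boundaries.
$charCut = mb_substr($name, 0, 5, 'UTF-8');

var_dump($byteLength);                           // int(7)
var_dump($charLength);                           // int(5)
var_dump(mb_check_encoding($byteCut, 'UTF-8'));  // bool(false)
var_dump($charCut === $name);                    // bool(true)
```

A string cut like $byteCut is exactly the kind of thing a browser renders as the diamond with a question mark.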

Once you understand how UTF-8 encoding works you can do something cool like this:

$test = "™"; // the source file itself must be saved as UTF-8
for ($i = 0; $i < strlen($test); $i++) {
    // print each byte of the string in binary
    printf("%b ", ord($test[$i]));
}

Which dumps out something like this:

11100010 10000100 10100010

That's a properly encoded UTF-8 '™' character (bytes E2 84 A2 in hex). If the data retrieved from the database does not contain that byte sequence, then something is messed up.

To check, try searching for a special character that you know is in the result using mb_strpos(), passing the encoding explicitly:

var_dump(mb_strpos($db_result, '™', 0, 'UTF-8'));

If that returns anything other than false (note that int(0) is a valid position, so compare with !== false), then the data from the database is fine. Otherwise we can at least establish that it's a problem between PHP and the database.
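If the check comes back false, it can help to look at the raw bytes to see which kind of breakage you have. The following is only a rough sketch: describeEncoding() is a hypothetical helper, not something from PHP or from this thread, and the three sample byte strings are my own guesses at the likely cases (correct UTF-8, a single Windows-1252 byte, and UTF-8 that was converted to UTF-8 a second time).

```php
<?php
// Hypothetical diagnostic helper; requires the mbstring extension.
function describeEncoding($value)
{
    $tm = "\xE2\x84\xA2"; // UTF-8 bytes of "™" (U+2122)

    if (!mb_check_encoding($value, 'UTF-8')) {
        // Browsers show invalid sequences as the replacement character.
        return 'not valid UTF-8: ' . bin2hex($value);
    }
    if (strpos($value, $tm) !== false) {
        return 'valid UTF-8, trademark sign found: ' . bin2hex($value);
    }
    return 'valid UTF-8, no trademark sign: ' . bin2hex($value);
}

// Correctly encoded "™".
echo describeEncoding("\xE2\x84\xA2"), "\n";
// "™" as the single byte 0x99 from Windows-1252.
echo describeEncoding("\x99"), "\n";
// "™" encoded to UTF-8 twice: valid UTF-8, but displays as several odd characters.
echo describeEncoding("\xC3\xA2\xE2\x80\x9E\xC2\xA2"), "\n";
```

Calling describeEncoding($row['name']) on a value fetched from the table should tell you whether the connection is handing PHP single-byte data (second case) or whether the stored data itself may have been double-encoded (third case).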



Answer 3:

You need to execute the following query first, before any SELECT:

mysql_query("SET NAMES utf8");
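For context: SET NAMES sets character_set_client, character_set_connection and character_set_results in one statement, so it covers the same ground as the separate SET queries in Answer 2. Note also that the mysql_* functions used throughout this thread were removed in PHP 7. Below is a minimal sketch of the same idea with mysqli; openUtf8Connection() is a hypothetical helper and all of its arguments are placeholders you would replace with your own values.

```php
<?php
// Hypothetical helper, assuming the mysqli extension is available.
function openUtf8Connection($host, $user, $password, $database, $charset = 'utf8')
{
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_error) {
        throw new RuntimeException('Connect failed: ' . $db->connect_error);
    }
    // set_charset() changes the connection charset on the server, like
    // SET NAMES, and also informs the client library of the charset.
    if (!$db->set_charset($charset)) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
    return $db;
}
```

You would call it once right after connecting, for example openUtf8Connection('localhost', 'user', 'secret', 'mydb'), and send the Content-Type header with charset=utf-8 before echoing any rows. Passing 'utf8mb4' as the last argument is an option if the data may contain characters outside the Basic Multilingual Plane.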