How do I get rid of unrecognized characters in UTF-8?

Posted 2019-07-11 06:08

I have a MySQL database that's set to UTF-8. I have set my PHP header to: header("Content-Type: text/html; charset=utf-8"); and in my HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

When I return anything that has round quotes or apostrophes, they show up as unrecognized characters (black diamond with a ? inside).

If I run utf8_encode() on the string I'm echoing out, it looks fine in Chrome but shows a different weird character in Firefox. Is there something else I can do site-wide to make this work better?

(I've accessed the database with Sequel Pro and phpMyAdmin.)

4 answers
走好不送
#2 · 2019-07-11 06:19

How exactly are you getting these "round quotes and apostrophes"? If their ultimate source is a Word or Outlook document, they will be encoded in Windows-1252. If you copy and paste directly from a Word document into a UTF-8 Web page, the UTF-8 version of the clipboard should be used, and these characters come over as multibyte UTF-8 characters. If they went through other files or non-UTF-8 Web pages first, they may have remained in Word's single-byte "Smart Quote" encoding, which is invalid in UTF-8 (hence the ?-in-black-diamond glyph). Note that Web pages claiming to be Latin-1 (ISO-8859-1) are frequently rendered as Windows-1252, because 1) the control codes 0x80-0x9F that the Smart Quotes overlay are very rarely used, and 2) it is so common for Smart Quotes to be mixed in with text.

For a UTF-8 page that gives quotes and apostrophes as "invalid characters", tell the browser to use Windows-1252 encoding instead for the page (View > Character Encoding or something similar). If these characters show up correctly now, untranslated Smart Quotes were the problem. Unfortunately, once they're in the database, only manual editing will fix them.
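The same check can be scripted rather than done by toggling the browser's encoding. Below is a minimal PHP sketch, assuming the mbstring extension is loaded. The function name diagnoseSmartQuotes and the sample bytes are made up for illustration. It reports whether a string is valid UTF-8 and, if not, what it looks like when the bytes are read as Windows-1252.

```php
<?php
// Hypothetical helper (not part of PHP): classify a string the way the
// browser test above does. Requires the mbstring extension.
function diagnoseSmartQuotes(string $raw): array
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        // Already valid UTF-8; nothing to reinterpret.
        return ['valid_utf8' => true, 'as_utf8' => $raw];
    }
    // Assumption: the bytes are Windows-1252. Reinterpret them as such and
    // re-encode as UTF-8 so single-byte Smart Quotes become multibyte ones.
    return [
        'valid_utf8' => false,
        'as_utf8'    => mb_convert_encoding($raw, 'UTF-8', 'Windows-1252'),
    ];
}

// 0x93 and 0x94 are the Windows-1252 curly double quotes.
$result = diagnoseSmartQuotes("\x93Hello\x94");
echo $result['valid_utf8'] ? "valid\n" : "invalid, converted: {$result['as_utf8']}\n";
```

If the converted text reads correctly, the stored bytes are most likely Windows-1252. The sketch only diagnoses; it does not write anything back to the database.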

smile是对你的礼貌
#3 · 2019-07-11 06:26

Have you tried using htmlentities? I know that this doesn't affect the character encoding, but it might get rid of the black square with the question mark. It often does for me...

$output = htmlentities($db_output);
echo $output;
甜甜的少女心
#4 · 2019-07-11 06:35

Make sure the connection between PHP and MySQL also uses UTF-8. Otherwise, the data will be converted to the connection's character set on the way through.

See mysql_client_encoding() and mysql_set_charset().
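
Those two functions belong to the old mysql extension, which was removed in PHP 7. A minimal sketch of the same idea with mysqli follows. The function name connectUtf8 is invented, the connection parameters are placeholders, and utf8mb4 as the default is my assumption (pass 'utf8' if that is what your tables use).

```php
<?php
// Hypothetical helper: open a mysqli connection and force its character set.
// All connection parameters are placeholders supplied by the caller.
function connectUtf8(string $host, string $user, string $pass, string $db, string $charset = 'utf8mb4'): mysqli
{
    // On PHP 8.1+ a failed connect throws mysqli_sql_exception by default;
    // the explicit checks below cover older versions that return false.
    $link = mysqli_connect($host, $user, $pass, $db);
    if (!$link) {
        throw new RuntimeException('Connect failed: ' . mysqli_connect_error());
    }
    // mysqli counterpart of mysql_set_charset()
    if (!mysqli_set_charset($link, $charset)) {
        throw new RuntimeException('Could not set charset: ' . mysqli_error($link));
    }
    return $link;
}

// Usage (placeholder credentials):
// $link = connectUtf8('localhost', 'user', 'secret', 'mydb');
// echo mysqli_character_set_name($link); // counterpart of mysql_client_encoding()
```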

乱世女痞
#5 · 2019-07-11 06:40

Full UTF-8 settings:

1) .htaccess

AddDefaultCharset utf-8
php_value default_charset utf-8

2) in PHP, call this right after mysqli_connect():

mysqli_query($this->link, 'SET character_set_client="utf8",character_set_connection="utf8",character_set_results="utf8"; ');

3) your database should be created with the utf8 character set and a utf8 collation (for example utf8_general_ci); every text column in your tables should use the same

4) your PHP files should also be saved in UTF-8
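
Point 4 is easy to get wrong silently, since an editor can save a file as Latin-1 or prepend a byte-order mark without saying so. Here is a rough PHP sketch for checking a file, assuming mbstring is loaded. The function name checkFileEncoding is invented for this example.

```php
<?php
// Hypothetical helper: report whether a file is valid UTF-8 and whether it
// starts with a UTF-8 byte-order mark (EF BB BF). A BOM in front of <?php is
// sent to the browser as output and can get in the way of header() calls.
function checkFileEncoding(string $path): array
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return [
        'valid_utf8' => mb_check_encoding($bytes, 'UTF-8'),
        'has_bom'    => strncmp($bytes, "\xEF\xBB\xBF", 3) === 0,
    ];
}

// Check the script itself.
print_r(checkFileEncoding(__FILE__));
```

A file that comes back with valid_utf8 set to false was probably saved in a single-byte encoding and needs to be re-saved as UTF-8 in the editor.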
