I have a MySQL database that's set to UTF-8.
I have set my PHP header to: header("Content-Type: text/html; charset=utf-8");
and in my HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
When I output anything that contains round quotes or apostrophes, they show up as unrecognized characters (a black diamond with a ? inside).
If I run utf8_encode() on the string I'm echoing, it looks fine in Chrome but shows a different odd character in Firefox. Is there something else I can do site-wide to fix this?
(I've accessed the DB with Sequel Pro and phpMyAdmin.)
Full UTF-8 settings:
1) .htaccess
AddDefaultCharset utf-8
php_value default_charset utf-8
2) After mysqli_connect() in PHP, run this:
mysqli_query($this->link, 'SET character_set_client="utf8", character_set_connection="utf8", character_set_results="utf8"');
3) Your database should be created with the utf8 character set and a utf8 collation (for example utf8_general_ci); every text column in your tables should use the same.
4) Your PHP files should also be saved in UTF-8.
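Make sure the connection between PHP and MySQL is using UTF-8 as well. Otherwise, the data will be converted to the connection's character set on the way through.
See mysql_client_encoding and mysql_set_charset (or, if you use mysqli, mysqli_character_set_name and mysqli_set_charset).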
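As a rough sketch of that advice using mysqli: the helper name connect_utf8, its parameters, and the credentials in the usage comment are made up for illustration; only the mysqli_* calls are real. It assumes you want the same "utf8" charset mentioned elsewhere in this thread.

```php
<?php
// Hypothetical helper (not from the posts above): opens a mysqli connection
// and switches it to the requested character set before any query runs.
function connect_utf8(string $host, string $user, string $pass, string $db, string $charset = 'utf8'): mysqli
{
    $link = mysqli_connect($host, $user, $pass, $db);
    if ($link === false) {
        throw new RuntimeException('Connection failed: ' . mysqli_connect_error());
    }

    // Sets the connection charset and also tells the client library about it,
    // which matters for mysqli_real_escape_string().
    if (!mysqli_set_charset($link, $charset)) {
        throw new RuntimeException('Could not set charset: ' . mysqli_error($link));
    }

    return $link;
}

// Usage (placeholder credentials, adjust to your setup):
// $link = connect_utf8('localhost', 'user', 'secret', 'mydb');
// echo mysqli_character_set_name($link); // shows what the connection actually uses
```

Checking mysqli_character_set_name() right after connecting is a quick way to see whether the connection, rather than the tables, is where the conversion happens.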
Have you tried using htmlentities?
I know that this doesn't affect the character encoding,
but it might get rid of the black diamond with the question mark.
It often does for me:
$output = htmlentities($db_output);
echo $output;
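One caveat, sketched below: if you tell htmlentities the text is UTF-8 and it contains invalid bytes, you get an empty string back unless you pass ENT_SUBSTITUTE or ENT_IGNORE. The wrapper name escape_for_html is made up for this example; the flags and the encoding argument are standard htmlentities parameters.

```php
<?php
// Hypothetical wrapper: escape text for HTML output, stating the encoding
// explicitly instead of relying on the PHP version's default.
function escape_for_html(string $text): string
{
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of
    // making htmlentities() return an empty string.
    return htmlentities($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Valid UTF-8 curly quotes become named entities.
echo escape_for_html("\u{201C}Hi\u{201D}"), "\n";

// Raw Windows-1252 quote bytes are not valid UTF-8; without ENT_SUBSTITUTE
// the whole string would be lost.
var_dump(htmlentities("\x93Hi\x94", ENT_QUOTES, 'UTF-8') === '');
```

So this can hide the symptom for valid text, but it won't repair bytes that were stored in the wrong encoding.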
How exactly are you getting these "round quotes and apostrophes"? If their ultimate source is a Word or Outlook document, they will be encoded in Windows-1252. If you copy and paste directly from a Word document into a UTF-8 Web page, the Unicode version of the clipboard text should be used, and these characters come over as multibyte UTF-8 characters. If they went through other files or non-UTF-8 Web pages first, they may have stayed in Word's single-byte "Smart Quote" encoding. Those bytes are invalid in UTF-8, hence the ?-in-black-diamond glyph. Note that Web pages claiming to be Latin-1 (ISO-8859-1) are frequently rendered as Windows-1252, because 1) the control codes 0x80-0x9F that Smart Quotes overlay are very rarely used, and 2) it's so common for Smart Quotes to be mixed in with text.
To test this on a UTF-8 page that shows quotes and apostrophes as "invalid characters", tell the browser to use Windows-1252 encoding for the page instead (View > Character Encoding or something similar). If the characters now show up correctly, untranslated Smart Quotes were the problem. Unfortunately, once they're in the database, only manual editing will fix them.
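To catch this before the text reaches the database, something along these lines might help. It is only a sketch: the function name to_utf8 is invented here, and it assumes that any input that isn't valid UTF-8 is Windows-1252, which you'd have to confirm for your own data sources.

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting from Windows-1252
// when the bytes are not already valid UTF-8 (an assumption about the source).
function to_utf8(string $text): string
{
    // An empty pattern with the /u modifier only matches valid UTF-8 subjects.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }

    // iconv() returns false if a byte has no Windows-1252 meaning.
    $converted = @iconv('Windows-1252', 'UTF-8', $text);
    if ($converted === false) {
        throw new InvalidArgumentException('Input is neither valid UTF-8 nor Windows-1252');
    }

    return $converted;
}

// Smart Quotes as single Windows-1252 bytes (0x93, 0x94, 0x92).
$raw = "\x93Hello\x94 it\x92s";
echo to_utf8($raw), "\n";
```

This also hints at why utf8_encode() gave odd results: it treats its input as ISO-8859-1, where 0x80-0x9F are control codes rather than Smart Quotes.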