I know there are hundreds of questions about UTF-8 woes, but I tried all the approaches I could find and none of them helped.
The facts: I'm trying to read a string that contains a é from my MySQL database and display it on a PHP page. Actually, it does display as é (but the font does not recognize it as such and thus another default font is used). The troubles arose when I wanted to convert this string to a filename using PHP functions for string replacement. PHP does not recognize this as the é character at all.
Here's a quick rundown of what I'm doing:
1) The String is stored in a MySQL database. The MySQL server settings are:
MySQL connection collation utf8_unicode_ci
MySQL charset: UTF-8 Unicode (utf8)
The database itself is set to collation utf8_unicode_ci (MyISAM storage engine, not changeable due to shared server)
The actual table is set to collation utf8_unicode_ci (InnoDB storage engine)
The é shows up correctly in phpMyAdmin. The data is inserted into the DB via a Java program but I have also tried this with manually entered data (entered in phpMyAdmin).
2) The PHP default_charset is not set (NO VALUE). I'm on a shared server, and placing a manual override php.ini did not seem to work. Using ini_set("default_charset", 'utf-8'); works but has no effect on the problem I have.
3) Before I run the actual select query I query SET NAMES 'utf8'. The query itself is irrelevant, but for testing I chose a simple SELECT title FROM items WHERE item_id = 1
4) The PHP file itself is encoded UTF-8. I have set the correct charset for the html with <meta http-equiv="content-type" content="text/html; charset=utf-8" />
5) To test the problem I used htmlentities on the returned string (Astérix). Checking the source code, it is converted to Ast&Atilde;&copy;rix, which is not correct of course. Accordingly, the string shows up as AstÃ©rix in the browser.
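One way htmlentities can produce a mangled result like the one in step 5 is by decoding the UTF-8 bytes as ISO-8859-1, which was the function's default charset before PHP 5.4. Below is a minimal sketch of that effect, assuming the string really does arrive as UTF-8 bytes; the variable names are made up for illustration.

```php
<?php
// "Astérix" written with explicit UTF-8 bytes (é = 0xC3 0xA9), so the
// example does not depend on how this file itself is saved.
$title = "Ast\xC3\xA9rix";

// Treating the bytes as ISO-8859-1 turns each of the two bytes into its own entity.
$wrong = htmlentities($title, ENT_QUOTES, 'ISO-8859-1');

// Declaring the real encoding yields a single entity for é.
$right = htmlentities($title, ENT_QUOTES, 'UTF-8');

echo $wrong, "\n"; // Ast&Atilde;&copy;rix
echo $right, "\n"; // Ast&eacute;rix
```

Passing the encoding explicitly avoids relying on default_charset, which the shared server here leaves unset.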
What possible reason could there be for this? To me it seems like I set everything that can be set to UTF-8.
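One possibility, which nothing in the question confirms, is that the stored é is not the single code point U+00E9 but an "e" followed by a combining accent (U+0301). That could explain both the font fallback and a string replacement that never matches. The sketch below shows how the raw bytes could be inspected with bin2hex and how a replacement map might cover both forms; the map and the sample strings are hypothetical.

```php
<?php
// Two strings that both render as "é" but are different byte sequences.
$precomposed = "\xC3\xA9";  // U+00E9, a single code point
$decomposed  = "e\xCC\x81"; // "e" followed by U+0301 COMBINING ACUTE ACCENT

// bin2hex shows what PHP actually received, independent of fonts or browsers.
echo bin2hex($precomposed), "\n"; // c3a9
echo bin2hex($decomposed), "\n";  // 65cc81

// A replacement map that only lists the precomposed form misses the other one.
$map = ["\xC3\xA9" => 'e'];
$a = strtr("Ast" . $precomposed . "rix", $map); // "Asterix"
$b = strtr("Ast" . $decomposed . "rix", $map);  // unchanged, still contains 0xCC 0x81

// Hypothetical fix for filenames: also strip the combining accent.
$map["\xCC\x81"] = '';
$c = strtr("Ast" . $decomposed . "rix", $map);  // "Asterix"
```

Running bin2hex on the value returned by the SELECT would show which of the two forms is stored in the database.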
http://php.net/manual/en/ref.mbstring.php - look at multibyte string functions.
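As a rough illustration of why the multibyte functions matter here, the sketch below compares a byte-oriented function with its mbstring counterpart on the same UTF-8 string. It assumes the mbstring extension may or may not be installed on the shared server, hence the guard.

```php
<?php
$title = "Ast\xC3\xA9rix"; // UTF-8 bytes for "Astérix"

// Byte-oriented functions count the two bytes of é separately.
$bytes = strlen($title); // 8

// The mbstring functions work on characters once told the encoding.
// mbstring is an optional extension, so check for it first.
if (function_exists('mb_strlen')) {
    $chars = mb_strlen($title, 'UTF-8');         // 7
    $valid = mb_check_encoding($title, 'UTF-8'); // true for well-formed UTF-8
}
```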