I updated my web app to use UTF-8 instead of ANSI.
I took the following measures to set the charset:
mysql_set_charset("utf8"); // PHP
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> // HTML
utf8_general_ci // In MySQL
I also edited the CKEditor config to stop it from converting characters to HTML entities, because I need the actual character (i.e. é and not &eacute;) for MySQL fulltext search.
config.entities = false;
config.entities_latin = false;
In the database (phpMyAdmin view) and in normal text field output (HTML, <input> or <textarea>), everything looks fine (I see é, not &eacute;, not Ã©, yay).
However, CKEditor has some trouble with the encoding. See the attached image for the same field taken from the database, displayed in a textarea, then in a textarea replaced by CKEditor:
This seems to be in the CKEditor JavaScript code (probably a fixed charset), but I can't find it in the config. Again, since the é displays correctly in normal HTML (real UTF-8 é, not &eacute; nor Ã©), I'm quite sure it's not the PHP/MySQL query that's wrong (but I might be mistaken).
EDIT: This seems like a symptom of applying htmlentities, which assumed Latin-1 (ISO-8859-1) input by default before PHP 5.4, to UTF-8 text. The fix would be either to use htmlspecialchars or to specify the charset ("utf-8"), but I don't know where to modify that in CKEditor.
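To illustrate the symptom described in the EDIT, here is a minimal Node.js sketch of what happens when the UTF-8 bytes of é are read as Latin-1. The helper names readUtf8AsLatin1 and repairLatin1Mojibake are hypothetical and not part of PHP or CKEditor. The sketch only mimics the byte-level misreading and is not the actual htmlentities code path.

```javascript
// Hypothetical helper: reinterpret the UTF-8 bytes of a string as Latin-1.
// This is roughly what a function that assumes Latin-1 input "sees"
// when it is handed UTF-8 text.
function readUtf8AsLatin1(text) {
  return Buffer.from(text, "utf8").toString("latin1");
}

// Hypothetical helper: undo the misreading by turning each character
// back into a single byte and decoding the bytes as UTF-8.
// This only works if no bytes were lost or altered along the way.
function repairLatin1Mojibake(garbled) {
  return Buffer.from(garbled, "latin1").toString("utf8");
}

const garbled = readUtf8AsLatin1("\u00e9"); // "\u00e9" is é
console.log(garbled);                       // Ã©
console.log(repairLatin1Mojibake(garbled)); // é
```

The two-character result is the usual sign that UTF-8 text passed through a step that assumed a single-byte charset, so the place to look is wherever that charset is left at its default.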
You can also run this on your database connection:
$connection->query("SET NAMES 'utf8'");
And remember to set the database and/or table collation to utf8. I prefer utf8_general_ci.
It was my approach that was wrong, not CKEditor's. I was looking in the wrong file and missed the UTF-8 encoding on an htmlspecialchars call.
This thread seems a bit dated, but I'm answering it to help anyone looking for a response.
To allow CKEditor to process the character é as é and not &eacute;, set the entities_latin config option to false. Or, you may just want to set the entity options to false altogether.
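The settings mentioned in this thread could be applied along the lines of the sketch below. The option names entities and entities_latin come from the question. The function name disableEntityEncoding is hypothetical, and the wiring shown in the trailing comment assumes a CKEditor 4 style config.js.

```javascript
// Hypothetical helper: switch off entity conversion on a CKEditor config
// object, so that é is kept as é instead of being written as an entity.
function disableEntityEncoding(config) {
  config.entities = false;       // do not convert characters to named entities
  config.entities_latin = false; // do not convert Latin characters such as é
  return config;
}

// Assumed usage in a CKEditor 4 config.js (not run here):
//   CKEDITOR.editorConfig = function (config) {
//     disableEntityEncoding(config);
//   };
```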