I am in the process of fixing some bad UTF8 encoding. I am currently using PHP 5 and MySQL.
In my database I have a few instances of bad encodings that print like: î
- The database collation is utf8_general_ci
- PHP is using a proper UTF8 header
- Notepad++ is set to use UTF8 without BOM
- database management is handled in phpMyAdmin
- not all cases of accented characters are broken
What I need is some sort of function that will help me map the instances of î, ÃÂ, ü and others like it to their proper accented UTF8 characters.
If you apply utf8_encode() to a string that is already UTF-8, it comes out garbled, and more so each time it is encoded again. I made a function, toUTF8(), that converts strings into UTF-8. You don't need to specify what the encoding of your strings is. It can be Latin1 (ISO 8859-1), Windows-1252 or UTF8, or a mix of these three.
I used this myself on a feed with mixed encodings in the same string.
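To see why repeated encoding garbles text, here is a minimal sketch that applies the same Latin-1 to UTF-8 conversion as utf8_encode() to a string that is already UTF-8. It uses mb_convert_encoding(), which needs the mbstring extension; the helper name latin1_to_utf8() is made up for this illustration.

```php
<?php
// Hypothetical helper: the same conversion utf8_encode() performs,
// i.e. every input byte is treated as one ISO-8859-1 character.
function latin1_to_utf8($bytes)
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$correct = "\xC3\xAE";                // "î" as valid UTF-8 (2 bytes)
$double  = latin1_to_utf8($correct);  // encoded a second time (4 bytes)
$triple  = latin1_to_utf8($double);   // encoded a third time (8 bytes)

echo bin2hex($correct), "\n"; // c3ae
echo bin2hex($double), "\n";  // c383c2ae
echo bin2hex($triple), "\n";  // c383c283c382c2ae
```

Each pass turns every non-ASCII byte into two bytes, which is why the garbage grows with every extra encoding.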
Usage:
My other function, fixUTF8(), fixes garbled UTF8 strings if they were encoded into UTF8 multiple times.
Usage:
Examples:
will output:
Download:
https://github.com/neitanod/forceutf8
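The idea behind a repair like fixUTF8() can be sketched as follows. This is not the library's implementation, only a simplified illustration under stated assumptions: the hypothetical helper undo_extra_encoding() peels off one Latin-1 layer at a time for as long as the result is still valid UTF-8, it relies on the mbstring extension, and it does not handle text that was garbled through Windows-1252.

```php
<?php
// Hypothetical helper, not part of forceutf8: undo repeated
// Latin-1 -> UTF-8 encoding by decoding one layer at a time.
function undo_extra_encoding($text, $maxLayers = 5)
{
    for ($i = 0; $i < $maxLayers; $i++) {
        // Stop on invalid UTF-8, or if a character above U+00FF is present:
        // such text cannot be the output of a Latin-1 -> UTF-8 pass.
        if (!mb_check_encoding($text, 'UTF-8')
            || !preg_match('/^[\x{00}-\x{FF}]*$/u', $text)) {
            break;
        }
        // Equivalent of utf8_decode(): each character becomes a single byte.
        $decoded = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
        // Keep the decoded form only if it changed and is itself valid UTF-8.
        if ($decoded === $text || !mb_check_encoding($decoded, 'UTF-8')) {
            break;
        }
        $text = $decoded;
    }
    return $text;
}

echo bin2hex(undo_extra_encoding("\xC3\x83\xC2\xAE")), "\n"; // c3ae
echo bin2hex(undo_extra_encoding("\xC3\xAE")), "\n";         // c3ae (unchanged)
```

A string that legitimately contains a sequence such as "Ã" followed by "®" would also be rewritten, so a check like this is best applied only to values already known to be broken.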
I found a solution after days of searching. My comment is going to be buried, but anyway...
I get the corrupted data with PHP.
I don't use SET NAMES utf8.
I use utf8_decode() on my data.
I update my database with my new decoded data, still not using SET NAMES utf8,
and voilà :)
If you have double-encoded UTF8 characters (various smart quotes, dashes, apostrophe ’, quotation mark “, etc.), in MySQL you can dump the data and then read it back in to fix the broken encoding.
Like this:
This was a 100% fix for my double encoded UTF-8.
Source: http://blog.hno3.org/2010/04/22/fixing-double-encoded-utf-8-data-in-mysql/
As Dan pointed out: you need to convert them to binary and then convert/correct the encoding.
E.g., for utf8 stored as latin1 the following SQL will fix it:
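One common form of that binary round trip, sketched in PHP so it can be applied to one known-broken row at a time. The assumptions here are mine: the column is declared utf8 but holds double-encoded text, the table has a primary key named id, and the helper names and the articles/title identifiers are hypothetical. Try it on a backup first.

```php
<?php
// Hypothetical helper: build the repair statement for one column.
// Identifiers cannot be bound as parameters, so they are validated instead.
function build_repair_sql($table, $column)
{
    foreach (array($table, $column) as $identifier) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $identifier)) {
            throw new InvalidArgumentException("Unsafe identifier: $identifier");
        }
    }
    // utf8 text -> latin1 bytes -> raw binary -> reinterpreted as utf8.
    return "UPDATE `$table` SET `$column` = "
         . "CONVERT(CAST(CONVERT(`$column` USING latin1) AS BINARY) USING utf8) "
         . "WHERE `id` = :id";
}

// Repair a single row; $pdo is an already open PDO connection to MySQL.
function repair_row(PDO $pdo, $table, $column, $id)
{
    $statement = $pdo->prepare(build_repair_sql($table, $column));
    return $statement->execute(array(':id' => $id));
}

echo build_repair_sql('articles', 'title'), "\n";
```

Restricting the update to a single id matters because rows that are already correct would be damaged by the same conversion.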
I had the same problem a long time ago, and I fixed it using
Another thing to check, which happened to be my solution (found here), is how data is being returned from your server. In my application, I'm using PDO to connect from PHP to MySQL. I needed to add a flag to the connection telling it to return the data in UTF-8 format.
The answer was
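A minimal sketch of such a connection, with hypothetical helper names and placeholder host, database and credential values. The charset part of the DSN is honored by pdo_mysql from PHP 5.3.6 on; the SET NAMES init command is included as a fallback for older versions.

```php
<?php
// Hypothetical helper: DSN that asks the MySQL driver for a utf8 connection.
function build_utf8_dsn($host, $database)
{
    return "mysql:host=$host;dbname=$database;charset=utf8";
}

// Open the connection; requires the pdo_mysql extension.
function connect_utf8($host, $database, $user, $password)
{
    $options = array(
        // Fallback for PHP versions that ignore charset in the DSN.
        PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8',
        // Report connection and query errors as exceptions.
        PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
    );
    return new PDO(build_utf8_dsn($host, $database), $user, $password, $options);
}

echo build_utf8_dsn('localhost', 'mydb'), "\n"; // mysql:host=localhost;dbname=mydb;charset=utf8
```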