Currently my site supports English, Portuguese, Swedish and Polish. But for some reason some Polish characters don't display correctly, e.g. Zal�z konto
It should look like this: Zalóz konto
I have this:
// Send the Content-type header in case the web server is setup to send something else
header('Content-type: text/html; charset=utf-8');
and inside <head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
You also need to convert your strings to UTF-8.
utf8_encode() assumes its input is ISO-8859-1 and does not check what encoding your string is actually in, so it sometimes gives you a garbled string. I wrote a function called Encoding::toUTF8() to do this properly.
You don't need to know what the encoding of your strings is. It can be Latin1 (ISO 8859-1), Windows-1252 or UTF-8, or the string can contain a mix of them. Encoding::toUTF8() will convert everything to UTF-8.
I wrote it because a service was giving me a data feed that was all messed up, mixing those encodings in the same string.
Usage:
$utf8_string = Encoding::toUTF8($mixed_string);
$latin1_string = Encoding::toLatin1($mixed_string);
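To illustrate why checking before converting matters, here is a minimal sketch using PHP's mbstring extension. The function name to_utf8_if_needed() and the ISO-8859-1 fallback are assumptions for this example; it is not the Encoding class itself, and unlike Encoding::toUTF8() it treats the string as a whole and does not handle mixed encodings within one string.

```php
<?php
// Hypothetical helper (not part of the Encoding class described above).
// Converts a string to UTF-8 only when it is not already valid UTF-8.
// Requires the mbstring extension.
function to_utf8_if_needed(string $s, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already valid UTF-8, leave it untouched
    }
    // Assumption: anything that is not valid UTF-8 is in $fallback.
    return mb_convert_encoding($s, 'UTF-8', $fallback);
}

$latin1 = "Zal\xF3z konto";     // "ó" as a single ISO-8859-1 byte
$utf8   = "Zal\xC3\xB3z konto"; // "ó" as two UTF-8 bytes

var_dump(to_utf8_if_needed($latin1) === $utf8); // bool(true)
var_dump(to_utf8_if_needed($utf8) === $utf8);   // bool(true)
```

The second call shows the point: a string that is already UTF-8 passes through unchanged instead of being encoded a second time.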
I've included another function, Encoding::fixUTF8(), which will fix any UTF-8 string that looks garbled as a result of having been encoded into UTF-8 multiple times.
Usage:
$utf8_string = Encoding::fixUTF8($garbled_utf8_string);
Examples:
echo Encoding::fixUTF8("Fédération Camerounaise de Football");
echo Encoding::fixUTF8("Fédération Camerounaise de Football");
echo Encoding::fixUTF8("FÃÂédÃÂération Camerounaise de Football");
echo Encoding::fixUTF8("Fédération Camerounaise de Football");
will output:
Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football
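For readers wondering how a string ends up encoded into UTF-8 more than once, the sketch below reproduces one extra layer and then removes it again. It uses mbstring, and the helper names add_utf8_layer() and undo_utf8_layer() are made up for this example. This is only an illustration of the problem, not how Encoding::fixUTF8() is implemented.

```php
<?php
// Hypothetical helpers for illustration only. Requires the mbstring extension.

// Treats the bytes of $s as ISO-8859-1 and encodes them to UTF-8.
// Applied to a string that is already UTF-8, this double-encodes it.
function add_utf8_layer(string $s): string
{
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

// Reverses exactly one such layer.
function undo_utf8_layer(string $s): string
{
    return mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
}

$clean  = "F\xC3\xA9d\xC3\xA9ration"; // "Fédération" in UTF-8, 12 bytes
$double = add_utf8_layer($clean);     // each "é" now takes 4 bytes

var_dump(strlen($clean));                      // int(12)
var_dump(strlen($double));                     // int(16)
var_dump(undo_utf8_layer($double) === $clean); // bool(true)
```

Undoing a layer like this is only safe when you know the string was double-encoded, which is exactly the guesswork a function such as fixUTF8() is meant to take off your hands.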
Download:
http://dl.dropbox.com/u/186012/PHP/forceUTF8.zip
If you retrieve the data from a MySQL database with PHP, you should run this query before doing anything else:
mysql_query("SET NAMES utf8");
That way, data received from the database will be properly encoded, provided it was stored properly in the first place.
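The mysql_* functions were removed in PHP 7, so on current PHP versions the same effect is usually achieved by setting the charset on the connection itself. A minimal sketch with PDO follows; the helper name mysql_dsn(), the database name and the credentials are placeholders invented for this example.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that sets the connection
// charset, which replaces a separate "SET NAMES" query.
function mysql_dsn(string $host, string $db, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $db, $charset);
}

echo mysql_dsn('localhost', 'mysite'), "\n";
// mysql:host=localhost;dbname=mysite;charset=utf8

// Usage, assuming placeholder credentials and a reachable MySQL server:
// $pdo = new PDO(mysql_dsn('localhost', 'mysite'), 'user', 'password');
```

With mysqli, the equivalent call is $mysqli->set_charset('utf8') right after connecting.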
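Alternatively, you can use the ISO-8859-1 standard. Note that ISO-8859-1 contains ó but not other Polish letters such as ł or ż (those are in ISO-8859-2), so this only helps if your text stays within that character set: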
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
I start my header with:
<head>
<meta http-equiv="Content-Language" content="pl">
<meta charset="UTF-8">
...
</head>
...
and everything works fine.