I have a problem with character encoding in Yii. If I create a new webapp:
./Yii-framework/framework/yiic webapp MyTest
Then I go to /protected/views/layouts/main.php and change the footer to text containing a UTF-8 character, for example
<div id="footer">
Cópyrîgth <br />
</div>
Refresh the page and everything is ok. Nice! ;)
But when I try to log in with a UTF-8 character in the username, for example ádmin, it crashes saying:
Error 500
htmlspecialchars(): Invalid multibyte sequence in argument
So I checked this article about unicode in yii
and then I went to /protected/config/main.php and added this line at the start:
header('Content-Type: text/html; charset=utf-8');
Retrying the same login, it now works (doesn't crash), but the footer is broken and shows:
C�pyr�ght
I've tried other combinations like explained in the "Unicode in yii" article but none of them make both things work at the same time.
Any ideas for solving this problem?
Note: I can't change the php.ini file.
I also tried the AddDefaultCharset UTF-8 option in an .htaccess file, which I put in the /MyTest/ folder. Is that the folder the article refers to as "your DocumentRoot"?
Thanks
I'm not at all familiar with Yii, but if you want to paste literal Unicode characters into a file, you need to make sure that your text editor saves the file using a Unicode encoding, such as UTF-8. Try UTF-8 without a BOM.
My experience is that text editors behave strangely when you change the encoding setting of a file that already contains encoded characters. Just start over with a fresh file, change the encoding, then paste the characters in.
First off, you need to understand that a character with a diacritic like ó or î (from your example) is not automatically a "utf-8 character". It is simply a character that has different encodings (if any) in different character sets, even in those character sets that have the basic single-byte ASCII part in common (i.e., the English alphabet, the digits, the most common punctuation, and a few more). You could call it a "problematic character", but not a "utf-8 character".
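As a small illustration of that point (written in Python only because it makes raw bytes easy to inspect, not because it has anything to do with Yii), the sketch below encodes the same character in two character sets. The variable names are made up for this example.

```python
# The same character maps to different bytes depending on the encoding chosen.
text = "\u00f3"  # the character ó

latin1_bytes = text.encode("iso-8859-1")  # one byte
utf8_bytes = text.encode("utf-8")         # two bytes

print(latin1_bytes.hex())  # f3
print(utf8_bytes.hex())    # c3b3

# Plain ASCII text, by contrast, is byte-for-byte identical in both encodings.
ascii_same = "Copyright".encode("iso-8859-1") == "Copyright".encode("utf-8")
print(ascii_same)  # True
```

So ó is only "UTF-8" once it has actually been stored as the bytes c3 b3.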
So, when you wrote your footer <div>, you did NOT write it UTF-8 encoded. Your editor saved those characters in a single-byte encoding, like ISO 8859-1 or one of its relatives.
Browsers normally automatically detect the encoding used in a page, if it is not specified. This is why you were initially able to see in the browser exactly what you had written in your editor.
Then you tried to log in with a "problematic character" in the username. The browser had interpreted your page as having a single-byte encoding, so this caused it to encode your form input the same way, and send it single-byte-encoded back to the server. The PHP code had not been written with this possibility in mind, apparently, because it did not correctly set the third parameter of htmlspecialchars(), which is "UTF-8" by default (starting from PHP 5.4.0; it was "ISO-8859-1" before). Since a single-byte encoded string with "problematic characters" almost never is a valid UTF-8 string (see my comment to your question, it's the second comment), htmlspecialchars() rejected it.
Then you correctly added the header('Content-Type: text/html; charset=utf-8'); call, which disabled the automatic charset detection by the browser. At this point it became evident that your file with the footer <div> was not UTF-8 encoded (see again my comment for the explanation of the question marks that appear instead of the "problematic characters").
So all you are left to do is convince your editor to save files UTF-8 encoded. As others have noted, saving the file in a different encoding does not work in all editors. Starting from a fresh file is sometimes the solution, maybe after having set the default encoding of your editor to UTF-8.
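Both symptoms described above (the rejected login and the broken footer) can be reproduced at the byte level. This is only a sketch in Python, not what PHP or Yii does internally; the helper is_valid_utf8 is a hypothetical name, and the strict decode merely stands in for the validity check that makes htmlspecialchars() fail.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# "ádmin" as a browser would send it after treating the page as ISO-8859-1.
login_bytes = "\u00e1dmin".encode("iso-8859-1")
print(is_valid_utf8(login_bytes))                   # False
print(is_valid_utf8("\u00e1dmin".encode("utf-8")))  # True

# A footer saved as ISO-8859-1 but announced to the browser as UTF-8:
# each invalid byte is shown as the replacement character U+FFFD.
footer_bytes = "C\u00f3pyr\u00eeght".encode("iso-8859-1")
shown = footer_bytes.decode("utf-8", errors="replace")
print(shown)  # C�pyr�ght
```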
To check the encoding, you can use the file command in a shell. For a correctly saved file, its output should mention UTF-8.
Or else, you could use the od -tx1z command, which dumps your file (maybe piped into less) as a sequence of hex bytes with the corresponding string on the side. If the file is single-byte encoded, your "problematic characters" will be single bytes >= 0x80. If it is UTF-8 encoded, they will be sequences of 2 bytes (others will be 3 or more bytes), all >= 0x80, while the "non-problematic characters" will continue to be single bytes < 0x80.
The article you mention seems to be well-written, just follow it.
You don't need the AddDefaultCharset directive in the .htaccess file, though, if all your pages are generated with the Content-Type: text/html; charset=utf-8 HTTP header, because the effect of the Apache directive is exactly the same (and it is good to keep the control on encoding inside PHP).
Adding the <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> tag has the same effect, for the browser, as the above HTTP header (note the http-equiv). The HTTP header is cleaner, but this additional meta tag may help in case a page is saved without the header's information.
Most importantly, don't be afraid of UTF-8, because it is your friend!
(...but, from the answer that got your bounty, I see that you, like many people, continue to think that understanding character encodings is too difficult for you ☹)
I had this problem too, specifically when I was trying to display UTF-8 text from the db. I changed all the collations and types in MySQL to utf8_bin, but still no love... then I tried to change all of my layouts and views with the meta tags etc... hell, I even looked at Japanese websites' source code and pasted that stuff in... NOTHING WORKED ... UNTIL... I came across THIS post: Yii And UTF8 Display, UTF8 works with mysqli but not yii backend. Turns out, you need to tweak a setting in main.php in the config file, under components.. f
The best way around this is to use HTML character entities (see http://www.utexas.edu/learn/html/spchar.html). In your case Cópyrîght would appear as
C&oacute;pyr&icirc;ght
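To show why entities sidestep the file-encoding question, here is a short Python sketch (Python is used purely for illustration, and the variable names are made up). Writing the text with character references leaves only ASCII bytes in the file, and a browser decodes the references back to the same characters.

```python
import html

text = "C\u00f3pyr\u00eeght"  # Cópyrîght

# Replace every non-ASCII character with a numeric character reference.
ascii_only = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_only)  # C&#243;pyr&#238;ght

# Named and numeric references decode back to the same text.
from_named = html.unescape("C&oacute;pyr&icirc;ght")
from_numeric = html.unescape(ascii_only)
print(from_named == text, from_numeric == text)  # True True
```

The trade-off is readability: the source becomes harder to edit by hand than a file that is simply saved as UTF-8.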
Also, I'd add <meta charset="utf-8"> to the HTML to make sure browsers are behaving themselves.
The above solutions seem to be the right way, since Yii doesn't really have a problem with Unicode, but you could also perform some additional checks: make sure the charset in the meta tag of your HTML page is set to utf-8, and instead of writing plain HTML use CHtml::encode() on the text so that Yii handles the encoding. For the username part, make sure the default charset in your database is also set to utf8.
First, you should remove the header call from the main.php file; it might create problems for you in the future.
Second, I would do what rambo coder suggested and make sure that your files are saved as UTF8 in your editor.