I'm facing a strange problem in one of my JSF pages (a Facelet). I'm using RichFaces, and on one page I have a normal form
<h:form></h:form>
My problem is that when I submit the form, all non-ASCII characters, like German umlauts (äöü), are received garbled. If I switch the page encoding to ISO-8859-1 in my browser, it works.
If I extend the form with the acceptcharset attribute
<h:form id="register-form" acceptcharset="ISO-8859-1">
it works too, but only for German umlauts. Other non-ASCII characters are still garbled into something unreadable.
Could anyone give me a hand with this?
Put the XML declaration (the <?xml ... ?> line, with its encoding set to UTF-8) at the top of your pages, and it should work fine.
Also:
in your template (or again, in every page, if not using a template)
I'm currently working on a UTF-8 project, and I haven't set UTF-8 anywhere except at the top of each jsp/xhtml.
I can't recall exactly what happens behind the scenes, but I think this line (<?xml) tells Facelets which encoding to use. This line is not sent to the browser.
P.S. The above is tested under MyFaces only (shouldn't matter, but still..)
You need to set the POST request encoding by HttpServletRequest#setCharacterEncoding(). The best place for this is a Filter which is mapped on the desired url-pattern. To get world domination you of course want to use UTF-8 all the time. The doFilter() method would basically call request.setCharacterEncoding("UTF-8") and then continue the filter chain.
This is however not the only thing which you need to take into account with regard to character encoding. For more background information and other (detailed) solutions for a Java EE web application, you may find this article useful as well: Unicode - How to get the characters right?
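As a container-free illustration of why the request encoding matters, here is a minimal sketch that does not use the servlet API at all. It assumes Java 10 or later (for the Charset overload of URLDecoder.decode), and the class and method names (FormDecodingDemo, decodeParam) are made up for the example. URLDecoder only stands in for the parameter parsing that the container does after the request encoding has been decided.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FormDecodingDemo {
    // Hypothetical helper: decodes one percent-encoded form value with the
    // given charset, roughly what a container does with POST parameters.
    static String decodeParam(String rawValue, Charset requestEncoding) {
        return URLDecoder.decode(rawValue, requestEncoding);
    }

    public static void main(String[] args) {
        // The three umlauts (U+00E4, U+00F6, U+00FC) as a browser submits
        // them from a page served as UTF-8: two bytes per character.
        String raw = "%C3%A4%C3%B6%C3%BC";

        String right = decodeParam(raw, StandardCharsets.UTF_8);
        String wrong = decodeParam(raw, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("\u00e4\u00f6\u00fc")); // true
        System.out.println(right.length()); // 3
        System.out.println(wrong.length()); // 6, every byte became its own character
    }
}
```

Decoding the same bytes as ISO-8859-1 gives six characters instead of three, which is the typical garbled look of umlauts when the request encoding is not set to UTF-8.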
Update: as per the comments:
Then the problem is more likely in the tool which you use to store or display the characters. How did you find out that the characters were garbled? In the logging statements? If so, does it use UTF-8? Or is it in the log file viewer/console? If so, does it use UTF-8? Or is it in the database table? If so, does it use UTF-8? Or is it in the database admin tool? If so, does it use UTF-8? Or is it in the result page? If so, does it use UTF-8? Etcetera. Go through the solutions section of the aforementioned link to see how to get them all right.
This is the correct behavior. UTF-8 means that non-ASCII characters (anything at code point 128 or above) are encoded with two or more bytes.
But your JSF framework should decode the data into Unicode strings before your code can see it. So my guess is that you don't specify the encoding of the page or the form, and therefore your framework can only guess what it gets. Always set acceptcharset to utf-8 and the encoding of the whole HTML page to the same (using the meta tag). Then it should work.
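To make the byte counts concrete, here is a small sketch using only the standard library. The class and method names (Utf8WidthDemo, utf8Length) are invented for the example, and the strings use \u escapes so the source file's own encoding cannot interfere.

```java
import java.nio.charset.StandardCharsets;

class Utf8WidthDemo {
    // Number of bytes UTF-8 needs to encode the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));            // 1: ASCII stays one byte
        System.out.println(utf8Length("\u00e4"));       // 2: a with umlaut, U+00E4
        System.out.println(utf8Length("\u20ac"));       // 3: euro sign, U+20AC
        System.out.println(utf8Length("\uD83D\uDE00")); // 4: U+1F600, outside the BMP
    }
}
```

So a single non-ASCII character arriving as several bytes is expected. It only turns into garbage when those bytes are later decoded with a different charset.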
Links: Tips for JSF character encoding
How about
Not really meant as a fix, but if that makes all characters work, then it suggests that your real problem is that the page which contains the form is declared as US-ASCII. Browsers usually send form submissions in the encoding of the page unless acceptcharset says otherwise.
But it's hard to diagnose encoding problems in webapps because there are so many potential failure points where encodings are involved. It is especially hard when your understanding of encodings is as spotty as your terminology ("UTF-8 characters") indicates. I suggest you first read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Once you've read that, take a look at the HTML source of the form page and the HTTP headers of that page and the form request to see what encodings are being used. You should then be able to figure out where things are going wrong.
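When you compare those headers, the part to look for is the charset parameter of Content-Type. A rough sketch of pulling it out of a header value you copied from your browser's network tab, using only the standard library (the class and method names are invented, and the parsing is deliberately simplistic):

```java
import java.util.Locale;

class ContentTypeCharset {
    // Hypothetical helper: returns the charset parameter of a Content-Type
    // header value, or null when the header does not declare one.
    static String charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                // Drop the parameter name and any optional quotes.
                return p.substring("charset=".length()).replace("\"", "").trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8")); // UTF-8
        System.out.println(charsetOf("text/html"));                // null
    }
}
```

A null result for the page response or for the form request is a hint that somebody downstream has to guess the encoding, and that is usually where things go wrong.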