I have a web application in JSP running on JBoss Application Server, and I use Servlets for friendly URLs. Search parameters are passed through my JSPs and Servlets, starting from a form with a text box. The first Servlet uses request.getParameter() to get the text and sends it to another Servlet with response.sendRedirect() (masking the URL as something "friendly"). This final Servlet uses request.getRequestDispatcher().forward() to send the parameters to the JSP in the "ugly" way: searchResults.jsp?searchParameters=Parameters.
Now, when the Search Results page is displayed, the URL shows the correct search term in its "friendly" form, for example http://site.com/search/My-Search-Query, even when the term contains special characters, as in http://site.com/search/Busqué-tildes-y-eñies. But when I try to use that search term in my JSP, the special characters are not displayed correctly.
The whole system uses i18n, and we've had no problems with special characters so far. But when the information is sent through the form (say from index.jsp to searchResults.jsp) special characters are not correctly displayed:
á - á
é - é
í - Ã
ó - ó
ú - ú
ñ - ñ
The whole code base is supposed to be in UTF-8, but apparently I'm missing something when passing the parameters. As I said, they are correctly displayed in the URL, but not inside the JSP.
I was thinking of converting those á sequences manually, but I guess there's a better way to do it, using the correct encoding. Besides, there may be new characters later that I'm not aware of right now (French, Spanish, etc.)
Just in case, I'll let you know I have these lines on each JSP:
<?xml version="1.0" encoding="UTF-8" ?>
<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
EDIT
Thanks for your answers. I tried a few things, but nothing has fixed the problem.
Here's what I've done:
I added a ServletRequestListener which sets the session's character encoding to UTF-8, and a Filter for every Http request, which does the same.
As I said, everything in the JSPs is encoded with UTF-8 (see headers in question).
I printed the Servlets' character encoding to the console, which was null by default, and set it to UTF-8 as @kgiannakakis and @saua suggested.
None of these actions fixes the problem. I'm wondering if there's something else wrong with this...
Try to set URIEncoding in {jboss.server}/deploy/jboss-web.deployer/server.xml.
Ex:
<Connector port="8080" address="${jboss.bind.address}"
maxThreads="250" maxHttpHeaderSize="8192"
emptySessionPath="true" protocol="HTTP/1.1"
enableLookups="false" redirectPort="8443" acceptCount="100"
connectionTimeout="20000" disableUploadTimeout="true" URIEncoding="UTF-8" />
Just a wild guess. Try this inside your JSP/Servlet:
if (request.getCharacterEncoding() == null) {
    request.setCharacterEncoding("UTF-8");
}
You need to be sure that the correct encoding is passed to your servlet.
response.setCharacterEncoding("UTF-8");
The problem is that the data sent by the browser usually doesn't carry a well-defined encoding: browsers generally omit the charset from the request's Content-Type header, so the server has to guess.
Luckily most browsers will use the encoding of the page that contains the form. So if you use UTF-8 in all your pages, then most browsers will send all data in UTF-8 encoding as well (and your examples show that that's exactly how it is sent).
Unfortunately the most common Java application servers don't really handle the case (can't blame them, it's mostly guesswork anyway).
You can tell your application server to treat any input as UTF-8, by calling
request.setCharacterEncoding("UTF-8");
Depending on your coding style and the frameworks you use, it might be too late by the time the control flow reaches your code, so it may be better to do that in a javax.servlet.Filter.
Check the Connector settings in your Tomcat config. There is an option (URIEncoding) you can set to treat URIs as UTF-8. By default they are treated as ISO-8859-1.
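To see why that setting matters, here is a minimal sketch that imitates the decoding step with the JDK's java.net.URLDecoder. It is not the container's actual code; the class name UriDecodingDemo and the helper decode are made up for illustration. The percent-encoded string is what a browser would typically send for the search URL in the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: imitates how a container might decode a URI
// under two different charset settings.
class UriDecodingDemo {

    // Decode percent-escapes, interpreting the resulting bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 percent-encoding of e-acute is %C3%A9, of n-tilde is %C3%B1.
        String fromBrowser = "Busqu%C3%A9-tildes-y-e%C3%B1ies";

        // Decoded as UTF-8: each two-byte sequence becomes one character.
        String asUtf8 = decode(fromBrowser, "UTF-8");

        // Decoded as ISO-8859-1: each byte becomes its own character,
        // which yields the garbled two-character pairs from the question.
        String asLatin1 = decode(fromBrowser, "ISO-8859-1");

        System.out.println(asUtf8.length());    // 21
        System.out.println(asLatin1.length());  // 23
    }
}
```

The two extra characters in the second result are the stray bytes that show up as garbage in the page.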
We had a similar problem. It was solved once all JSPs were saved with the UTF-8 BOM.
First off, I have no idea how to solve this, since I don't know much about Java and JSP.
Having said that: the characters on the right-hand side of your table are the UTF-8 encoding of the left-hand side.
That is, somewhere in your code, you're interpreting bytes as Latin-1 (or whatever your default encoding is), where they actually represent UTF-8 encoded characters...
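The effect can be reproduced in a few lines of plain Java. This is only a sketch of the mechanism, not a recommended fix: the class name MojibakeDemo and both helper methods are invented for illustration, and the characters are written as Unicode escapes so the example does not depend on the source file's own encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: reproduces the garbling shown in the question's table.
class MojibakeDemo {

    // Encode text as UTF-8, then wrongly decode those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: recover the raw bytes and decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e1";          // a-acute
        String garbled = misdecode(original); // U+00C3 followed by U+00A1

        System.out.println(garbled.length());                  // 2
        System.out.println(repair(garbled).equals(original));  // true
    }
}
```

The round trip works because ISO-8859-1 maps every byte to exactly one character, so no information is lost. Repairing strings after the fact is still a workaround; the cleaner fix is to make the server decode the request as UTF-8 in the first place.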
I think the problem might be that the browser does not specify that the form post is UTF-8. There is a lot to read about form posts and encodings on the web, and several web frameworks provide character encoding filters to 'fix' this issue, much like the fix you had in mind. See for example http://static.springframework.org/spring/docs/2.5.x/api/org/springframework/web/filter/CharacterEncodingFilter.html
Do you use RequestDumper? If it is configured in deploy/jboss-web.deployer/server.xml then try to remove it and then test your encoding.
There are three layers to configure. From what you've described, it sounds like your problem lies in the database configuration.
- Browser Display and Form Submission
JSP
<%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%>
HTML
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
- Web Server Processing
JSP
<%
request.setCharacterEncoding("UTF-8");
String name = request.getParameter("NAME");
%>
The same applies in a Servlet. See the JBoss-specific solution as well as the complete server-independent solution in this answer.
- Database Settings
You may be losing the character information at the database level. Check to make sure your database encoding is UTF-8 also, and not ASCII.
For a complete discussion of this topic, refer to the Java article Character Conversions from Browser to Database.