Character encoding JSP -displayed wrong in JSP but

Posted 2019-01-11 19:56

Question:

I have a web application in JSP running on JBoss Application Server. I am using Servlets for friendly URLs. I'm sending search parameters through my JSPs and Servlets. I am using a form with a text box, the Servlet

The first Servlet uses request.getParameter() to get the text, and sends it to another Servlet with response.sendRedirect (masking the URL to something "friendly"). This final Servlet uses request.getRequestDispatcher().forward() to send the parameters to the JSP in the "ugly" way: searchResults.jsp?searchParameters=Parameters.

Now, when the Search Results page is displayed, the URL displays the correct search term with "friendly url". Example: http://site.com/search/My-Search-Query even when using special characters like: http://site.com/search/Busqué-tildes-y-eñies. But when I try to use that search term in my JSP, the special characters are not displayed correctly.

The whole system uses i18n, and we've had no problems with special characters so far. But when the information is sent through the form (say from index.jsp to searchResults.jsp) special characters are not correctly displayed:

á - á
é - é
í - Ã
ó - ó
ú - ú
ñ - ñ

The whole code base is supposed to be in UTF-8, but apparently I'm missing something when passing the parameters. As I said, they are correctly displayed in the URL, but not inside the JSP.

I was thinking of converting those á manually, but I guess there's a better way to do it correctly, using the correct encoding. Besides, there can be new characters later which I may not be aware of right now (French, Spanish, etc.)

Just in case, I'll let you know I have these lines on each JSP:

<?xml version="1.0" encoding="UTF-8" ?>
<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

EDIT

Thanks for your answers. I tried a few things, but nothing has fixed the problem.

Here's what I've done:

  • I added a ServletRequestListener which sets the session's character encoding to UTF-8, and a Filter for every Http request, which does the same.

  • As I said, everything in the JSPs is encoded with UTF-8 (see headers in question).

  • I printed the Servlets' character encoding to the console; it was null by default, so I set it to UTF-8 as @kgiannakakis and @saua suggested.

None of these actions fixes the problem. I'm wondering if there's something else wrong with this...

Answer 1:

Try to set URIEncoding in {jboss.server}/deploy/jboss-web.deployer/server.xml.

Ex:

<Connector port="8080" address="${jboss.bind.address}"    
     maxThreads="250" maxHttpHeaderSize="8192"
     emptySessionPath="true" protocol="HTTP/1.1"
     enableLookups="false" redirectPort="8443" acceptCount="100"
     connectionTimeout="20000" disableUploadTimeout="true" URIEncoding="UTF-8" />


Answer 2:

Just a wild guess. Try this inside your JSP/Servlet:

if(request.getCharacterEncoding() == null) {
   request.setCharacterEncoding("UTF-8");
}

You need to be sure that the correct encoding is passed to your servlet.



Answer 3:

response.setCharacterEncoding("UTF-8");



Answer 4:

The problem is that the information sent by the browser hasn't got a well-defined encoding and there's no way in HTTP to specify it.

Luckily most browsers will use the encoding of the page that contains the form. So if you use UTF-8 in all your pages, then most browsers will send all data in UTF-8 encoding as well (and your examples show that that's exactly how it is sent).

Unfortunately the most common Java application servers don't really handle the case (can't blame them, it's mostly guesswork anyway).

You can tell your application server to treat any input as UTF-8, by calling

request.setCharacterEncoding("UTF-8");

Depending on your coding style and the frameworks you use, it might be too late by the time the control flow reaches your code, so you may need to do that in a javax.servlet.Filter.
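To illustrate why the timing matters, here is a minimal sketch using a toy model, not the real servlet API. The class LazyRequestModel and its methods are hypothetical stand-ins: it assumes the container decodes parameters lazily on first access, falls back to ISO-8859-1 when no encoding has been set, and ignores an encoding that is set after parsing.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Toy stand-in for a servlet request. Hypothetical; NOT javax.servlet. */
class LazyRequestModel {
    private final String rawBody;   // form body as sent, e.g. "q=Busqu%C3%A9"
    private String encoding;        // stays null until someone sets it
    private String cachedValue;     // filled in on the first getParameter()

    public LazyRequestModel(String rawBody) {
        this.rawBody = rawBody;
    }

    /** Only takes effect while the parameters are still unparsed. */
    public void setCharacterEncoding(String enc) {
        if (cachedValue == null) {
            encoding = enc;
        }
    }

    /** Decodes the single parameter value once, then returns the cached result. */
    public String getParameter() {
        if (cachedValue == null) {
            // Assumed container default when no encoding was declared.
            String enc = (encoding != null) ? encoding : "ISO-8859-1";
            String value = rawBody.substring(rawBody.indexOf('=') + 1);
            try {
                cachedValue = URLDecoder.decode(value, enc);
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e);
            }
        }
        return cachedValue;
    }

    public static void main(String[] args) {
        LazyRequestModel early = new LazyRequestModel("q=Busqu%C3%A9");
        early.setCharacterEncoding("UTF-8");   // set before the first read
        System.out.println(early.getParameter().length()); // 6

        LazyRequestModel late = new LazyRequestModel("q=Busqu%C3%A9");
        late.getParameter();                   // something read it first
        late.setCharacterEncoding("UTF-8");    // too late, value is cached
        System.out.println(late.getParameter().length());  // 7
    }
}
```

In a real application the equivalent call is request.setCharacterEncoding("UTF-8"), made in a Filter that runs before any code reads a parameter.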



Answer 5:

Check out the connector setting in your Tomcat config. There is an option (URIEncoding) you can set to treat URIs as UTF-8. By default they are treated as ISO-8859-1.
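As a rough sketch of what that setting changes, the snippet below percent-encodes the search term from the question as UTF-8 and then decodes the same bytes under both charsets. It uses only java.net.URLEncoder and java.net.URLDecoder; the class name UriWireDemo and its helper methods are made up for the illustration, and it only approximates what the connector does internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UriWireDemo {
    /** Percent-encodes text as UTF-8, roughly what the browser puts in the URL. */
    public static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decodes percent-escapes, interpreting the bytes in the given charset. */
    public static String decode(String wire, String charsetName) {
        try {
            return URLDecoder.decode(wire, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "Busque-tildes-y-enies" with e-acute and n-tilde, written as escapes
        // so the source file itself stays plain ASCII.
        String term = "Busqu\u00e9-tildes-y-e\u00f1ies";
        String wire = encode(term);
        System.out.println(wire); // Busqu%C3%A9-tildes-y-e%C3%B1ies

        // Same bytes, two interpretations.
        System.out.println(decode(wire, "UTF-8").equals(term));      // true
        System.out.println(decode(wire, "ISO-8859-1").equals(term)); // false
    }
}
```

Each accented character becomes two percent-escaped bytes on the wire, so a decoder that assumes ISO-8859-1 yields two characters for each one, which matches the table in the question.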



Answer 6:

We had a similar problem. It was solved once all JSPs had been saved with the UTF-8 BOM.



Answer 7:

First off, I have no idea how to solve this, since I don't know much about Java and JSP.

Having said that: the characters on the right-hand side of your table are the UTF-8 encoding of the left-hand side. That is, somewhere in your code, you're interpreting bytes as Latin-1 (or whatever your default encoding is), where they actually represent UTF-8 encoded characters...
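For what it's worth, the effect is easy to reproduce with plain Java strings. This is only a sketch (the class MojibakeDemo and its method names are made up for the illustration): it encodes a character as UTF-8, decodes the bytes as Latin-1 to get the garbled form, and then reverses the mistake.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    /** Encodes text as UTF-8, then wrongly decodes those bytes as Latin-1. */
    public static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Undoes misdecode: recover the raw bytes via Latin-1, decode as UTF-8. */
    public static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e1";           // a-acute, one character
        String garbled = misdecode(original); // U+00C3 U+00A1, two characters
        System.out.println(garbled.length());                 // 2
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

The repair step works because Latin-1 maps every byte to exactly one character, so no information is lost. It is still a workaround; decoding with the right charset in the first place is cleaner.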



Answer 8:

I think the problem might be that the browser does not specify the form post to be UTF-8. There is a lot to read about form posts and encodings on the web. Multiple web frameworks provide character encoding filters to 'fix' this issue, much like the fix you had in mind; see for example http://static.springframework.org/spring/docs/2.5.x/api/org/springframework/web/filter/CharacterEncodingFilter.html



Answer 9:

Do you use RequestDumper? If it is configured in deploy/jboss-web.deployer/server.xml then try to remove it and then test your encoding.



Answer 10:

There are three layers to configure. From what you've described, it sounds like your problem lies in the database configuration.

  1. Browser Display and Form Submission

JSP

<%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%>

HTML

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  2. Web Server Processing

JSP

<%
  request.setCharacterEncoding("UTF-8");
  String name = request.getParameter("NAME");
%>

The same applies in a Servlet. See the JBoss-specific solution as well as the complete server-independent solution in this answer.

  3. Database Settings

You may be losing the character information at the database level. Check to make sure your database encoding is UTF-8 also, and not ASCII.

For a complete discussion of this topic, refer to the Java article Character Conversions from Browser to Database.