It feels like I'm stuck. I'm trying to write the simplest possible servlet Filter (and deploy it to Tomcat). The code is Groovy, but I'm relying heavily on Java approaches here, so it is almost copy-paste; that is why I've added the java tag as well.
My question is: how can I output a UTF-8 string from the filter? Here is the code:
public class SimpleFilter implements javax.servlet.Filter
{
    ...
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
            throws java.io.IOException, javax.servlet.ServletException
    {
        PrintWriter out = response.getWriter()
        chain.doFilter(request, wrapResponse((HttpServletResponse) response))
        response.setCharacterEncoding('UTF-8')
        response.setContentType('text/plain')
        def saw = 'АБВГДЕЙКА ЭТО НЕПРОСТАЯ ПЕРЕДАЧА ABCDEFGHIJKLMNOP!!!'
        def bytes = saw.getBytes('UTF-8')
        def content = new String(bytes, 'UTF-8')
        response.setContentLength(content.length())
        out.write(content);
        out.close();
    }

    private static HttpServletResponse wrapResponse(HttpServletResponse response) {
        return new HttpServletResponseWrapper(response) {
            @Override
            public PrintWriter getWriter() {
                def writer = new OutputStreamWriter(new ByteArrayOutputStream(), 'UTF-8')
                return new PrintWriter(writer)
            }
        }
    }
}
The Content-Type of the filtered page is text/plain;charset=ISO-8859-1.
So the content type has changed, but the charset is ignored.
As you can see, I've taken some (I guess quite naive) measures to make sure the content is UTF-8, but none of these steps actually helped.
I've also tried adding the URIEncoding="UTF-8" or useBodyEncodingForUri="true" attributes to the Connector in Tomcat's conf/server.xml.
It would be nice if somebody explained to me what I'm doing wrong.
UPD: just a bit of explanation: I'm writing a filter that applies XSLT, which is the real reason I'm trying to discard the whole request.
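You are trying to set the content type and character encoding after calling getWriter, and by then it is too late: the encoding is fixed once the writer has been obtained or the response has been committed. See the documentation on getWriter and setCharacterEncoding for details.
To fix your code, just move the calls that set the content type and encoding a few lines earlier, before the call to getWriter.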
Converting saw to bytes and back to a String does not change a thing: content is identical to saw. What you want to do is the following, using the output stream instead of the writer (obtaining the writer is why the charset is reset to ISO-8859-1; see the Tomcat documentation):
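A minimal sketch of the byte-oriented approach, using JDK classes only so that it runs without a servlet container. The class and method names (Utf8Bytes, encode, writeUtf8) are hypothetical; in the filter itself, the OutputStream would be the one returned by response.getOutputStream(), and the byte count is what setContentLength would need.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the servlet API.
final class Utf8Bytes {
    // Encodes the text as UTF-8; the array length is the body size in bytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Writes the UTF-8 bytes to the stream and returns how many were written.
    static int writeUtf8(OutputStream out, String text) throws IOException {
        byte[] bytes = encode(text);
        out.write(bytes);
        out.flush();
        return bytes.length;
    }

    public static void main(String[] args) throws IOException {
        String text = "\u0410\u0411\u0412 ABC"; // three Cyrillic letters, a space, "ABC"
        // A ByteArrayOutputStream stands in for the response's output stream here.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int byteCount = writeUtf8(sink, text);
        System.out.println("chars=" + text.length() + " bytes=" + byteCount);
    }
}
```

For this sample the string has 7 characters but encodes to 10 bytes, because each Cyrillic letter takes two bytes in UTF-8. Passing content.length() to setContentLength, as the question's code does, would therefore declare a body shorter than what is actually sent.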
Your code for setting the charset to UTF-8 looks okay.
I don't understand what you are doing with HttpServletResponseWrapper.
To make it clear, this will work:
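A minimal sketch of the ordering that works, written against a toy stand-in rather than the real servlet classes so it can run on a plain JDK. ToyResponse and encodingAfter are hypothetical names; the class only models two documented rules, namely that the encoding defaults to ISO-8859-1 and that setCharacterEncoding has no effect once getWriter has been called. It is not how Tomcat implements this.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

// Toy stand-in for a servlet response; NOT the real javax.servlet API.
final class ToyResponse {
    private String encoding = "ISO-8859-1"; // default when nothing else is specified
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();
    private PrintWriter writer;

    void setCharacterEncoding(String name) {
        // Ignored once the writer has been handed out.
        if (writer == null) {
            encoding = name;
        }
    }

    PrintWriter getWriter() {
        if (writer == null) {
            // The writer is bound to whatever encoding is in effect right now.
            writer = new PrintWriter(new OutputStreamWriter(body, Charset.forName(encoding)));
        }
        return writer;
    }

    String getCharacterEncoding() {
        return encoding;
    }

    // Runs both call orders and reports the encoding that ends up in effect.
    static String encodingAfter(boolean setEncodingFirst) {
        ToyResponse response = new ToyResponse();
        if (setEncodingFirst) {
            response.setCharacterEncoding("UTF-8");
            response.getWriter();
        } else {
            response.getWriter();
            response.setCharacterEncoding("UTF-8"); // too late, silently ignored
        }
        return response.getCharacterEncoding();
    }

    public static void main(String[] args) {
        System.out.println("encoding set first: " + encodingAfter(true));
        System.out.println("writer obtained first: " + encodingAfter(false));
    }
}
```

In the question's doFilter, the equivalent change would be to call setCharacterEncoding and setContentType before response.getWriter().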
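This might be the problem you're having, or at least one part of it. As the documentation of setCharacterEncoding() says, the method has no effect if it is called after getWriter has been called or after the response has been committed. You should set the encoding first, and only then get the writer.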
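The reason the order matters can be seen with plain JDK writers: a writer's charset is chosen when it is constructed and cannot be changed afterwards. The sketch below is only an illustration under that assumption; WriterCharsetDemo and writeWith are hypothetical names, and a ByteArrayOutputStream stands in for the response body.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it uses only JDK I/O, no servlet classes.
final class WriterCharsetDemo {
    // Writes the text through a writer bound to the given charset
    // and returns the bytes that reached the underlying stream.
    static byte[] writeWith(Charset charset, String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, charset));
        out.write(text);
        out.close(); // flushes the encoder into the sink
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u0410\u0411"; // two Cyrillic letters
        byte[] latin1 = writeWith(StandardCharsets.ISO_8859_1, text);
        byte[] utf8 = writeWith(StandardCharsets.UTF_8, text);
        System.out.println("ISO-8859-1 bytes: " + latin1.length);
        System.out.println("UTF-8 bytes: " + utf8.length);
    }
}
```

With ISO-8859-1 the Cyrillic letters cannot be represented, so each one is replaced by a single '?' byte; with UTF-8 each letter becomes two bytes and survives the round trip.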