Servlet gets weird character with US International

I have a simple form where I can type some characters. These characters are sent to a servlet, which calls getBytes and prints the bytes. The correct UTF-8 bytes for "ã" are -61 and -93, but I get -52 and -93. :(

I tried everything to understand and fix this, but nothing worked. Everything on my machine should be UTF-8 so I suspect it has to do with the US International keyboard I have been using for 20 years.

Does any smart soul have a clue where -52 and -93 are coming from?

FIXED on Jetty: See my answer below.

BROKEN on Tomcat: How to get tomcat to understand MacRoman (x-mac-roman) charset from my Mac keyboard?

Tags: java servlets character-encoding special-characters

2 Answers

做自己的国王

#2 · 2019-03-31 00:20

OK, after a good 8 hours (seriously!), it looks like the only way to get this working correctly is the following.

One of the problems was that the Maven build compiled the class files with the wrong encoding. To fix that:

export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8
mvn clean install

AND:

   <%@page pageEncoding="UTF-8" %>

NOW:

As far as I know, there is no way to pass the latter option in your pom.xml.

There is a pending answer on that point here: "enabling UTF-8 encoding for clojure source files".
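
To check whether -Dfile.encoding=UTF-8 actually reached the JVM, a small sketch like the following can help. The class and method names are made up for illustration. It prints the effective default charset and the bytes that a bare getBytes() call produces, which is the call that depends on that default.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class DefaultCharsetCheck {
    // getBytes() without an argument uses the JVM default charset.
    static byte[] platformBytes(String s) {
        return s.getBytes();
    }

    public static void main(String[] args) {
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());
        // "\u00e3" is "ã", written as an escape so the source file encoding cannot interfere.
        System.out.println(Arrays.toString(platformBytes("\u00e3")));
    }
}
```

If the last line prints anything other than [-61, -93], the JVM is not defaulting to UTF-8.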


老娘就宠你

#3 · 2019-03-31 00:32

That is the Mac OS Roman character encoding (0xCC == -52).
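
One plausible path to those two bytes, sketched below under assumptions: the browser sends "ã" as UTF-8, the container decodes it as ISO-8859-1, and getBytes() then re-encodes the garbled string with a MacRoman default charset. The class and helper names are hypothetical, and "x-MacRoman" is the JDK name for the charset (it lives in the extended charsets).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class MacRomanMojibake {
    // Hypothetical helper: decode UTF-8 bytes with the wrong charset, then re-encode.
    static byte[] misread(String text, Charset decodeAs, Charset reencodeAs) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8); // what the browser sent: C3 A3
        String garbled = new String(wire, decodeAs);          // wrongly decoded: "Ã£"
        return garbled.getBytes(reencodeAs);                  // re-encoded with the platform charset
    }

    public static void main(String[] args) {
        Charset macRoman = Charset.forName("x-MacRoman");
        byte[] result = misread("\u00e3", StandardCharsets.ISO_8859_1, macRoman);
        System.out.println(Arrays.toString(result)); // [-52, -93]
    }
}
```

In MacRoman, "Ã" is 0xCC and "£" is 0xA3, which gives exactly the -52 and -93 from the question.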

Some things to check:

Use explicit charsets: string.getBytes("UTF-8") and new String(bytes, "UTF-8").
The page containing the form should be served as UTF-8: response.setContentType("text/html; charset=UTF-8");. In a JSP: <%@page pageEncoding="UTF-8"%>
The form itself should declare the charset: <form action="..." accept-charset="UTF-8">
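
As a minimal sketch of the first check (class and method names are illustrative only), passing the charset explicitly makes the result independent of the platform default:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ExplicitCharset {
    // Always name the charset instead of relying on the JVM default.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encode("\u00e3");                         // "ã"
        System.out.println(Arrays.toString(bytes));               // [-61, -93]
        System.out.println(decode(bytes).equals("\u00e3"));       // true
    }
}
```

StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException that the String-name overloads declare.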

Since none of that helped:

Set up a request encoding filter in your web application (web.xml).
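
Such a filter would typically call request.setCharacterEncoding("UTF-8") before any parameter is read. The sketch below does not use the servlet API. It only imitates the decoding step with java.net.URLDecoder to show why the charset matters, and its class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class FormDecoding {
    // Stand-in for the container decoding a form parameter with a given request charset.
    static String decodeParam(String raw, String charsetName) throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String raw = "%C3%A3"; // "ã" as a browser submits it from a UTF-8 form

        // Decoded as UTF-8, the parameter is the single character "ã".
        System.out.println(decodeParam(raw, "UTF-8").equals("\u00e3"));            // true

        // Decoded as ISO-8859-1, it becomes the two characters "Ã£".
        System.out.println(decodeParam(raw, "ISO-8859-1").equals("\u00c3\u00a3")); // true
    }
}
```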

Encoding in pom.xml:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>...</version>
    <configuration>
        <source>1.6</source>
        <target>1.6</target>
        <encoding>${project.build.sourceEncoding}</encoding>
    </configuration>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-resources-plugin</artifactId>
    <version>...</version>
    <configuration>
        <encoding>${project.build.sourceEncoding}</encoding>
    </configuration>
</plugin>
...
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
