Jetty Utf8Appendable$NotUtf8Exception on ISO-8859

Posted 2019-04-11 23:14

Question:

A remote service calls our Jetty server with a request encoded in ISO-8859-15. This request is mapped to a Spring controller. Jetty is not able to decode the request correctly and throws the following exception:

exception=org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte F6 in state 3}
org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte F6 in state 3
    at org.eclipse.jetty.util.Utf8Appendable.appendByte(Utf8Appendable.java:168) ~[na:na]
    at org.eclipse.jetty.util.Utf8Appendable.append(Utf8Appendable.java:93) ~[na:na]
    at org.eclipse.jetty.util.UrlEncoded.decodeUtf8To(UrlEncoded.java:506) ~[na:na]
    at org.eclipse.jetty.util.UrlEncoded.decodeTo(UrlEncoded.java:554) ~[na:na]
    at org.eclipse.jetty.server.Request.extractParameters(Request.java:285) ~[na:na]
    at org.eclipse.jetty.server.Request.getParameter(Request.java:695) ~[na:na]
    ....
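
The trace shows the failure happening while the request parameters are URL-decoded as UTF-8. A minimal sketch of the same mismatch, using the JDK's URLDecoder as a stand-in (this is not Jetty's own decoder, and the parameter value K%F6ln is a hypothetical example, not taken from the real request):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class ParamDecodeDemo {
    // Percent-decode a form value with the given charset name.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical parameter value: %F6 is the byte named in the exception.
        String param = "K%F6ln";
        // Read as ISO-8859-15, the byte 0xF6 is the letter o with diaeresis.
        System.out.println(decode(param, "ISO-8859-15"));
        // Read as UTF-8, a lone 0xF6 is malformed, so the original text is not recovered.
        System.out.println(decode(param, "UTF-8"));
    }
}
```

URLDecoder quietly substitutes a replacement character for the malformed byte, whereas Jetty, as the trace shows, rejects it with an exception.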

Solution

In Spring you can force the encoding of a request with a CharacterEncodingFilter, even if the rest of the application speaks UTF-8. With the filter mapped to the affected URL, the exception should disappear.

<filter>
    <filter-name>encoding-filter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>ISO-8859-15</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>encoding-filter</filter-name>
    <url-pattern>/app/specialRequest.do</url-pattern>
</filter-mapping>

If this does not work for you:

  • find out which encoding the remote system uses
  • start Wireshark and capture the incoming packets with an ip.src == xxx.xxx.xxx.xxx filter
  • search the request body for special characters (convert the hex values to binary and try several commonly used encodings to find the one that produces the byte named in the exception)
  • set the encoding in Jetty's start.ini, e.g. with the following parameters

    -Dorg.eclipse.jetty.util.URI.charset=ISO-8859-15

    -Dorg.eclipse.jetty.util.UrlEncoding.charset=ISO-8859-15
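
The "try several encodings" step can be done with a few lines of Java instead of by hand. This is only a sketch: the class and method names are made up for illustration, 0xF6 is the byte from the exception, and 0xA4 is a hypothetical second byte added because it is one of the few positions where ISO-8859-1 and ISO-8859-15 differ.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class CharsetProbe {
    // Decode the same raw bytes with each candidate charset so the results can be compared.
    static Map<String, String> decodeWithEach(byte[] raw, String... charsetNames) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String name : charsetNames) {
            results.put(name, new String(raw, Charset.forName(name)));
        }
        return results;
    }

    public static void main(String[] args) {
        // Bytes copied from the capture; 0xA4 is a hypothetical example byte.
        byte[] raw = {(byte) 0xF6, (byte) 0xA4};
        decodeWithEach(raw, "ISO-8859-1", "ISO-8859-15", "windows-1252", "UTF-8")
            .forEach((name, text) -> System.out.println(name + " -> " + text));
    }
}
```

The charset whose output matches the text the remote system meant to send is the one to configure. Note that 0xF6 alone cannot tell ISO-8859-1 and ISO-8859-15 apart, since both map it to the same letter.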

Otherwise drop me a message if you have more questions.

Answer 1:

It looks like the client is sending text that should be encoded as UTF8, but isn't encoding it.

To diagnose this properly you need to understand UTF8 (which you may already do, I don't know).

In UTF8 any character with a code point of 127 (0x7F) or less, i.e. only the lowest 7 bits are used, is included in the stream as is (no special encoding). Anything greater than 127 (i.e. at least one bit above the 7th is set) is encoded as a multi-byte sequence.

0xF6 is greater than 0x7F, so if a client wants to send that character, it should encode it.

0xF6 in binary is 11110110, which in UTF8 should be 11000011 10110110 (C3 B6)

So, if the client wants to send the ISO8859-1 character of 0xF6, then it should be sending the UTF8 byte sequence of 0xC3 0xB6.
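
A quick way to check those byte values, sketched with plain JDK classes (the class and helper names here are illustrative only):

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Demo {
    // Format bytes as space-separated upper-case hex, e.g. "C3 B6".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Strict check: report malformed input instead of silently replacing it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u00F6"; // the character U+00F6
        System.out.println(hex(text.getBytes(StandardCharsets.ISO_8859_1)));    // F6
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));         // C3 B6
        System.out.println(isValidUtf8(new byte[] {(byte) 0xF6}));              // false
        System.out.println(isValidUtf8(new byte[] {(byte) 0xC3, (byte) 0xB6})); // true
    }
}
```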

You really need to work out what the client wants to be sending, what charset/encoding that data is in, and why it's not converting it to valid UTF8 before it sends it.

("state 3" refers to Jetty's internal tables for UTF8 decoding and is not very helpful for diagnosing this problem. It will only come in handy if you find the client, the client appears to be doing the right thing, and you suspect that Jetty's UTF8 decoding is wrong.)