In the environment I'm using (Tomcat 6), percent-encoded sequences in path segments are apparently decoded as ISO-8859-1 when they are mapped to a @PathVariable.
I'd like that to be UTF-8.
I already configured Tomcat to use UTF-8 (using the URIEncoding attribute in server.xml).
Is Spring/Rest doing the decoding on its own? If yes, where can I override the default encoding?
Additional information; here's my test code:
@RequestMapping( value = "/enc/{foo}", method = RequestMethod.GET )
public HttpEntity<String> enc( @PathVariable( "foo" ) String foo, HttpServletRequest req )
{
    String resp;
    resp = " path variable foo: " + foo + "\n" +
           " req.getPathInfo(): " + req.getPathInfo() + "\n" +
           "req.getPathTranslated(): " + req.getPathTranslated() + "\n" +
           " req.getRequestURI(): " + req.getRequestURI() + "\n" +
           " req.getContextPath(): " + req.getContextPath() + "\n";
    HttpHeaders headers = new HttpHeaders();
    headers.setContentType( new MediaType( "text", "plain", Charset.forName( "UTF-8" ) ) );
    return new HttpEntity<String>( resp, headers );
}
If I do an HTTP GET request with the following URI path:
/TEST/enc/%c2%a3%20and%20%e2%82%ac%20rates
which is the UTF-8 encoded then percent-encoded form of
/TEST/enc/£ and € rates
the output that I get is:
path variable foo: Â£ and â¬ rates
req.getPathInfo(): /enc/£ and € rates
req.getPathTranslated(): C:\Users\jre\workspace\.metadata\.plugins\org.eclipse.wst.server.core\tmp0\wtpwebapps\TEST\enc\£ and € rates
req.getRequestURI(): /TEST/enc/%C2%A3%20and%20%E2%82%AC%20rates
req.getContextPath(): /TEST
which to me shows that Tomcat (after setting the URIEncoding attribute) does the right thing (see getPathInfo()), but the path variable is still decoded as ISO-8859-1.
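To make the symptom reproducible outside the container, here is a minimal sketch that decodes the same percent sequence once as UTF-8 and once as ISO-8859-1. It uses only the JDK's URLDecoder; the class name PathDecodingDemo and the decode helper are made up for illustration and are not what Tomcat or Spring call internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class PathDecodingDemo {

    // Hypothetical helper: percent-decodes the text using the named charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String segment = "%c2%a3%20and%20%e2%82%ac%20rates";
        // Decoding the bytes as UTF-8 yields the intended pound and euro signs.
        System.out.println("UTF-8:      " + decode(segment, "UTF-8"));
        // Decoding the same bytes as ISO-8859-1 maps every byte to one character,
        // which produces the mojibake seen in the path variable.
        System.out.println("ISO-8859-1: " + decode(segment, "ISO-8859-1"));
    }
}
```

The ISO-8859-1 result has one character per byte (two for the pound sign, three for the euro sign), which matches what the path variable shows.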
And the answer is:
Spring/Rest apparently uses the request's character encoding to decode the path, which is a very strange thing to do, as that encoding describes the body, not the URI. Sigh.
Adding this:
<filter>
    <filter-name>CharacterEncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>CharacterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
fixed the problem. It really should be simpler.
And actually, it's worse:
If the request does have a body, and that body declares an encoding other than UTF-8, the filter leaves the declared encoding in place, so the additional forceEncoding parameter is needed. This seems to work, but I'm concerned it will cause more problems later on.
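The interaction between the filter and the path decoding can be sketched as a small plain-Java model. This is a simplification based on the behaviour observed above, not Spring's actual code: it assumes the filter only sets the encoding when the request declares none or when forceEncoding is true, and that path decoding falls back to ISO-8859-1 when the request has no encoding. The class and method names are hypothetical.

```java
public class EncodingSelectionSketch {

    // Assumed fallback when the request carries no character encoding.
    static final String FALLBACK = "ISO-8859-1";

    // Models the filter: returns the request encoding after the filter has run.
    static String afterFilter(String declared, String configured, boolean forceEncoding) {
        if (configured != null && (forceEncoding || declared == null)) {
            return configured;
        }
        return declared;
    }

    // Models the path decoding: uses the request encoding, else the fallback.
    static String pathDecodingCharset(String requestEncoding) {
        return requestEncoding != null ? requestEncoding : FALLBACK;
    }

    public static void main(String[] args) {
        // GET without a body and without the filter: fallback is used.
        System.out.println(pathDecodingCharset(null));
        // Filter configured with UTF-8, request declares nothing.
        System.out.println(pathDecodingCharset(afterFilter(null, "UTF-8", false)));
        // Request body declares ISO-8859-1 and forceEncoding is off.
        System.out.println(pathDecodingCharset(afterFilter("ISO-8859-1", "UTF-8", false)));
        // Same request with forceEncoding on.
        System.out.println(pathDecodingCharset(afterFilter("ISO-8859-1", "UTF-8", true)));
    }
}
```

Under these assumptions only the third case still decodes the path as ISO-8859-1, which is the case forceEncoding is meant to cover. The cost is that the body is then read as UTF-8 regardless of what the client declared.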
Another approach
In the meantime, I found out that it's possible to disable the decoding by specifying
<property name="urlDecode" value="false"/>
...in which case the recipient can do the right thing; but of course this will make lots of other things harder.
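With decoding disabled, the handler receives the raw percent-encoded segment and has to decode it itself. A minimal sketch of that step, assuming the segment is always UTF-8; the class name RawSegmentDecoder and the method decodeSegment are made up for illustration:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class RawSegmentDecoder {

    // Hypothetical helper: decodes one raw path segment as UTF-8.
    static String decodeSegment(String rawSegment) {
        try {
            // URLDecoder follows form-encoding rules and would turn '+' into a space,
            // so literal plus signs are escaped first to survive decoding.
            return URLDecoder.decode(rawSegment.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should be available on every JVM", e);
        }
    }

    public static void main(String[] args) {
        // The raw value a @PathVariable would carry once decoding is switched off.
        System.out.println(decodeSegment("%c2%a3%20and%20%e2%82%ac%20rates"));
    }
}
```

Every handler that reads a path variable would need a call like this, which is part of what makes the approach harder to live with.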