I'm having trouble opening a URL connection when the URL contains Chinese characters. The same code works with Latin characters:
String xstr = "维也纳恩斯特哈佩尔球场";
URI uri = new URI("http", "ajax.googleapis.com", "/ajax/services/language/detect", "v=1.0&q=" + xstr, null);
URL url = uri.toURL();
URLConnection connection = url.openConnection();
InputStream is = connection.getInputStream();
The getInputStream() call results in:
java.lang.IllegalArgumentException: Invalid uri 'http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=???????????': Invalid query
The problem is that URI.toURL() doesn't percent-encode non-ASCII characters. Build the URL from uri.toASCIIString() instead, which does.
axtavt's answer above saved me from insanity, thanks! Just one comment (I could not figure out how to comment below the answer):
If you start with a URL, you need to encode quotes before you build the URI:
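Something along these lines is one way to do it. This is only a sketch: the class name, the toAsciiUrl helper and the example.com URL are made up for illustration, and it assumes "quotes" means literal double-quote characters in the URL string.

```java
import java.net.URI;

class QuoteEncodingSketch {
    // Hypothetical helper: turns a raw URL string into an ASCII-only one.
    static String toAsciiUrl(String rawUrl) {
        // A literal double quote is illegal in a URI, so escape it first;
        // otherwise URI.create would throw IllegalArgumentException.
        String escapedQuotes = rawUrl.replace("\"", "%22");
        // The single-argument parser accepts non-ASCII letters as they are
        // and leaves existing percent escapes alone.
        URI uri = URI.create(escapedQuotes);
        // Percent-encode the remaining non-ASCII characters as UTF-8 octets.
        return uri.toASCIIString();
    }

    public static void main(String[] args) {
        // The Chinese text is written as Unicode escapes so the source
        // file's encoding doesn't matter.
        System.out.println(toAsciiUrl("http://example.com/search?q=\"\u7ef4\u4e5f\u7eb3\""));
    }
}
```

Note that the multi-argument URI constructors always quote the percent character, so pre-escaping quotes and then passing the pieces to one of those would encode the escape a second time. That is why the sketch sticks to the single-argument form.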
Per the URI RFC (see section 2.4), non-US-ASCII characters aren't valid in a URI. You must encode them.
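For the URI from the question, the encoding step could look roughly like the sketch below. The class name and the detectUrl helper are made up for illustration, and nothing here opens a connection; it only shows that the resulting URL is pure ASCII.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class AsciiUrlSketch {
    // Hypothetical helper: builds the request URL for a given search term.
    static URL detectUrl(String term) {
        try {
            // The multi-argument constructor keeps non-ASCII characters as they are.
            URI uri = new URI("http", "ajax.googleapis.com",
                    "/ajax/services/language/detect", "v=1.0&q=" + term, null);
            // toASCIIString() percent-encodes them as UTF-8 octets.
            String ascii = uri.toASCIIString();
            // Re-parse the ASCII-only form; on older JDKs new URL(ascii) does the same job.
            return URI.create(ascii).toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // First three characters of the question's string, written as Unicode
        // escapes so the source file's encoding doesn't matter.
        System.out.println(detectUrl("\u7ef4\u4e5f\u7eb3"));
    }
}
```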
I think it is related to the "UTF-8" charset: the Chinese characters have to be encoded as UTF-8 bytes before they can go into the URL.