URLConnection encoding issue with accented characters

Posted 2019-09-08 02:53

Question:

I have an encoding problem when sending text through a URLConnection.

My code is this:

final URL url = new URL(urlString);
final URLConnection urlConnection = url.openConnection();
urlConnection.setDoInput(true);
urlConnection.setDoOutput(true);
urlConnection.setUseCaches(false);
urlConnection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded;charset=utf-8");
urlConnection.setRequestProperty("Accept-Charset", "UTF-8");

final DataOutputStream urlDataOut = new DataOutputStream(urlConnection.getOutputStream());
urlDataOut.writeBytes(prepareData.toString());
urlDataOut.flush();
urlDataOut.close();

The string returned by prepareData.toString() contains a word with an "è". As soon as it is written to urlDataOut, the "è" is replaced by the diamond with the question mark, and the status of the write is FAILURE.

Does anybody know how to fix this?

Answer 1:

The DataOutputStream.writeBytes method does not perform any character encoding. Its documentation says:

Each character in the string is written out, in sequence, by discarding its high eight bits.
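
To make the effect concrete, here is a minimal sketch that compares the bytes writeBytes produces for "è" with the bytes of its UTF-8 encoding. The class and method names (WriteBytesDemo, viaWriteBytes, hex) are made up for illustration, and a ByteArrayOutputStream stands in for the connection's output stream:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WriteBytesDemo {

    // Returns the bytes that DataOutputStream.writeBytes emits for the text.
    // The in-memory buffer is an assumption standing in for the real connection.
    static byte[] viaWriteBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(text); // keeps only the low eight bits of each char
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex, e.g. "C3 A8".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String text = "\u00E8"; // "è"
        System.out.println("writeBytes: " + hex(viaWriteBytes(text)));                  // E8
        System.out.println("UTF-8:      " + hex(text.getBytes(StandardCharsets.UTF_8))); // C3 A8
    }
}
```

The single byte E8 is not a complete UTF-8 sequence, so a receiver that decodes the body as UTF-8 will typically substitute the replacement character, which matches the diamond with the question mark described in the question.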

Using writeUTF is not an option either: it first writes two bytes containing the length of the encoded string (in bytes), which the server would interpret as characters at the beginning of the data.
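
The length prefix is easy to see in a small sketch. WriteUtfDemo and viaWriteUTF are hypothetical names, the sample text "name=è" is invented, and an in-memory buffer is assumed in place of the connection:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WriteUtfDemo {

    // Returns the bytes that DataOutputStream.writeUTF emits for the text.
    static byte[] viaWriteUTF(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text); // two-byte length first, then the encoded characters
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = viaWriteUTF("name=\u00E8"); // "name=è" (illustrative sample)
        // The encoded text takes 7 bytes (5 ASCII + 2 for "è"),
        // so the stream starts with the length prefix 0x00 0x07.
        System.out.println("total bytes:   " + bytes.length); // 9
        System.out.println("length prefix: " + bytes[0] + " " + bytes[1]); // 0 7
    }
}
```

A server reading this as a form body would see two unexpected bytes before "name=", which is why writeUTF does not help here.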

So you should use the standard way of writing text to an OutputStream:

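// Needs java.io.OutputStreamWriter, java.io.Writer and java.nio.charset.StandardCharsets.
// try-with-resources flushes and closes the writer even if write() throws.
try (Writer w = new OutputStreamWriter(
        urlConnection.getOutputStream(), StandardCharsets.UTF_8)) {
    w.write(prepareData.toString());
}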
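
For anyone who wants to try this without a server, the same idea can be sketched as a small self-contained class. FormBodyWriter, encodeField and writeBody are hypothetical names, and the sample field is invented. The percent-encoding step with URLEncoder goes beyond the snippet above; it assumes the body really is form data, as the Content-Type header in the question suggests:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.io.Writer;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class FormBodyWriter {

    // Percent-encodes one form field as key=value using UTF-8.
    // (Assumption: the body is application/x-www-form-urlencoded.)
    static String encodeField(String key, String value) {
        try {
            return URLEncoder.encode(key, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Writes the text to the stream as UTF-8, then flushes and closes it.
    static void writeBody(OutputStream out, String body) {
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for urlConnection.getOutputStream().
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeBody(sink, "caff\u00E8"); // "caffè"
        System.out.println(sink.size()); // 6: four ASCII bytes plus two for "è"
        System.out.println(encodeField("word", "caff\u00E8")); // word=caff%C3%A8
    }
}
```

With a real connection, writeBody(urlConnection.getOutputStream(), body) would take the place of the in-memory sink.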


Tags: java url