UTF8 Encoding in Android when invoking REST webser

Posted 2019-02-07 18:27

Question:

I'm invoking a REST web service that returns XML. Some elements contain strings with special characters such as áãç. When I fetch the data in a browser everything is displayed properly, but when I invoke the service from Android the special characters come out wrong.

Notice the 'decoded' and 'encoded' variables:

When I use URLDecoder.decode(result, "UTF-8"), the result stays the same.

When I use URLEncoder.encode(result, "UTF-8"), the result changes to what you would expect (full of % signs and numbers representing the symbols and special characters).

Here's the method to call the webservice:

public void updateDatabaseFromWebservice(){

    // get data from webservice
    Log.i(TAG, "Obtaining categories from webservice");

    HttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet(ConnectionProperties.CATEGORIES_URI);

    ResponseHandler<String> handler = new BasicResponseHandler();

    String result = "";
    String decoded;
    String encoded;
    try {                   
        result = client.execute(request, handler);  

        decoded = URLDecoder.decode(result, "UTF-8");
        encoded = URLEncoder.encode(result, "UTF-8");
        String c = "AS";

    } catch (Exception e) {  
        Log.e(TAG, "An error occurred while obtaining categories", e);
    }

    client.getConnectionManager().shutdown();
}

Any help would be appreciated.

Answer 1:

Use this to get the XML string, assuming the server encodes the data in UTF-8:

HttpResponse response = client.execute(request);
... // probably some other code to check for HTTP response status code
HttpEntity responseEntity = response.getEntity();
String xml = EntityUtils.toString(responseEntity, HTTP.UTF_8);
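The symptom in the question is typical of UTF-8 bytes being decoded with a different charset. As far as I know, the legacy Apache client falls back to ISO-8859-1 when the response does not declare a charset, but treat that as an assumption to verify for your version. The sketch below uses only the JDK (no HTTP call), and the class and method names are made up for illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    // Decodes raw response bytes with the given charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // a-acute, a-tilde, c-cedilla as the server would send them in UTF-8 (2 bytes each)
        byte[] body = "\u00e1\u00e3\u00e7".getBytes(StandardCharsets.UTF_8);

        // Correct charset: three characters come back.
        System.out.println(decode(body, StandardCharsets.UTF_8).length());      // 3

        // Wrong charset: every byte becomes its own character, giving six garbled ones.
        System.out.println(decode(body, StandardCharsets.ISO_8859_1).length()); // 6
    }
}
```

Passing the charset explicitly, as EntityUtils.toString(responseEntity, HTTP.UTF_8) does above, avoids depending on whatever default the client would otherwise pick.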


Answer 2:

Uh. URLDecoder and URLEncoder are for encoding and decoding URLs, not XML content. They are meant for the URL you use when making requests. So the code is just... wrong.
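To illustrate why these two classes behave the way the question describes, here is a small JDK-only sketch. The class name and the XML fragment are hypothetical. Decoding leaves the text alone because it contains no '%' or '+', and encoding percent-escapes every byte of the UTF-8 form:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class UrlCodecDemo {
    // Percent-encodes text for use inside a URL query string.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Reverses percent-encoding; text without '%' or '+' comes back unchanged.
    static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical response fragment: a-acute, a-tilde, c-cedilla inside an element
        String body = "<name>\u00e1\u00e3\u00e7</name>";

        System.out.println(decode(body).equals(body));    // true: nothing to decode
        System.out.println(encode("\u00e1\u00e3\u00e7")); // %C3%A1%C3%A3%C3%A7
    }
}
```

Neither call changes how the response bytes were turned into a String in the first place, which is where the characters get damaged.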

But an even bigger issue is that you are taking a String, whereas the content is really XML, which needs to be parsed. For the parser to decode UTF-8 properly (and handle entities etc.), you would be better off getting a byte[] from the response and passing that to the parser, although asking the HTTP client to do the decoding may work OK (assuming the service correctly indicates the encoding used; not all do, but even if not, XML parsers can figure it out from the XML declaration).

So: remove the URLDecoder/URLEncoder stuff, parse the XML, and extract the data you want from it.
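A minimal sketch of the byte[]-to-parser approach, using the standard DOM parser. The class name, the helper method, and the sample document with its "category" element are assumptions for illustration, not the real service's schema. In an app the bytes would come from the HTTP response body rather than from a literal:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XmlBytesDemo {
    // Parses raw XML bytes and returns the text of the first element with the given tag.
    // The parser reads the encoding from the XML declaration, so no charset is passed in.
    static String firstText(byte[] xml, String tag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            // Covers malformed XML as well as a missing tag (item(0) would be null).
            throw new IllegalStateException("Could not read <" + tag + "> from XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical payload; the element text contains c-cedilla and a-tilde.
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<categories><category>A\u00e7\u00e3o</category></categories>";
        String name = firstText(sample.getBytes(StandardCharsets.UTF_8), "category");
        System.out.println(name.equals("A\u00e7\u00e3o")); // true
    }
}
```

Because the parser works on bytes, the same helper also copes with a service that declares a different encoding in its XML declaration, as long as the bytes really are in that encoding.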