I'm having some encoding problems with code I'm working on. An encrypted string is received and decoded as ISO-8859-1. This string is then stored in a DB that uses UTF-8 encoding. When the string is retrieved it's still ISO-8859-1, and there are no problems. The issue is that I also need to be able to retrieve this string as UTF-8, but I haven't been successful.
I've tried to convert the string from ISO to UTF-8 when retrieved from the DB using this method:
private String convertIsoToUtf8(String isoLatin) {
    try {
        return new String(isoLatin.getBytes("ISO_8859_1"), "UTF_8");
    } catch (UnsupportedEncodingException e) {
        return isoLatin;
    }
}
Unfortunately, the special characters are just displayed as question marks in this case.
Original string: Test æøå
Example output after retrieving from DB and converting to UTF-8: Test ???
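One plausible explanation for the question marks, sketched below under assumptions: if the string is re-encoded to ISO-8859-1 bytes and those bytes are then decoded as UTF-8, the single bytes for æ, ø and å are not valid UTF-8 sequences, so the decoder substitutes the replacement character U+FFFD, which many consoles render as `?`. The class and method names here are hypothetical, and `StandardCharsets` is used instead of charset name strings so no checked exception is involved.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class RoundTripDemo {
    // Re-encodes the string as ISO-8859-1 bytes, then (wrongly) decodes those bytes as UTF-8.
    static String isoBytesReadAsUtf8(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1); // æøå become the bytes E6 F8 E5
        return new String(latin1, StandardCharsets.UTF_8);       // malformed input becomes U+FFFD
    }

    public static void main(String[] args) {
        // Unicode escapes keep the demo independent of the source file's encoding.
        String result = isoBytesReadAsUtf8("Test \u00e6\u00f8\u00e5");
        // Print code points rather than glyphs so the console cannot hide the problem.
        for (char c : result.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The ASCII part survives because ISO-8859-1 and UTF-8 agree on those bytes; only the non-ASCII characters are damaged.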
Update: After reading the link provided in the comment, I managed to get it right. Since the DB is already UTF-8 encoded, all I needed to do was this:
return new String(isoLatin.getBytes("UTF-8"));
When you already have a String object, it is usually too late to correct any encoding issues, since some information may already have been lost: think of characters that can't be mapped one-to-one onto Java's internal UTF-16 representation.
The correct place to handle character encoding is the moment you get your Strings: when reading input from a file (set the correct encoding on your InputStreamReader), when converting the byte[] you got from decryption, when reading from the database (this should be handled by your JDBC driver), etc.
Also take care to handle the encoding correctly when doing the reverse. While it might seem to work most of the time when you use the default encoding, sooner or later you might run into issues that are difficult or impossible to resolve (as you do now).
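The advice above could be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the class and method names are hypothetical, and it assumes the decrypted bytes really are ISO-8859-1 and the stream really contains UTF-8. The point is only that the charset is named explicitly wherever bytes become a String or a String becomes bytes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class illustrating explicit charsets at the boundaries.
class DecodeAtBoundary {
    // Decode the byte[] from decryption with the charset it was written in
    // (assumed here to be ISO-8859-1).
    static String decodeLatin1(byte[] decrypted) {
        return new String(decrypted, StandardCharsets.ISO_8859_1);
    }

    // The reverse direction: name the target charset instead of relying on the platform default.
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Reading text from a stream: set the charset on the InputStreamReader
    // (assumed here to be UTF-8).
    static String readFirstLine(InputStream in) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Once the String has been decoded correctly, getting it "as UTF-8" is just a matter of calling something like `encodeUtf8` at the point where bytes are needed; no String-to-String conversion is involved.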
P.S.: Also keep in mind what tool you are using to display your output: some consoles won't display UTF-16 or UTF-8, so check the encoding settings of the console or editor you use to view your files. Sometimes your output is correct and just can't be displayed correctly.
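One way to tell a broken String from a display problem is to print its code points instead of its glyphs. The sketch below is only an illustration with hypothetical names; it also wraps `System.out` in a `PrintStream` with an explicit encoding, which helps only if the console itself is set to UTF-8.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for checking what a String really contains.
class InspectOutput {
    // Renders each code point as U+XXXX so the check does not depend on the console's encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "Test \u00e6\u00f8\u00e5";
        // Plain ASCII output: safe on any console.
        System.out.println(codePoints(text));
        // Write the text itself as UTF-8 bytes; whether it displays correctly
        // still depends on the console's own encoding setting.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(text);
    }
}
```

If the code points are the expected ones (U+00E6, U+00F8, U+00E5 for æøå) but the glyphs look wrong, the String is fine and the display tool is the thing to fix.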