I've recently had to switch the encoding of the webapp I'm working on from ISO-xx to UTF-8. Everything went smoothly, except for properties files. I added -Dfile.encoding=UTF-8 to eclipse.ini and normal files work fine. Properties files, however, show some strange behaviour.
If I copy UTF-8 encoded properties from Notepad++ and paste them into Eclipse, they show and work fine. When I reopen the properties file, I see Unicode escapes instead of the proper characters, like:
Zur\u00EF\u00BF\u00BDck instead of Zurück
but the app still works fine. If I start editing the properties, add some special characters and save, they display correctly; however, they don't work, and all previously working special characters stop working as well.
When I compare my local version with CVS, I can see the special characters correctly in the remote file, and after an update I'm back at the start: the app works, but Eclipse displays Unicode escapes.
I tried changing file encoding by right clicking it and selecting „Other: UTF8” but it didn't help. It also said: „determined from content: ISO-8859-1”
I'm using Java 6 and JBoss Developer, based on Eclipse 3.3.
I can live with it by editing properties in Notepad++ and pasting them in Eclipse, but I would be grateful if someone could help me with fixing this in Eclipse.
There are too many points in the process you describe where errors can occur, so I won't try to guess what you're doing wrong, but I think I know what's happening under the hood.
EF BF BD is the UTF-8 encoded form of U+FFFD, the standard replacement character that's inserted by decoders when they encounter malformed input. It sounds like your text is being saved as ISO-8859-1, then read as if it were UTF-8, then saved as UTF-8, then converted to the Properties format by native2ascii using the platform default encoding (e.g., windows-1252).
I suggest you leave the "file.encoding" property alone. Like "file.separator" and "line.separator", it's not nearly as useful as you would expect it to be. Instead, get into the habit of always specifying an encoding when reading and writing text files.
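The chain of conversions described above can be reproduced in a few lines. This is only a sketch of the suspected sequence, not a diagnosis of the actual setup: the class and method names are made up for illustration, and ISO-8859-1 stands in for windows-1252 in the last step (the two agree on the bytes EF, BF and BD). Every conversion names its charset explicitly, which is the habit recommended above.

```java
import java.nio.charset.Charset;

class ReplacementCharDemo {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");
    static final Charset UTF8 = Charset.forName("UTF-8");

    // Hypothetical helper: saved as Latin-1, read as UTF-8,
    // saved as UTF-8, then read back with a single-byte encoding.
    static String mangle(String text) {
        byte[] savedAsLatin1 = text.getBytes(LATIN1);
        // The lone byte FC is not valid UTF-8, so the decoder substitutes U+FFFD.
        String misread = new String(savedAsLatin1, UTF8);
        // U+FFFD is written out as the three bytes EF BF BD.
        byte[] savedAsUtf8 = misread.getBytes(UTF8);
        // A single-byte decoder turns each of those bytes into its own character.
        return new String(savedAsUtf8, LATIN1);
    }

    // Hypothetical helper: shows non-ASCII characters as properties-style escapes.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 0x7E) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape(mangle("Zur\u00FCck")));
    }
}
```

Running it prints Zur\u00EF\u00BF\u00BDck, the same string Eclipse shows in the question.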
Works like a charm :-)
There is a much easier way:
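A minimal sketch of this approach, assuming it means handing Properties.load a Reader with an explicit charset (the follow-up comment's mention of InputStreamReader suggests as much). Properties.load(Reader) exists from Java 6 on; the class name Utf8PropertiesLoader and the key "back" are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.Properties;

class Utf8PropertiesLoader {
    // Decodes the stream as UTF-8 instead of the ISO-8859-1 that
    // Properties.load(InputStream) would assume.
    static Properties load(InputStream in) throws IOException {
        Properties props = new Properties();
        Reader reader = new InputStreamReader(in, Charset.forName("UTF-8"));
        try {
            props.load(reader);
        } finally {
            reader.close();
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a UTF-8 encoded properties file.
        byte[] file = "back=Zur\u00FCck\n".getBytes("UTF-8");
        Properties props = load(new ByteArrayInputStream(file));
        System.out.println("Zur\u00FCck".equals(props.getProperty("back")));
    }
}
```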
This works well in Java 1.6. How can I do this in 1.5, since the Properties class there does not have a method that takes an InputStreamReader?
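One possible workaround for Java 5, offered as a sketch rather than an established recipe: decode the file as UTF-8 yourself, rewrite every non-ASCII character as a Unicode escape (roughly what native2ascii does), and pass the now pure-ASCII text to Properties.load(InputStream). The class name Java5Utf8Properties is made up for illustration, and the sketch ignores details such as a leading byte order mark.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.Properties;

class Java5Utf8Properties {
    static Properties load(InputStream utf8In) throws IOException {
        Reader reader = new InputStreamReader(utf8In, Charset.forName("UTF-8"));
        StringBuilder ascii = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            if (c > 0x7E) {
                // Non-ASCII: replace with a backslash-u escape, which
                // Properties.load(InputStream) decodes on its own.
                ascii.append(String.format("\\u%04X", c));
            } else {
                ascii.append((char) c);
            }
        }
        Properties props = new Properties();
        // The text is plain ASCII now, so the ISO-8859-1 assumption is harmless.
        props.load(new ByteArrayInputStream(ascii.toString().getBytes("ISO-8859-1")));
        return props;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "back=Zur\u00FCck\n".getBytes("UTF-8");
        Properties props = load(new ByteArrayInputStream(file));
        System.out.println("Zur\u00FCck".equals(props.getProperty("back")));
    }
}
```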
It is not a problem with Eclipse. If you are using the Properties class to read and store the properties file, the class will escape all special characters.
From the class documentation:
From the API, store() method:
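The escaping can also be observed directly. A minimal sketch, assuming the stream-based store(OutputStream, String) overload is the one in use; the class name StoreEscapingDemo and the key "back" are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

class StoreEscapingDemo {
    // Stores the properties into memory and returns the resulting text.
    static String storeToString(Properties props) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.store(out, null); // no header comment, only the timestamp line
        return out.toString("ISO-8859-1");
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("back", "Zur\u00FCck");
        // The stored text holds the u-umlaut as an escape, not as a raw character.
        System.out.println(storeToString(props));
    }
}
```

The output starts with a timestamp comment, followed by the line back=Zur\u00FCck. Loading that text again yields the original value, which is why the application keeps working even though the file looks mangled in an editor.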
If the properties are for XML or HTML, it's safest to use XML entities. They're uglier to read, but it means that the properties file can be treated as straight ASCII, so nothing will get mangled.
Note that HTML has entities that XML doesn't, so I keep it safe by using straight XML: http://www.w3.org/TR/html4/sgml/entities.html
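As an illustration, here is a small sketch that turns non-ASCII characters into numeric character references, assuming that is what "XML entities" means here. The class and method names are made up, and the sketch does not escape markup characters such as & or <.

```java
class XmlEntityEscaper {
    // Replaces every code point above the printable ASCII range with a
    // decimal numeric character reference, leaving everything else untouched.
    static String toNumericEntities(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0x7E) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toNumericEntities("Zur\u00FCck"));
    }
}
```

For the word from the question this prints Zur&#252;ck, which survives any ASCII-compatible encoding unchanged.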