I have a little Java project where I've set the encoding of the source files to UTF-8 (I use a lot of foreign characters not found in the default CP1252).
The goal is to create a text file (on Windows) containing a list of items. When I run it from Eclipse itself (hitting Ctrl+F11), it creates the file flawlessly, and opening it in another editor (I'm using Notepad++) I can see the characters as I wanted:
┌──────────────────────────────────────────────────┐
│ Universidade2010 (18/18)                         │
│ hidden: 0                                        │
├──────────────────────────────────────────────────┤
But when I export the project (using Eclipse) as a runnable JAR and run it with 'javaw -jar project.jar', the newly created file is a mess of question marks:
????????????????????????????????????????????????????
? Universidade2010 (19/19)                         ?
? hidden: 0                                        ?
????????????????????????????????????????????????????
I've followed some tips on how to use UTF-8 (which doesn't seem to be the default in Java) to try to correct this, so now I'm using
Writer w = new OutputStreamWriter(fos, "UTF-8");
and writing the UTF-8 BOM to the file, as suggested in an already-answered question here, but still without luck when exporting to a JAR.
Am I missing some property or command-line switch so Java knows I want to create UTF-8 files by default?
The problem is not in creating the file itself, because while developing, the file is output correctly (with the Unicode characters).
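To illustrate what I suspect is going on (a minimal sketch of my own, separate from the project code): String.getBytes() with no argument uses the JVM's default charset, so the same string can turn into different bytes depending on the environment it runs in:

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class GetBytesDemo {
    public static void main(String[] args) {
        String pipe = "\u2502"; // the '│' box-drawing character

        // Explicit UTF-8: always the same three bytes
        byte[] utf8 = pipe.getBytes(Charset.forName("UTF-8"));
        System.out.println(Arrays.toString(utf8)); // [-30, -108, -126]

        // CP1252 has no '│', so the encoder substitutes '?' (0x3F)
        byte[] cp1252 = pipe.getBytes(Charset.forName("windows-1252"));
        System.out.println(Arrays.toString(cp1252)); // [63]
    }
}
```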
The class that creates the file now looks like this (following the suggestion to use the Charset class):
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

public class Printer {

    File f;
    FileOutputStream fos;
    Writer w;
    final byte[] utf8_bom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    public Printer(String filename) {
        f = new File(filename);
        try {
            fos = new FileOutputStream(f);
            w = new OutputStreamWriter(fos, Charset.forName("UTF-8"));
            fos.write(utf8_bom);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void print(String s) {
        if (fos != null) {
            try {
                fos.write(s.getBytes());
                fos.flush();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
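As a sanity check (my own diagnostic, not part of the project), this prints the default charset the JVM is actually using, so I can compare the Eclipse run against the 'javaw -jar' run. As far as I can tell, Eclipse passes the project's text-file encoding to launched JVMs via -Dfile.encoding, while plain javaw falls back to the platform default (CP1252 on my Windows):

```java
import java.nio.charset.Charset;

public class EncodingCheck {
    public static void main(String[] args) {
        // file.encoding is the system property the JVM was started with
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        // defaultCharset() is what String.getBytes() uses when no charset is given
        System.out.println("defaultCharset = " + Charset.defaultCharset().name());
    }
}
```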
And all the characters being used are defined like this:
private final char pipe = '\u2502'; /* │ */
private final char line = '\u2500'; /* ─ */
private final char pipeleft = '\u251c'; /* ├ */
private final char piperight = '\u2524'; /* ┤ */
private final char cupleft = '\u250c'; /* ┌ */
private final char cupright = '\u2510'; /* ┐ */
private final char cdownleft = '\u2514'; /* └ */
private final char cdownright = '\u2518'; /* ┘ */
The problem remains: when outputting to a file simply by running the project in Eclipse, the file comes out perfect, but after deploying the project to a JAR and running it, the output file's formatting is destroyed (I've found out that the box-drawing characters are replaced by the '?' char).
I've come to think this is not a problem with the code but with deploying it into a JAR file. I suspect Eclipse is compiling the source files as CP1252 or something, but even replacing all the Unicode characters with their escape-code constants didn't help.
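For completeness, these are the two encoding knobs I know of; I haven't confirmed either is the culprit here, and I've read that -Dfile.encoding, while it works in practice, isn't officially supported:

```shell
# Tell the compiler how the .java sources are encoded
# (only matters for raw literals; \uXXXX escapes are encoding-independent)
javac -encoding UTF-8 Printer.java

# Force the runtime default charset when running the jar
javaw -Dfile.encoding=UTF-8 -jar project.jar
```

The first would rule out a compile-time encoding mismatch, the second a runtime one.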