I have a problem printing a Unicode symbol in the Windows console.
Here's the Java code that prints the symbol:
System.out.print("\u22A2 ");
The problem doesn't occur when I run the program in Eclipse with the encoding set to UTF-8, but in the Windows console the symbol is replaced by a question mark.
I tried the following to overcome the problem, with no success:
Changing the font of the Windows console to Lucida Console.
Changing the code page every time I start the Windows console, i.e. with chcp 65001.
As an extra step, which I tried a few times, running the class with an encoding argument, i.e. java -Dfile.encoding=UTF-8 Filter
(where "Filter" is the name of the class)
By default, the code page used by CMD on Windows is often 437 (the exact value depends on the system locale). You can check it by running this command at the prompt:
C:\>chcp
Active code page: 437
This code page prevents Unicode characters from being displayed properly. You have to change the code page to 65001 and also pass -Dfile.encoding=UTF-8:
C:\>chcp 65001
Active code page: 65001
C:\>java -jar -Dfile.encoding=UTF-8 path/to/your/runnable/jar
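To check whether these settings actually reached the JVM, a small diagnostic like the one below can help. It is only a sketch: the class name EncodingCheck is made up for illustration, and the stdout.encoding property is only defined on newer JDKs, so it may print null.

```java
import java.nio.charset.Charset;

public class EncodingCheck {
    // Builds a short report of the encoding-related settings the JVM sees.
    public static String describe() {
        return "file.encoding   = " + System.getProperty("file.encoding") + "\n"
             + "default charset = " + Charset.defaultCharset().name() + "\n"
             // stdout.encoding is only defined on newer JDKs; expect null elsewhere.
             + "stdout.encoding = " + System.getProperty("stdout.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without -Dfile.encoding=UTF-8 shows whether the flag changes what the JVM reports on your machine.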
In addition to the steps you have taken, you also need a PrintStream/PrintWriter that encodes the printed characters as UTF-8.
Unfortunately, the Java designers chose to open the standard streams with the so-called "default" encoding, which is almost always unusable*) under Windows. Hence, using System.out and System.err naively will make your program's output appear differently depending on where you run it. This goes directly against the goal: compile once, run anywhere.
*) It will be some non-standard "code page" that nobody except Microsoft recognizes. And AFAIK, if for example you have a German keyboard and a "German" OEM Windows and you want date and time in your home time zone, there is just no way to say: but I want UTF-8 input/output in my CMD window. This is one reason why I have my dual-boot Ubuntu running most of the time, where it goes without saying that the terminal does UTF-8.
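The question mark from the question is what an encoder emits when a character has no mapping in the target charset. The sketch below illustrates this with US-ASCII as a stand-in for a legacy code page (an assumption for the demo; the real console code page will differ), next to the UTF-8 bytes for the same symbol. The class name ReplacementDemo and the hex helper are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementDemo {
    // Encodes the text with the given charset and returns the bytes as hex.
    public static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String symbol = "\u22A2";
        // US-ASCII cannot represent U+22A2, so the encoder substitutes '?' (0x3F).
        System.out.println("US-ASCII: " + hex(symbol, StandardCharsets.US_ASCII));
        // UTF-8 encodes the same character as three bytes.
        System.out.println("UTF-8:    " + hex(symbol, StandardCharsets.UTF_8));
    }
}
```

This prints 3F for US-ASCII and E2 8A A2 for UTF-8, which is why the output stream's encoding, and not only the console's code page, has to be UTF-8.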
The following usually works for me in JDK7:
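import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
public static PrintWriter stdout = new PrintWriter(
    new OutputStreamWriter(System.out, StandardCharsets.UTF_8),
    true); // true = auto-flush on println/printf/format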
For Java versions before 7, I replace StandardCharsets.UTF_8 with Charset.forName("UTF-8").
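Put together as a complete program, the approach might look like the sketch below. The class name Utf8Console and the helper utf8Writer are made up for illustration. Note that auto-flush on a PrintWriter only triggers on println, printf and format, so a plain print needs an explicit flush.

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class Utf8Console {
    // Wraps any byte stream in a writer that always encodes as UTF-8,
    // independent of the platform's default charset.
    public static PrintWriter utf8Writer(OutputStream out) {
        return new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8), true);
    }

    public static void main(String[] args) {
        PrintWriter stdout = utf8Writer(System.out);
        stdout.print("\u22A2 ");
        // print() does not auto-flush, so flush before the program exits.
        stdout.flush();
    }
}
```

Whether the symbol then shows up correctly still depends on the console itself, i.e. on chcp 65001 and a font that contains the glyph.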
For the Arabic language I used the following code:
PrintWriter stdout = new PrintWriter(
    new OutputStreamWriter(System.out, StandardCharsets.ISO_8859_1), true);