unicode characters appear as question marks in Int

Posted 2019-01-27 17:12

Question:

I'm trying to write Unicode characters (♠) using System.out, and a question mark gets printed instead.

How can I get proper Unicode characters displayed instead of question marks?

I'm using IntelliJ IDEA on Windows, and trying to print from within the IDE.

Answer 1:

Is the file encoding configured correctly? Check that "Settings | File Encodings" uses UTF-8. Printing ♠ works for me when the IDE encoding and all files are set to UTF-8. Recompiling may be needed after changing the encoding.
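
To tell a source-encoding problem apart from a console problem, a small check along these lines may help. It is only a sketch: the class name SpadeLiteralCheck and the helper describe are made up for illustration. It prints the code points of a string as plain ASCII, so the result does not depend on how the console decodes output.

```java
public class SpadeLiteralCheck {
    // Hypothetical helper: renders each code point as U+XXXX using ASCII only.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // A Unicode escape is read the same way whatever the source file encoding is.
        String escaped = "\u2660";
        System.out.println(describe(escaped)); // prints U+2660
    }
}
```

If you replace the escape with a literal ♠ and the output is no longer U+2660, the compiler most likely read the source file with a different encoding than the one it was saved in, which would explain why recompiling matters.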



Answer 2:

A little update for the year 2015

TL;DR answer:

Go to Settings -> Editor -> File Encodings -> Project Encoding and set it to "UTF-8".

Expanded answer:

The reason why it does not work can be found by placing a breakpoint on a System.out.print() call. When the breakpoint hits, you can add System.out to Watches, and you can see that System.out.textOut.out.se.cs is set to windows-1252 or something similarly unsuitable.

The setting which magically worked for me (I do not know why) is in Settings -> Editor -> File Encodings -> Project Encoding. You need to set that to "UTF-8".

Then, Unicode characters display properly on the console, and one more quick look with the debugger shows that the value of System.out.textOut.out.se.cs has magically turned into UTF-8.

I am saying "magically" because I do not see how and why an editor setting should affect the character set that System.out gets instantiated with when launching/debugging an application. If someone knows what is the logic behind this, please do tell!
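
The same information can be read without the debugger by printing the JVM's default charset and the file.encoding system property. This is a minimal sketch with a made-up class name. A plausible explanation for the "magic" is that the IDE passes the project encoding to the launched JVM as -Dfile.encoding, but that is an assumption here, not something verified against the IDE itself.

```java
import java.nio.charset.Charset;

public class ShowDefaultCharset {
    // Hypothetical helper: one-line summary of the encoding-related JVM settings.
    static String summary() {
        return "defaultCharset=" + Charset.defaultCharset().name()
             + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // The output is ASCII only, so it reads correctly on any console.
        System.out.println(summary());
    }
}
```

Running it before and after changing the project encoding should show whether the setting reaches the JVM. On JDK 18 and later the default charset is UTF-8 unless overridden, and System.out may use a different encoding than the default charset, so results can differ there.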



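Answer 3:

System.out uses the default encoding of the underlying operating system, which is typically windows-1252 (close to, but not the same as, ISO-8859-1) on Western-locale Windows, or UTF-8 on most Linux systems. The former is what I get with the 1252 code page under XP.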

Is this in a CMD.EXE window, or inside an IDE?



Answer 4:

If you ultimately want to print a wide range of Unicode characters on a standard command line on Windows, there is a bit of work involved. The default raster font will not support the characters and applications usually need to call the Unicode console API to render them. Java does not - it will first encode the characters to the native character set (a lossy process) and then emit them using an ANSI call. You can read this blog post if you want the gory details.
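
The lossy encoding step described above can be reproduced in isolation. The following is a small sketch, with a made-up class name and helper, that encodes ♠ with windows-1252 and with UTF-8 and prints the resulting bytes as hex. It only illustrates the encoding step, not the console API calls.

```java
import java.nio.charset.Charset;

public class LossyEncodingDemo {
    // Hypothetical helper: encodes the text with the named charset and returns the bytes as hex.
    static String encodeToHex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String spade = "\u2660";
        // windows-1252 has no mapping for U+2660, so the encoder substitutes '?' (0x3F).
        System.out.println(encodeToHex(spade, "windows-1252")); // prints 3F
        // UTF-8 can represent it, using three bytes.
        System.out.println(encodeToHex(spade, "UTF-8"));        // prints E2 99 A0
    }
}
```

The 3F byte is the question mark from the original question: the character is already lost before anything reaches the console.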



Answer 5:

The settings are probably not configured correctly. Make sure that your file encodings are set to UTF-8, in particular 'Default encoding for properties files'. On my side, the steps below didn't work until I changed that default encoding.

In IntelliJ 2017.3.1 (Community Edition), the settings are under File | Settings | Editor > File Encodings.



Answer 6:

  1. Do what @Mike Nakis said in his answer: go to Settings and change the Project Encoding.

  2. In my case I had to do an additional step:

In the bottom right corner of IntelliJ IDEA, switch the file's encoding to UTF-8.

Note that if you had pasted some text in a different encoding into your class (in my case I pasted a block of Cyrillic text straight from the browser), it will look garbled, but that's OK. Just remove it and put it back after switching to UTF-8.



Answer 7:

Do you have the appropriate fonts installed on your machine? The question mark appears when you don't have fonts for the characters you're outputting.

Also, are you outputting by calling System.out.println("")? If so, just installing the fonts should work.

If you are writing characters to a stream yourself from within your program, that's different. You have to use an OutputStreamWriter, which is a character stream that encodes to bytes with a charset you choose. You can't just write characters to a byte-oriented stream such as OutputStream.

Look up the API reference for OutputStreamWriter and related classes such as PrintWriter. You construct the OutputStreamWriter by passing the charset to its constructor, then wrap it in a PrintWriter. For example,

PrintWriter pw = new PrintWriter(new OutputStreamWriter(System.out, StandardCharsets.UTF_8), true);
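
A fuller sketch of wrapping a byte stream in a UTF-8 character stream might look like the following. The class name Utf8WriterDemo and the helper utf8Writer are made up for illustration. The in-memory buffer is only there to show which bytes get produced. Whether ♠ then shows up correctly still depends on the console decoding UTF-8 and on the font.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class Utf8WriterDemo {
    // Hypothetical helper: wraps any byte stream in an auto-flushing PrintWriter that encodes as UTF-8.
    static PrintWriter utf8Writer(OutputStream out) {
        return new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8), true);
    }

    public static void main(String[] args) {
        // Capture the bytes in memory to check what would be sent to the stream.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintWriter toBuffer = utf8Writer(buffer);
        toBuffer.print("\u2660");
        toBuffer.flush(); // auto-flush only triggers on println, so flush explicitly
        System.out.println(buffer.size() + " bytes"); // prints 3 bytes

        // The same wrapper around System.out; the console must also decode UTF-8.
        PrintWriter pw = utf8Writer(System.out);
        pw.println("\u2660");
    }
}
```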