Output Unicode strings in a Windows console app

Posted 2018-12-31 03:04

Hi, I was trying to output a Unicode string to a console with iostreams and failed.

I found this question, "Using unicode font in c++ console app", and the following snippet from it works:

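SetConsoleOutputCP(CP_UTF8); // the console now interprets output bytes as UTF-8
wchar_t s[] = L"èéøÞǽлљΣæča";
// the first call only returns the required buffer size (including the terminating NUL)
int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char* m = new char[bufferSize];
WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
wprintf(L"%S", m);
delete[] m; // release the conversion buffer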

However, I did not find any way to output Unicode correctly with iostreams. Any suggestions?

This does not work:

SetConsoleOutputCP(CP_UTF8);
utf8_locale = locale(old_locale,new boost::program_options::detail::utf8_codecvt_facet());
wcout.imbue(utf8_locale);
wcout << L"¡Hola!" << endl;

EDIT: I could not find any solution other than wrapping this snippet in a stream operator. I hope somebody has a better idea.

//Unicode output for a Windows console 
ostream &operator-(ostream &stream, const wchar_t *s) 
{ 
    int bufSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
    char *buf = new char[bufSize];
    WideCharToMultiByte(CP_UTF8, 0, s, -1, buf, bufSize, NULL, NULL);
    wprintf(L"%S", buf);
    delete[] buf; 
    return stream; 
} 

ostream &operator-(ostream &stream, const wstring &s) 
{ 
    stream - s.c_str();
    return stream; 
} 

10 answers
与风俱净
#2 · 2018-12-31 03:46

First, sorry: I probably don't have the required fonts, so I cannot test this yet.

Something looks a bit fishy here:

// the following is said to be working
SetConsoleOutputCP(CP_UTF8); // output is in UTF8
wchar_t s[] = L"èéøÞǽлљΣæča";
int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char* m = new char[bufferSize]; 
WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
wprintf(L"%S", m); // <-- upper case %S in wprintf() is used for MultiByte/utf-8
                   //     lower case %s in wprintf() is used for WideChar
printf("%s", m); // <-- does this work as well? try it to verify my assumption

while

// the following is said to have a problem
SetConsoleOutputCP(CP_UTF8);
utf8_locale = locale(old_locale,
                     new boost::program_options::detail::utf8_codecvt_facet());
wcout.imbue(utf8_locale);
wcout << L"¡Hola!" << endl; // <-- you are passing wide char.
// have you tried passing the multibyte equivalent by converting to utf8 first?
int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char* m = new char[bufferSize]; 
WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
cout << m << endl;

What about:

// without setting locale to UTF8, you pass WideChars
wcout << L"¡Hola!" << endl;
// set the console output code page to UTF-8 and use cout
SetConsoleOutputCP(CP_UTF8);
cout << utf8_encoded_by_converting_using_WideCharToMultiByte << endl;
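
The last suggestion (convert with WideCharToMultiByte, switch the console to CP_UTF8, then write through cout) could be sketched roughly as below. The helper names to_utf8 and print_utf8 are made up for illustration and are not part of any library. The non-Windows branch is there only so the sketch compiles elsewhere, and it assumes wchar_t holds one whole code point per element. Whether cout delivers the bytes to the console intact may depend on the C runtime and console version, so this is something to try rather than a guaranteed fix.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: convert a wide string to a UTF-8 encoded std::string.
std::string to_utf8(const std::wstring& ws)
{
    if (ws.empty()) return std::string();
#ifdef _WIN32
    // Pass an explicit length instead of -1 so no terminating NUL is written.
    const int len = static_cast<int>(ws.size());
    const int bytes = WideCharToMultiByte(CP_UTF8, 0, ws.data(), len,
                                          NULL, 0, NULL, NULL);
    if (bytes <= 0) return std::string();
    std::string out(static_cast<std::size_t>(bytes), '\0');
    WideCharToMultiByte(CP_UTF8, 0, ws.data(), len, &out[0], bytes, NULL, NULL);
    return out;
#else
    // Assumption: wchar_t is 32 bits wide and each element is one code point.
    std::string out;
    for (std::size_t i = 0; i < ws.size(); ++i) {
        const unsigned long cp = static_cast<unsigned long>(ws[i]);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
#endif
}

// Hypothetical helper: switch the console to UTF-8, then write narrow bytes.
void print_utf8(const std::wstring& ws)
{
#ifdef _WIN32
    SetConsoleOutputCP(CP_UTF8);
#endif
    std::cout << to_utf8(ws) << std::endl;
}
```

Calling print_utf8(L"¡Hola!") from main would then be the iostream equivalent of the wprintf snippet. Returning a std::string also avoids the manual new[] buffer used in the snippets above.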
骚的不知所云
#3 · 2018-12-31 03:47

I don't think there is an easy answer. Looking at "Console Code Pages" and "SetConsoleCP Function", it seems that you will need to set up an appropriate code page for the character set you're going to output.

余生请多指教
#4 · 2018-12-31 03:48

Recently I wanted to stream Unicode from Python to the Windows console, and here is the minimum I needed to do:

  • Set the console font to one that covers the Unicode symbols you need. There is not a wide choice: Console properties > Font > Lucida Console.
  • Change the current console code page: run chcp 65001 in the console, or use the corresponding call in the C++ code.
  • Write to the console using WriteConsoleW.
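
For the last step, a minimal sketch of calling WriteConsoleW directly from C++ might look like this. The function name write_console is invented here. WriteConsoleW only works on a real console handle, so the sketch first checks the handle with GetConsoleMode and reports failure when output is redirected to a file or pipe. On non-Windows platforms it compiles to a stub that always returns false.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: write UTF-16 text straight to the attached console.
// Returns true only if every character was written to a real console.
bool write_console(const std::wstring& text)
{
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (out == NULL || out == INVALID_HANDLE_VALUE) return false;

    // GetConsoleMode fails for files and pipes, which WriteConsoleW cannot handle.
    DWORD mode = 0;
    if (!GetConsoleMode(out, &mode)) return false;

    DWORD written = 0;
    if (!WriteConsoleW(out, text.c_str(), static_cast<DWORD>(text.size()),
                       &written, NULL)) {
        return false;
    }
    return written == text.size();
#else
    (void)text;   // nothing to do without the Windows console API
    return false;
#endif
}
```

When write_console returns false, a caller could fall back to writing UTF-8 bytes to stdout, which also keeps redirected output usable.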

Look through an interesting article about Java Unicode on the Windows console.

Besides, in Python you cannot write to the default sys.stdout in this case; you need to replace it with something that uses os.write(1, binarystring) or calls a wrapper around WriteConsoleW directly. It seems that in C++ you will need to do the same.

余欢
#5 · 2018-12-31 03:48

I had a similar problem. "Output Unicode to console Using C++, in Windows" contains the gem that you need to run chcp 65001 in the console before running your program.

There may be some way of doing this programmatically, but I don't know what it is.
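
One way to do this from code is the SetConsoleOutputCP call already used in the question, since 65001 is the numeric value of CP_UTF8. The sketch below wraps it in a small guard that restores the previous code page on exit. The names ConsoleUtf8Guard and kUtf8CodePage are invented for illustration. Note that chcp changes both the input and output code pages, while this sketch only touches output. Outside Windows the guard does nothing.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// 65001 is the UTF-8 code page, the same number passed to "chcp 65001".
const unsigned int kUtf8CodePage = 65001;

// Hypothetical RAII guard: switch console output to UTF-8 for its lifetime.
class ConsoleUtf8Guard {
public:
    ConsoleUtf8Guard() : old_cp_(0), changed_(false) {
#ifdef _WIN32
        old_cp_ = GetConsoleOutputCP();                      // remember current page
        changed_ = SetConsoleOutputCP(kUtf8CodePage) != 0;   // nonzero means success
#endif
    }

    ~ConsoleUtf8Guard() {
#ifdef _WIN32
        if (changed_) SetConsoleOutputCP(old_cp_);           // restore on scope exit
#endif
    }

    // True if the code page was actually switched.
    bool active() const { return changed_; }

private:
    ConsoleUtf8Guard(const ConsoleUtf8Guard&);               // not copyable
    ConsoleUtf8Guard& operator=(const ConsoleUtf8Guard&);

    unsigned int old_cp_;
    bool changed_;
};
```

Declaring a ConsoleUtf8Guard at the top of main would keep the console in UTF-8 for the rest of the program and put the old code page back when main returns.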
