C++: wide characters outputting incorrectly?

Posted 2019-02-07 00:33

Question:

My code is basically this:

wstring japan = L"日本";
wstring message = L"Welcome! Japan is ";

message += japan;

wprintf(message.c_str());

I want to use wide strings, but I don't know how they are meant to be output, so I used wprintf. When I run something such as:

./widestr | hexdump

the hex dump looks like this:

65 57 63 6c 6d 6f 21 65 4a 20 70 61 6e 61 69 20 20 73 3f 3f
e  W  c  l  m  o  !  e  J     p  a  n  a  i        s  ?  ?

Why are they all jumbled out of order? Even if wprintf is the wrong call, I still don't see why it would output in such a specific jumbled order!

Edit: Is it endianness or something? Each pair of characters seems to be swapped. Huh.

EDIT 2: I tried using wcout, but it outputs the exact same hexadecimal values. Weird!

Answer 1:

You need to set the locale:

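    #include <cwchar>     // wprintf
    #include <string>
    #include <locale>
    #include <iostream>

    using namespace std;

    int main()
    {
            // Adopt the locale from the environment (e.g. a UTF-8 one) instead of
            // the default "C" locale, so wide characters can be converted on output.
            std::locale::global(std::locale(""));
            wstring japan = L"日本";
            wstring message = L"Welcome! Japan is ";

            message += japan;

            // Pass the text as an argument rather than as the format string.
            wprintf(L"%ls\n", message.c_str());
            wcout << message << endl;
    }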

This works as expected: the wide string is converted to narrow UTF-8 and printed.

Setting the global locale to "" selects the system locale taken from the environment. If that locale uses UTF-8, the wstring is converted to UTF-8 when it is written out.
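
To see why the locale matters, here is a minimal sketch (not part of the original answer; `to_narrow` is a hypothetical helper name) that performs the same wide-to-multibyte conversion explicitly with `std::wcsrtombs`. It assumes that the default "C" locale cannot represent 日本, which is consistent with the `? ?` at the end of the dump in the question.

```cpp
#include <clocale>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper, for illustration only: converts a wide string to the
// multibyte encoding of the current C locale. Returns false if some character
// cannot be represented in that encoding.
bool to_narrow(const std::wstring& wide, std::string& out)
{
    // First pass: a null destination makes wcsrtombs report the number of
    // bytes required, or (size_t)-1 if a character is not representable.
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide.c_str();
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        return false;

    // Second pass: perform the conversion into a buffer of exactly that size.
    out.assign(needed, '\0');
    src = wide.c_str();
    state = std::mbstate_t();
    std::wcsrtombs(&out[0], &src, needed, &state);
    return true;
}
```

With the locale left at "C", `to_narrow(L"Welcome!", out)` should succeed while `to_narrow(L"日本", out)` should fail. After `std::setlocale(LC_ALL, "")` in a UTF-8 environment, the second call should succeed and `out` should hold the UTF-8 bytes.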

Edit: Ignore what I said about sync_with_stdio; that was incorrect. The C and C++ streams are synchronized by default, so the call is not needed.
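
On the jumbled order in the question: assuming `hexdump` was run without options on a little-endian machine, it groups the input into two-byte words and prints each word as a number, so every pair of bytes appears swapped. The bytes written by the program are in the right order. Only 日本 came out as `??`. The sketch below (`dump_as_le_words` is a hypothetical name, not a real tool's API) imitates that grouping to show the effect.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only: formats bytes as 16-bit
// little-endian words. In each pair, the second byte of the stream is the
// high byte of the word and is therefore printed first.
std::string dump_as_le_words(const std::string& bytes)
{
    std::string out;
    char buf[8];
    for (std::size_t i = 0; i < bytes.size(); i += 2) {
        const unsigned lo = static_cast<unsigned char>(bytes[i]);
        // A trailing odd byte is paired with zero.
        const unsigned hi = (i + 1 < bytes.size())
            ? static_cast<unsigned char>(bytes[i + 1])
            : 0u;
        std::snprintf(buf, sizeof buf, "%02x%02x", hi, lo);
        if (!out.empty())
            out += ' ';
        out += buf;
    }
    return out;
}
```

For the input "Welc" this yields `6557 636c`, matching the start of the dump in the question. Running `hexdump -C` or `od -c` instead prints the bytes in stream order.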