Unicode string literals in C# vs C++/CLI

Posted 2020-04-30 02:22

Question:

C#:
char z = '\u201D';
int i = (int)z;

C++/CLI:
wchar_t z = '\u201D';
int i = (int)z;

In C#, "i" becomes 8221 (0x201D), just as I expect. In C++/CLI, on the other hand, it becomes 65428 (0xFF94). Can some kind soul explain this to me?

EDIT: The size of wchar_t cannot be the issue here, because:

C++/CLI:
wchar_t z = (wchar_t)8221;
int i = (int)z;

Here too, i becomes 8221, so wchar_t is indeed up to the task of holding a 16-bit integer on my system. Ekeforshus

Answer 1:

You want:

wchar_t z = L'\x201D';

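The L prefix is what matters: it makes this a wide character literal, so L'\u201D' works as well. Without the prefix, '\u201D' is a narrow char literal, and U+201D does not fit in a char.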
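
One plausible reading of the 65428 in the question, which the thread does not confirm: the compiler maps the narrow literal to the single byte 0x94 (the Windows-1252 encoding of U+201D), treats it as a signed char with value -108, and the conversion to a 16-bit unsigned wchar_t wraps that to 0xFF94. The sketch below reproduces that arithmetic in portable C++ under those assumptions. The helper widen_signed_byte and the constant kWideQuote are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): widen a byte the way a signed
// 8-bit char is converted to a 16-bit unsigned wchar_t.
constexpr std::uint16_t widen_signed_byte(unsigned char byte) {
    return static_cast<std::uint16_t>(static_cast<signed char>(byte));
}

// Assumption: U+201D was stored as byte 0x94 (its Windows-1252 encoding).
// As a signed char that is -108, which wraps to 0xFF94 in 16 bits.
static_assert(widen_signed_byte(0x94) == 0xFF94, "0x94 widens to 65428");

// With the L prefix the literal is a wchar_t holding the code point itself.
constexpr wchar_t kWideQuote = L'\u201D';
static_assert(kWideQuote == 0x201D, "wide literal keeps the value 8221");
```

Both checks run at compile time, so the file compiles only if the values match the ones reported in the question.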



Answer 2:

According to Wikipedia:

"The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers."

You shouldn't make any assumptions about how it's implemented.
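
Rather than assuming a width, the code can state its requirement and let the compiler check it. This is a minimal sketch assuming a C++11 compiler. The names kWcharBits and kQuote are hypothetical, and char16_t is shown only as one possible type that is guaranteed to be at least 16 bits wide.

```cpp
#include <climits>
#include <cstddef>

// Width of wchar_t on this compiler, in bits (16 with MSVC, often 32 elsewhere).
constexpr std::size_t kWcharBits = sizeof(wchar_t) * CHAR_BIT;

// Fail the build instead of silently truncating if wchar_t is too narrow.
static_assert(kWcharBits >= 16, "wchar_t cannot hold U+201D on this compiler");

// char16_t is at least 16 bits wide on every conforming C++11 compiler.
constexpr char16_t kQuote = u'\u201D';
static_assert(kQuote == 0x201D, "char16_t literal holds the code point");
```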