I want to convert a wstring to a u16string in C++.
I can convert a wstring to a string, or the reverse, but I don't know how to convert it to a u16string.
u16string CTextConverter::convertWstring2U16(wstring str)
{
    int iSize;
    u16string szDest[256] = {};
    memset(szDest, 0, 256);
    iSize = WideCharToMultiByte(CP_UTF8, NULL, str.c_str(), -1, NULL, 0,0,0);
    WideCharToMultiByte(CP_UTF8, NULL, str.c_str(), -1, szDest, iSize,0,0);
    u16string s16 = szDest;
    return s16;
}
The error is at the szDest argument of WideCharToMultiByte(CP_UTF8, NULL, str.c_str(), -1, szDest, iSize,0,0); because a u16string cannot be used as an LPSTR.
How can I fix this code?
For a platform-independent solution see this answer.
If you need a solution only for the Windows platform, the following code will be sufficient:
On the Windows platform, a std::wstring is interchangeable with std::u16string because sizeof(wstring::value_type) == sizeof(u16string::value_type) and both are UTF-16 (little endian) encoded.
The only difference is that wchar_t and char16_t are distinct types (both are unsigned 16-bit types on Windows), so you only have to convert each character, which can be done with the u16string constructor that takes an iterator pair as arguments. This constructor will implicitly convert wchar_t to char16_t.
Full example console application:
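The original listing is not reproduced here, so what follows is only a minimal sketch of the iterator-pair approach described above. It is written as a free function rather than the asker's CTextConverter member, and it assumes that wchar_t holds UTF-16 code units, which is the case on Windows.

```cpp
#include <string>

// Sketch: element-wise copy from wchar_t to char16_t.
// Assumption: wchar_t holds UTF-16 code units (true on Windows, where
// sizeof(wchar_t) == 2). With a 32-bit wchar_t, characters outside the
// Basic Multilingual Plane would be truncated by this approach.
std::u16string convertWstring2U16(const std::wstring& str)
{
    // The iterator-pair constructor converts each wchar_t to char16_t.
    return std::u16string(str.begin(), str.end());
}
```

Calling convertWstring2U16(L"hello") should give a u16string equal to u"hello". No WideCharToMultiByte call is needed, because nothing is re-encoded: the code units are copied as they are.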
Update
I had thought the standard version did not work, but in fact this was simply due to bugs in the Visual C++ and libstdc++ 3.4.21 runtime libraries. It does work with clang++ -std=c++14 -stdlib=libc++. Here is a version that tests whether the standard method works on your compiler:
Previous
A bit late to the game, but here's a version that additionally checks whether wchar_t is 32 bits wide (as it is on Linux) and, if so, performs surrogate-pair conversion. I recommend saving this source as UTF-8 with a BOM. Here is a link to it on ideone.
Footnote
Since someone asked: The reason I suggest UTF-8 with BOM is that some compilers, including MSVC 2015, will assume a source file is encoded according to the current code page unless there is a BOM or you specify an encoding on the command line. No encoding works on all toolchains, unfortunately, but every tool I’ve used that’s modern enough to support C++14 also understands the BOM.
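For reference, the surrogate-pair conversion mentioned in the answer above can be sketched roughly as follows. This is not the linked ideone source. The function name wstringToU16 is made up for illustration, and the sketch assumes that a 16-bit wchar_t already holds UTF-16 while a 32-bit wchar_t holds UTF-32 code points.

```cpp
#include <string>

// Sketch: convert a wide string to UTF-16 for either width of wchar_t.
// Assumptions: a 16-bit wchar_t holds UTF-16 code units, and a 32-bit
// wchar_t holds UTF-32 code points. Lone surrogates are not checked.
std::u16string wstringToU16(const std::wstring& ws)
{
    std::u16string out;
    out.reserve(ws.size());
    for (wchar_t wc : ws) {
        const char32_t cp = static_cast<char32_t>(wc);
        if (cp < 0x10000) {
            // Fits in one UTF-16 code unit. With a 16-bit wchar_t every
            // value takes this branch, so existing surrogates pass through.
            out.push_back(static_cast<char16_t>(cp));
        } else if (cp <= 0x10FFFF) {
            // Supplementary plane: split into a high and a low surrogate.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        } else {
            // Not a valid code point: emit U+FFFD REPLACEMENT CHARACTER.
            out.push_back(static_cast<char16_t>(0xFFFD));
        }
    }
    return out;
}
```

With a 32-bit wchar_t, a character such as U+1F600 is stored as one element and comes out as the two code units 0xD83D 0xDE00. With a 16-bit wchar_t the input already contains that pair, and it is copied unchanged.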