I need to convert UTF-16 text to UTF-8. The actual conversion code is simple:
std::wstring in(...);
std::string out = boost::locale::conv::utf_to_utf<char, wchar_t>(in);
However, the issue is that the UTF-16 is read from a file, and it may or may not contain a BOM. My code needs to be portable (at minimum Windows/OS X/Linux). I'm really struggling to figure out how to create a wstring from the byte sequence.
EDIT: this is not a duplicate of the linked question, as in that question the OP needs to convert a wide string into an array of bytes - and I need to convert the other way around.
You should not use wide types at all in your case.
Assuming you can get a char * from your vector<char>, you can stick to bytes by using boost::locale::conv::between. between operates on 8-bit characters and allows you to avoid conversion to 16-bit characters altogether.
It is necessary to use the between overload that takes a pointer to the buffer's end, because by default between treats its input as NUL-terminated and stops at the first '\0' character in the string. With UTF-16 input that happens almost immediately, since every ASCII character is encoded with a zero byte.
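To see why a NUL-terminated interface gives up so early, here is a small standard-library-only illustration. The function names are made up for this example; it just measures how many bytes a C-string API would see in "Hi" encoded as UTF-16LE without a BOM:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// "Hi" encoded as UTF-16LE, no BOM: each ASCII character is followed by a zero byte.
std::vector<char> sample_utf16le_hi() {
    return {'H', '\0', 'i', '\0'};
}

// Number of bytes a NUL-terminated API would read before stopping.
std::size_t bytes_seen_as_c_string(const std::vector<char>& bytes) {
    // Work on a copy with a terminator appended so strlen is well-defined
    // even when the input contains no zero byte at all.
    std::vector<char> copy(bytes);
    copy.push_back('\0');
    return std::strlen(copy.data());
}
```

For the sample buffer this reports a single byte out of four, so a conversion that relies on the terminator would only ever see half of the first character.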