Convert Unicode to char

Posted 2019-02-20 03:24

Question:

How can I convert a Unicode string to a char* or char* const in Embarcadero C++?

Answer 1:

"Unicode string" really isn't specific enough to tell what your source data is, but you probably mean a UTF-16 string stored as a wchar_t array, since that is what the term is most often used for.

"char*" also isn't enough to know what you want to target, although maybe "embarcadero" has some convention. I'll just assume you want UTF-8 data unless you mention otherwise.

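Also, I'll limit my example to what works in VS2010:

#include <codecvt>  // std::codecvt_utf8_utf16
#include <locale>   // std::wstring_convert
#include <string>

// your "Unicode" string
wchar_t const * utf16_string = L"Hello, World!";

// converter between UTF-16 (held in wchar_t) and UTF-8
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>,wchar_t> convert;

std::string utf8_string = convert.to_bytes(utf16_string);

// utf8_string.c_str() gives a char const* that stays valid while utf8_string is alive and unmodified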

This assumes that wchar_t strings are UTF-16, as is the case on Windows; otherwise the code is portable. Note that std::wstring_convert and the <codecvt> facets were deprecated in C++17.
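If relying on wchar_t being UTF-16 is a concern, the conversion can also be written by hand against char16_t. The following is only a minimal sketch of that idea, not part of the original answer: the helper names append_utf8 and utf16_to_utf8 are hypothetical, and the choice to throw on unpaired surrogates is an assumption (a real application might prefer to substitute U+FFFD instead).

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: append one Unicode code point to out as UTF-8.
static void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert a UTF-16 string to UTF-8.
// Assumption: unpaired surrogates are treated as errors and throw.
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {            // high surrogate
            if (i + 1 >= in.size())
                throw std::invalid_argument("truncated surrogate pair");
            std::uint32_t lo = in[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            // combine the pair into one code point above U+FFFF
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {     // low surrogate alone
            throw std::invalid_argument("unpaired low surrogate");
        }
        append_utf8(out, cp);
    }
    return out;
}
```

A call such as utf16_to_utf8(u"Hello, World!") returns a std::string whose c_str() can be passed wherever a char const* holding UTF-8 is expected, for as long as that std::string is alive.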



Answer 2:

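String text = "Hello world";
AnsiString ansi(text);     // keep the converted string in a named object
char *txt = ansi.c_str();  // valid only while ansi exists and is not modified

The older text.t_str() is now written as AnsiString(text).c_str(). Do not store the pointer returned from a temporary AnsiString: the temporary is destroyed at the end of the statement, which leaves the pointer dangling.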


Answer 3:

You can legally reinterpret any array as an array of char (or unsigned char). So if your Unicode data comes in 4-byte code units like

char32_t data[100];

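then you can access it as an array of bytes:

// #include <cstddef> and <cstdio>
// unsigned char avoids sign extension when bytes >= 0x80 are printed
unsigned char const * p = reinterpret_cast<unsigned char const*>(data);

for (std::size_t i = 0; i != sizeof data; ++i)
{
    std::printf("Byte %03zu is 0x%02X.\n", i, static_cast<unsigned>(p[i]));
}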

That way, you can examine the individual bytes of your Unicode data one by one.
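The same byte-wise access can be wrapped in a small function that returns the dump as a string instead of printing it. This is only an illustrative sketch: hex_bytes is a hypothetical name, not something from the original answer, and it reads through unsigned char so that bytes of 0x80 and above are not sign-extended.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format the bytes of any object as space-separated
// uppercase hex pairs, e.g. "48 00 FF".
std::string hex_bytes(const void* data, std::size_t size)
{
    // Reading an object's storage through unsigned char is permitted.
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::string out;
    char buf[4];  // two hex digits plus the terminating null, with room to spare
    for (std::size_t i = 0; i != size; ++i) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(p[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

For the array above, hex_bytes(data, sizeof data) yields one hex pair per byte. The order of the bytes within each char32_t depends on the endianness of the machine.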

(That has of course nothing to do with converting the encoding of your text. For that, use a library like iconv or ICU.)



Answer 4:

If you work with Windows:

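// #include <windows.h>, <iostream> and <string>
std::u16string utext = u"объява";
char text[0x100];
// wchar_t is 16 bits wide on Windows, so the u16string buffer can be passed as wchar_t const*
// the flags argument is 0, and the output size is the size of the buffer in bytes
WideCharToMultiByte(CP_UTF8, 0, reinterpret_cast<const wchar_t*>(utext.c_str()), -1, text, sizeof text, NULL, NULL);
std::cout << text;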

We can't use std::wstring_convert here because it is not available in MinGW 4.9.2.