Converting C-Strings from Local Encoding to UTF-8

Posted 2019-07-18 05:13

I'm writing a small app in which I read some text from the console, which is then stored in a classic char* string.
As it happens, I need to pass it to a library which only takes UTF-8 encoded strings. Since the Windows console uses the local encoding, I need to convert from the local encoding to UTF-8.
If I'm not mistaken, I could use MultiByteToWideChar(..) to convert to UTF-16 and then use WideCharToMultiByte(..) to convert to UTF-8.

However, I wonder if there is a way to convert directly from the local encoding to UTF-8 without using any external libraries, since converting to wchar_t just to be able to convert back to char (UTF-8 encoded, but still) seems kind of weird to me.

2 Answers
Ridiculous、
#2 · 2019-07-18 05:32

The POSIX world loves the iconv library for just that. It converts between virtually every encoding around, working directly on char* buffers.
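As a rough illustration of the iconv route, here is a minimal sketch. The helper name to_utf8 and its parameters are made up for this example, and the caller is assumed to know the name of the source encoding (for instance "CP1252" or "ISO-8859-1"). On some platforms iconv lives in a separate libiconv that has to be linked explicitly.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `in`, encoded as
 * `from_code`, to UTF-8 in `out`. Returns 0 on success, -1 on failure
 * (unknown encoding, invalid input, or output buffer too small). */
int to_utf8(const char *from_code, const char *in, char *out, size_t out_size)
{
    if (out_size == 0)
        return -1;

    /* iconv_open takes the target encoding first, then the source. */
    iconv_t cd = iconv_open("UTF-8", from_code);
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;          /* iconv does not modify the input bytes */
    size_t src_left = strlen(in);
    char *dst = out;
    size_t dst_left = out_size - 1;  /* keep one byte for the terminator */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *dst = '\0';
    return 0;
}
```

Called as to_utf8("ISO-8859-1", "caf\xE9", buf, sizeof buf), this should leave the bytes 63 61 66 C3 A9 in buf.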

疯言疯语
#3 · 2019-07-18 05:37

Converting from UTF-16 to UTF-8 is purely a mechanical process, but converting from a local encoding to UTF-16 or UTF-8 involves some large, specialized lookup tables. The C runtime just turns around and calls WideCharToMultiByte and MultiByteToWideChar for non-trivial cases.
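To show what "purely mechanical" means here, the following is a minimal sketch of a UTF-16 to UTF-8 encoder that needs no lookup tables, only bit shifts. The function name utf16_to_utf8 and its signature are invented for this example; it is not how any particular library implements the conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode `in_len` UTF-16 code units as UTF-8 into `out`.
 * Returns the number of bytes written (excluding the terminator), or
 * (size_t)-1 on an unpaired surrogate or if `out` is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t in_len, char *out, size_t out_size)
{
    size_t n = 0;
    if (out_size == 0)
        return (size_t)-1;

    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= in_len || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;   /* low surrogate without a preceding high one */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (need >= out_size - n)  /* always leave room for the terminator */
            return (size_t)-1;

        if (need == 1) {
            out[n++] = (char)cp;
        } else if (need == 2) {
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

The local-encoding step has no equivalent shortcut, because each code page maps its bytes to Unicode through its own table.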

As for having to use UTF-16 as an intermediate stage, as far as I know, there isn't any way around that - sorry.
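For reference, the two-step route through UTF-16 could look roughly like the sketch below. The helper name local_to_utf8 is made up for this example, and the code page is left as a parameter on the assumption that the caller knows which one applies (for console input that might be the value returned by GetConsoleCP(), otherwise CP_ACP). The #ifdef is only there so the snippet compiles to nothing on other platforms.

```c
#ifdef _WIN32
#include <windows.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd, NUL-terminated UTF-8 copy of `in`
 * (encoded in `code_page`), or NULL on failure. The caller frees the result. */
char *local_to_utf8(UINT code_page, const char *in)
{
    /* Step 1: local encoding -> UTF-16. A length of -1 means "up to and
     * including the terminator"; a NULL buffer asks for the required size. */
    int wlen = MultiByteToWideChar(code_page, 0, in, -1, NULL, 0);
    if (wlen == 0)
        return NULL;
    wchar_t *wide = malloc((size_t)wlen * sizeof *wide);
    if (wide == NULL)
        return NULL;
    if (MultiByteToWideChar(code_page, 0, in, -1, wide, wlen) == 0) {
        free(wide);
        return NULL;
    }

    /* Step 2: UTF-16 -> UTF-8. The last two arguments must be NULL
     * when the target code page is CP_UTF8. */
    int ulen = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    char *utf8 = ulen > 0 ? malloc((size_t)ulen) : NULL;
    if (utf8 != NULL &&
        WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, ulen, NULL, NULL) == 0) {
        free(utf8);
        utf8 = NULL;
    }
    free(wide);
    return utf8;
}
#endif
```

The temporary wide buffer is the only real cost of the detour, and it is freed before the function returns.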

Since you are already linking to an external library to get file input, you might as well link to the same library to get WideCharToMultiByte and MultiByteToWideChar.

Using the C runtime will make your code portable to other operating systems (in theory), but it also adds a layer of overhead between you and the library that does all of the real work in this case: kernel32.dll.
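A minimal sketch of the C runtime variant of the first step, using mbstowcs, might look like this. The helper name local_to_wide is invented for the example, and it assumes the program has already selected the user's locale, for instance with setlocale(LC_ALL, "") at startup. Calling mbstowcs with a NULL destination to get the required length is supported by POSIX and by the Microsoft runtime, but it is an extension to standard C.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert `in` from the current locale's encoding
 * (LC_CTYPE) to a malloc'd wide string. Returns NULL on failure.
 * The caller frees the result. */
wchar_t *local_to_wide(const char *in)
{
    /* With a NULL destination, mbstowcs only counts the wide characters. */
    size_t len = mbstowcs(NULL, in, 0);
    if (len == (size_t)-1)
        return NULL;                 /* invalid multibyte sequence */

    wchar_t *wide = malloc((len + 1) * sizeof *wide);
    if (wide == NULL)
        return NULL;

    if (mbstowcs(wide, in, len + 1) == (size_t)-1) {
        free(wide);
        return NULL;
    }
    return wide;
}
```

One reason the portability is only "in theory": wchar_t holds UTF-16 code units on Windows but is 32 bits wide on most POSIX systems, so the follow-up conversion to UTF-8 cannot be shared unchanged.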
