Unicode to UTF-8 in C++

Posted 2019-03-31 05:03

I searched a lot, but couldn't find anything:

unsigned int unicodeChar = 0x5e9;
unsigned int utf8Char;
uni2utf8(unicodeChar, utf8Char);
assert(utf8Char == 0xd7a9);

Is there a library (preferably boost) that implements something similar to uni2utf8?

4 answers
We Are One
#2 · 2019-03-31 05:11

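Unicode conversions are part of the standard library as of C++11 (note that std::wstring_convert and <codecvt> were later deprecated in C++17):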

#include <codecvt>
#include <locale>
#include <string>
#include <cassert>

int main() {
  std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> convert;
  std::string utf8 = convert.to_bytes(0x5e9);
  assert(utf8.length() == 2);
  assert(utf8[0] == '\xD7');
  assert(utf8[1] == '\xA9');
}
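
If the deprecated <codecvt> facilities are not an option, the encoding can also be done by hand, since UTF-8 is just a fixed bit layout over one to four bytes. The following is a minimal sketch, not a library API: append_utf8 is a name made up for illustration, and uni2utf8 is a hypothetical helper that mirrors the call in the question by packing the resulting bytes, most significant first, into an unsigned int (assumed to be at least 32 bits wide).

```cpp
#include <cassert>
#include <string>

// Appends the UTF-8 encoding of one code point to `out`.
// Returns false (appending nothing) for surrogates and values above U+10FFFF.
bool append_utf8(char32_t cp, std::string& out) {
  if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogates are not encodable
  if (cp < 0x80) {                                 // 1 byte: 0xxxxxxx
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {                         // 2 bytes: 110xxxxx 10xxxxxx
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {                       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp <= 0x10FFFF) {                     // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    return false;                                  // beyond the Unicode range
  }
  return true;
}

// Hypothetical helper shaped like the call in the question: packs the UTF-8
// bytes into `utf8Char`, first byte in the most significant position.
bool uni2utf8(unsigned int unicodeChar, unsigned int& utf8Char) {
  std::string bytes;
  if (!append_utf8(static_cast<char32_t>(unicodeChar), bytes)) return false;
  assert(bytes.size() <= sizeof(unsigned int));    // at most 4 bytes must fit
  utf8Char = 0;
  for (unsigned char b : bytes) utf8Char = (utf8Char << 8) | b;
  return true;
}
```

With this sketch, uni2utf8(0x5e9, utf8Char) sets utf8Char to 0xd7a9, as the question expects. Returning the bytes in a std::string is usually the more practical interface; the packed integer is shown only because the question asked for it.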
等我变得足够好
#3 · 2019-03-31 05:14

Boost.Locale also has functions for encoding conversions:

#include <boost/locale.hpp>
#include <cassert>
#include <string>

int main() {
  unsigned int point = 0x5e9;
  std::string utf8 = boost::locale::conv::utf_to_utf<char>(&point, &point + 1);
  assert(utf8.length() == 2);
  assert(utf8[0] == '\xD7');
  assert(utf8[1] == '\xA9');
}
唯我独甜
#4 · 2019-03-31 05:17

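You might want to give the UTF8-CPP library a try. Encoding a Unicode character with it would look like this:

#include <iterator>
#include <string>
#include "utf8.h"  // UTF8-CPP header

std::string utf8Char;
// utf8::append writes the UTF-8 encoding of a single code point to an output iterator
utf8::append(0x5e9, std::back_inserter(utf8Char));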

std::string is used here just as a container for UTF-8 bytes.

The star\"
#5 · 2019-03-31 05:27

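Use sprintf. (: Note that the %ls conversion encodes wide characters according to the current locale, so the result is UTF-8 only when a UTF-8 locale is active.

char cstring[64];
std::snprintf(cstring, sizeof cstring, "%ls", unicodestring);  // unicodestring is a const wchar_t*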
