I'm looking for a collection of functions for performing UTF character conversion in C++11. It should include conversion to and from any of UTF-8, UTF-16, and UTF-32. A function for recognizing byte order marks would be helpful, too.
Here's my UTF-8 code from Baby X (https://github.com/MalcolmMcLean/babyx)
I've written a little utf_ranges library for doing just this. It uses Range-V3 and C++14.
It has both views and actions (if you're familiar with Range-V3 terminology) for converting between any of the three main UTF encodings, can consume and generate byte order marks, and can perform endian conversion based on a BOM. For example, reading a file from unknown-endian UTF-16 into a UTF-8 std::string, converting any of the seven Unicode line endings to \n, looks like this:
Update: The functions listed here are maintained in a GitHub repo, .hpp, .cpp and tests. Some UTF-16 functions have been disabled because they do not work correctly. The "banana" tests in the utf.test.cpp file demonstrate the problem.
Also included is a "read_with_bom" function for recognizing byte order marks.
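For readers who only need the byte order mark part, recognizing a BOM comes down to comparing the first few bytes of the input against five known signatures. The following is a minimal C++11 sketch under that assumption. It is not the read_with_bom function from the repo, and the names Bom, BomInfo and detect_bom are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not taken from any library mentioned here.
enum class Bom { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

struct BomInfo {
    Bom kind;            // which encoding the BOM indicates
    std::size_t length;  // number of bytes occupied by the BOM (0 if none)
};

// Inspect the leading bytes of a raw buffer and report any byte order mark.
inline BomInfo detect_bom(const std::string& bytes) {
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(bytes.data());
    const std::size_t n = bytes.size();

    // UTF-32 is checked before UTF-16 because the UTF-32LE BOM (FF FE 00 00)
    // begins with the same two bytes as the UTF-16LE BOM (FF FE).
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
        return {Bom::Utf32LE, 4};
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
        return {Bom::Utf32BE, 4};
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return {Bom::Utf8, 3};
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return {Bom::Utf16LE, 2};
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return {Bom::Utf16BE, 2};

    return {Bom::None, 0};  // no BOM; the caller has to assume an encoding
}
```

A caller would typically skip the first length bytes and pick a decoder based on kind. Note one inherent ambiguity: FF FE 00 00 could also be a UTF-16LE BOM followed by U+0000, and this sketch resolves it in favor of UTF-32LE.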