Question:

I have an array of Unicode code points:

unsigned short array[3]={0x20ac,0x20ab,0x20ac};

I just want to convert this to UTF-8 and write it to a file byte by byte using C++.

Example: 0x20ac should be converted to e2 82 ac.

Or is there another method that can write Unicode characters to a file directly?

Answer 1:

Finally! With C++11!

#include <string>
#include <locale>
#include <codecvt>
#include <cassert>

int main()
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    std::string u8str = converter.to_bytes(0x20ac);
    assert(u8str == "\xe2\x82\xac");
}
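
To apply the same converter to the whole array from the question and write the result to a file, something like the following sketch might work. The helper names `bmp_units_to_utf8` and `write_utf8_file` are made up for illustration, and the code assumes every array element is a code point from the Basic Multilingual Plane (no surrogate pairs). Note that `std::wstring_convert` and `<codecvt>` are deprecated as of C++17, so newer compilers may warn about them.

```cpp
#include <cassert>
#include <codecvt>
#include <cstddef>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: converts 16-bit code points to a UTF-8 byte string.
// Assumption: the input contains no surrogates (0xD800..0xDFFF); to_bytes
// throws std::range_error if a value cannot be encoded.
std::string bmp_units_to_utf8(const unsigned short* units, std::size_t count)
{
    assert(units != nullptr || count == 0);
    // Widen each 16-bit value to a 32-bit code point.
    std::u32string code_points(units, units + count);
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    return converter.to_bytes(code_points);
}

// Hypothetical helper: writes the bytes unchanged (binary mode avoids
// newline translation on Windows). Returns false if the write failed.
bool write_utf8_file(const char* file_name, const std::string& utf8)
{
    std::ofstream out(file_name, std::ios::binary);
    out.write(utf8.data(), static_cast<std::streamsize>(utf8.size()));
    return out.good();
}
```

Calling `write_utf8_file("out.txt", bmp_units_to_utf8(array, 3))` with the array from the question should produce a nine-byte file, three bytes per character.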

Answer 2:

The term Unicode refers to a standard for encoding and handling of text. This incorporates encodings like UTF-8, UTF-16, UTF-32, UCS-2, ...

I guess you are programming in a Windows environment, where Unicode typically refers to UTF-16.

When working with Unicode in C++, I would recommend the ICU library.

If you are programming on Windows, don't want to use an external library, and have no constraints regarding platform dependencies, you can use WideCharToMultiByte.

Example for ICU:

#include <iostream>
#include <string>
#include <unicode/ustream.h>

using icu::UnicodeString;

int main(int, char**) {
    //
    // Convert from UTF-16 to UTF-8
    //
    std::wstring utf16 = L"foobar";
    UnicodeString str(utf16.c_str());
    std::string utf8;
    str.toUTF8String(utf8);

    std::cout << utf8 << std::endl;
}

To do exactly what you want:

// Assuming you have ICU\include in your include path
// and ICU\lib(64) in your library path.
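#include <iostream>
#include <fstream>
#include <string>
#include <unicode/ustream.h>
#pragma comment(lib, "icuio.lib")
#pragma comment(lib, "icuuc.lib")

using icu::UnicodeString;

// Assumes a 16-bit wchar_t (as on Windows), so arr already holds UTF-16 code units.
void writeUtf16ToUtf8File(char const* fileName, wchar_t const* arr, size_t arrSize) {
    UnicodeString str(arr, static_cast<int32_t>(arrSize));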
    std::string utf8;
    str.toUTF8String(utf8);

    std::ofstream out(fileName, std::ofstream::binary);
    out << utf8;
    out.close();
}

Answer 3:

The following code may help:

#include <atlconv.h>
#include <atlstr.h>

#define ASSERT ATLASSERT

int main()
{
    const CStringW unicode1 = L"\x0391 and \x03A9"; // 'Alpha' and 'Omega'

    const CStringA utf8 = CW2A(unicode1, CP_UTF8);

    ASSERT(utf8.GetLength() > unicode1.GetLength());

    const CStringW unicode2 = CA2W(utf8, CP_UTF8);

    ASSERT(unicode1 == unicode2);
}

Answer 4:

This code uses WideCharToMultiByte (I assume that you are using Windows):

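#include <windows.h>
#include <stdlib.h>

const wchar_t wide_str[3] = {0x20ac, 0x20ab, 0x20ac};
// First call: query the required size in bytes; +1 leaves room for a terminating '\0'.
int utf8_size = WideCharToMultiByte(CP_UTF8, 0, wide_str, 3, NULL, 0, NULL, NULL) + 1;
char* utf8_str = (char*)calloc(utf8_size, 1); // zero-filled, so the result stays terminated
WideCharToMultiByte(CP_UTF8, 0, wide_str, 3, utf8_str, utf8_size, NULL, NULL);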

You need to call it twice: the first call returns the number of output bytes, and the second performs the conversion. If you already know the output buffer size, you can skip the first call. Alternatively, allocate a buffer twice the byte size of the input plus 1 byte (12+1 bytes in your case). That is always enough, because a UTF-16 code unit (2 bytes) expands to at most 3 UTF-8 bytes. Release the buffer with free() when you are done with it.
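
If WideCharToMultiByte is not available, the conversion can also be written by hand for this case. The sketch below is not the Windows API; it is a portable alternative under the assumption that every input value is a Basic Multilingual Plane code point (no surrogate pairs). The function name `encode_bmp_utf8` is hypothetical. It also shows where the size bound comes from: each 16-bit value produces one, two, or three bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encodes 16-bit code points as UTF-8 by hand.
// Assumption: no surrogates (0xD800..0xDFFF) in the input; they are not
// checked here and would produce invalid UTF-8.
std::string encode_bmp_utf8(const unsigned short* units, std::size_t count)
{
    assert(units != nullptr || count == 0);
    std::string out;
    out.reserve(3 * count); // worst case: three bytes per 16-bit value
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned int cp = units[i];
        if (cp < 0x80) {
            // 0xxxxxxx
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            // 110xxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            // 1110xxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For 0x20ac the third branch applies: the top four bits give 0xE2, the middle six give 0x82, and the low six give 0xAC, which matches the e2 82 ac from the question.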

Answer 5:

You could use Boost.Locale from the Boost libraries: http://www.boost.org/doc/libs/1_55_0/libs/locale/doc/html/index.html

Answer 6:

With standard C++:

#include <cwchar>
#include <iostream>
#include <locale>
#include <string>
#include <vector>

int main()
{
    typedef std::codecvt<wchar_t, char, mbstate_t> Convert;
    std::wstring w = L"\u20ac\u20ab\u20ac";
    std::locale locale("en_GB.utf8"); // throws std::runtime_error if this locale is not installed
    const Convert& convert = std::use_facet<Convert>(locale);

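    std::mbstate_t state = std::mbstate_t(); // must start in the initial conversion state
    const wchar_t* from_ptr;
    char* to_ptr;
    // A UTF-8 sequence is at most 4 bytes per code point; +1 keeps a trailing '\0'.
    std::vector<char> result(4 * w.size() + 1, 0);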
    Convert::result convert_result = convert.out(state,
          w.c_str(), w.c_str() + w.size(), from_ptr,
          result.data(), result.data() + result.size(), to_ptr);

    if (convert_result == Convert::ok)
        std::cout << result.data() << std::endl;
    else std::cout << "Failure: " << convert_result << std::endl;
}