Can't copy Unicode (using wchar_t) in HTML format

Posted 2019-09-20 01:39

Question:

Copying to the clipboard in HTML format works when I use char, but it doesn't work when I use wchar_t.

When I paste, the result is just empty.

Here is my code. Please help me.

Or is there a better way to handle Unicode (without using wchar_t)? If so, please help me.

void copyStringEnd(wchar_t *string, wchar_t *buffer)
{
    int i = 0;
    int string_StartIndex = 0;

    while (string[string_StartIndex] != NULL)
    {
        string_StartIndex++;
    }

    while (buffer[i] != NULL)
    {
        string[string_StartIndex + i] = buffer[i];
        i++;
    }
    string[string_StartIndex + i] = '\0';
}

int main()
{
    wchar_t *html = L"abc";
    wchar_t *buf = (wchar_t*)malloc(400 + wcslen(html));

    wcscpy_s(buf, 200,
            L"Version:0.9\r\n"
            L"StartHTML:00000000\r\n"
            L"EndHTML:00000000\r\n"
            L"StartFragment:00000000\r\n"
            L"EndFragment:00000000\r\n"
            L"<html><body>\r\n"
            L"<!--StartFragment -->\r\n");

    copyStringEnd(buf, html);
    copyStringEnd(buf, L"\r\n");

    copyStringEnd(buf,
            L"<!--EndFragment-->\r\n"
            L"</body>\r\n"
            L"</html>");

    wchar_t *ptr = wcsstr(buf, L"StartHTML");
    wsprintfW(ptr + 10, L"%08u", wcsstr(buf, L"<html>") - buf);
    *(ptr + 10 + 8) = '\r';

    ptr = wcsstr(buf, L"EndHTML");
    wsprintfW(ptr + 8, L"%08u", wcslen(buf));
    *(ptr + 8 + 8) = '\r';

    ptr = wcsstr(buf, L"StartFragment");
    wsprintfW(ptr + 14, L"%08u", wcsstr(buf, L"<!--StartFrag") - buf);
    *(ptr + 14 + 8) = '\r';

    ptr = wcsstr(buf, L"EndFragment");
    wsprintfW(ptr + 12, L"%08u", wcsstr(buf, L"<!--EndFrag") - buf);
    *(ptr + 12 + 8) = '\r';



    if (OpenClipboard(NULL)) {
        EmptyClipboard();

        HGLOBAL hText = GlobalAlloc(GMEM_MOVEABLE | GMEM_DDESHARE, wcslen(buf) * sizeof(wchar_t) + 4);

        wchar_t *ptrs = (wchar_t *)GlobalLock(hText);
        wcscpy_s(ptrs, wcslen(buf) + 2, buf);
        GlobalUnlock(hText);

        SetClipboardData(RegisterClipboardFormatA("HTML Format"), hText);
        CloseClipboard();
        GlobalFree(hText);
    }
    free(buf);
}

Answer 1:

According to the documentation:

The only character set supported by the clipboard is Unicode in its UTF-8 encoding.

What you're calling "Unicode" is UTF-16-LE, which is not UTF-8. If UTF-16-LE data is interpreted as UTF-8, it looks like it starts with "V\0", and most code will treat that \0 as the end of the string and stop reading.
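
The truncation can be reproduced without touching the clipboard. The sketch below is only an illustration and not part of the original code: length_seen_as_bytes is a hypothetical helper, uint16_t stands in for the 16-bit wchar_t used on Windows, and a little-endian machine is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Length of the data as seen by code that expects a NUL-terminated
   byte string (such as UTF-8) but is handed 16-bit code units.
   Assumption: little-endian layout, as on Windows. */
size_t length_seen_as_bytes(const uint16_t *utf16)
{
    assert(utf16 != NULL);
    /* 'V' is stored as the two bytes 0x56 0x00, so a byte-oriented
       reader finds a terminator immediately after the first character. */
    return strlen((const char *)utf16);
}
```

Called on the code units of "Version" ({ 'V', 'e', 'r', 's', 'i', 'o', 'n', 0 }), this returns 1 on a little-endian machine: only the "V" is visible to a reader that expects bytes.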

You need to encode it as UTF-8 (which is stored as char, not wchar_t) and put that on the clipboard instead.
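
One way to follow this advice is to build the whole payload as UTF-8 bytes and compute every offset in bytes. The following is a minimal sketch under stated assumptions, not a definitive implementation: build_cf_html is a hypothetical helper name, the fragment is assumed to be valid UTF-8 already, and the payload is assumed to stay below 100,000,000 bytes so each offset fits in eight digits. It uses only the standard C library, so the clipboard calls themselves are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build an "HTML Format" payload around a UTF-8 fragment.
   All four offsets are BYTE offsets from the start of the payload.
   Returns a malloc'd, NUL-terminated buffer (caller frees), or NULL. */
char *build_cf_html(const char *utf8_fragment)
{
    static const char header_fmt[] =
        "Version:0.9\r\n"
        "StartHTML:%08u\r\n"
        "EndHTML:%08u\r\n"
        "StartFragment:%08u\r\n"
        "EndFragment:%08u\r\n";
    static const char prefix[] = "<html><body>\r\n<!--StartFragment-->";
    static const char suffix[] = "<!--EndFragment-->\r\n</body>\r\n</html>";

    /* Each %08u expands to exactly 8 digits for values below 10^8, so
       the header length can be measured with dummy values first. */
    int header_len = snprintf(NULL, 0, header_fmt, 0u, 0u, 0u, 0u);
    assert(header_len > 0);

    size_t start_html = (size_t)header_len;
    size_t start_frag = start_html + strlen(prefix);
    size_t end_frag   = start_frag + strlen(utf8_fragment);
    size_t end_html   = end_frag + strlen(suffix);

    /* Assumption: refuse payloads whose offsets need more than 8 digits. */
    if (end_html > 99999999u)
        return NULL;

    char *out = malloc(end_html + 1);
    if (out == NULL)
        return NULL;

    /* Write the header with the real offsets, then append the body. */
    snprintf(out, end_html + 1, header_fmt,
             (unsigned)start_html, (unsigned)end_html,
             (unsigned)start_frag, (unsigned)end_frag);
    strcat(out, prefix);
    strcat(out, utf8_fragment);
    strcat(out, suffix);
    return out;
}
```

On Windows, text held as wchar_t can be converted with WideCharToMultiByte(CP_UTF8, ...) before being passed to a helper like this one. The resulting bytes are then copied into memory from GlobalAlloc(GMEM_MOVEABLE, ...) and handed to SetClipboardData. After SetClipboardData succeeds, the system owns that handle, so the GlobalFree(hText) call in the question's code should probably be dropped as well.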