What utf format should boost wdirectory_iterator r

Posted 2019-08-22 16:00

Question:

If a file name contains a £ (pound) sign, then directory_iterator correctly returns the UTF-8 byte sequence \xC2\xA3.

wdirectory_iterator uses wide chars, but still returns the UTF-8 sequence, with each byte stored in its own wide char. Is this the correct behaviour for wdirectory_iterator, or am I using it incorrectly?

AddFile(testpath, "pound£sign"); 
wdirectory_iterator iter(testpath);
TS_ASSERT_EQUALS(iter->leaf(),L"pound\xC2\xA3sign"); // Succeeds
TS_ASSERT_EQUALS(*iter, L"pound£sign"); // Fails

Answer 1:

The encoding of wide characters (wchar_t objects) is implementation-dependent. For the second assertion (the one comparing against L"pound£sign") to succeed, you will probably need to change the underlying locale. The default is "C", which does not know about the pound character. The comparison against the hex escapes succeeds because \xC2 and \xA3 specify the wide-character values directly, so no mapping from the glyph to a value in a particular encoding is required.
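
To make the difference between the two literals concrete, here is a minimal sketch in standard C++ with no Boost involved. It assumes the file name is stored as UTF-8 and that the narrow-to-wide conversion copied each byte into its own wchar_t without decoding, which is what the passing assertion suggests but is not confirmed for Boost here. The names widen_bytewise, kUtf8Name, kEscapedLiteral, kPoundLiteral and check_widening are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Boost): copy every byte of a narrow
// string into its own wchar_t, without decoding UTF-8 sequences.
std::wstring widen_bytewise(const std::string& bytes) {
    std::wstring out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes) {
        out.push_back(static_cast<wchar_t>(b));
    }
    return out;
}

// Assumed UTF-8 bytes of the file name: the pound sign is C2 A3.
const std::string kUtf8Name = "pound\xC2\xA3sign";

// Each \x escape in a wide literal yields one wchar_t, so this literal
// holds 0x00C2 followed by 0x00A3: two units where the pound sign was.
const std::wstring kEscapedLiteral = L"pound\xC2\xA3sign";

// A universal character name yields a single wchar_t holding U+00A3 and
// does not depend on the encoding of the source file.
const std::wstring kPoundLiteral = L"pound\u00A3sign";

void check_widening() {
    // Byte-wise widening reproduces the literal from the passing assertion.
    assert(widen_bytewise(kUtf8Name) == kEscapedLiteral);
    // It does not reproduce a string containing the real pound character.
    assert(widen_bytewise(kUtf8Name) != kPoundLiteral);
    // The two literals even differ in length: 11 units versus 10.
    assert(kEscapedLiteral.size() == kPoundLiteral.size() + 1);
}
```

Calling check_widening() from main passes silently. Writing the expected value as L"pound\u00A3sign" rather than L"pound£sign" also removes any dependence on how the compiler interprets the source file's encoding, although the comparison can still only succeed once the conversion actually decodes the UTF-8 bytes.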

Note: for brevity, I am skipping the exact wording of the standard with respect to wchar_t, extended character sets, etc.