My software supports multiple languages (English, German, Polish, Russian, ...).
For this reason I have language-specific files that contain the dialog texts in each language (encoded as UTF-8).
In my MFC application I open and read those files and insert the text into my AfxMessageBox dialogs and other UI windows.
// Get the codepage number. 65001 = UTF-8
// In the real code this is a parameter in the function I call (just for clarification)
LANGID languageID = 65001;
TCHAR szCodepage[10];
GetLocaleInfo (MAKELCID (languageID, SORT_DEFAULT), LOCALE_IDEFAULTANSICODEPAGE, szCodepage, 10);
int nAnsiCodePage = _ttoi (szCodepage);
// Open the file
CFile file;
CString filename = getName();
if (!file.Open(filename, CFile::modeRead, NULL))
{
//Check if everything is fine, else break
}
// Read the file
CString inString;
int len = file.GetLength ();
UINT n = file.Read (inString.GetBuffer(len), len);
inString.ReleaseBuffer ();
int size = MultiByteToWideChar (CP_ACP, 0, inString, -1, NULL, 0);
WCHAR *ubuf = new WCHAR[size + 1];
MultiByteToWideChar ((UINT) nAnsiCodePage, (nAnsiCodePage == CP_UTF8 ? 0 : MB_PRECOMPOSED), inString, -1, ubuf, (int) size);
outString = ubuf;
file.Close ();
Result:
This mechanism works fine for the special letters of Russian and German, but not for Polish. I already checked a UTF-8 table (http://www.utf8-chartable.de/unicode-utf8-table.pl?number=1024) and the Polish characters are part of it.
I also checked the hex values of my CString and everything seems to be all right, but the text is not displayed correctly. Just for testing I changed the code page from UTF-8 to 1250 (Eastern Europe, Polish included) and it also did not work.
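For anyone who wants to repeat the hex check without going through the Windows API, here is a minimal, portable sketch of it. The function name decodeUtf8 and the sample word mężczyzna are illustrative assumptions, not part of the real code. It only decodes UTF-8 bytes into Unicode code points so that they can be compared with the values that end up in the wide-character buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from the original code): decode a UTF-8 byte
// string into Unicode code points. Malformed or truncated sequences are
// replaced with U+FFFD. This sketch does not reject overlong encodings
// or surrogate code points.
std::vector<std::uint32_t> decodeUtf8(const std::string& bytes)
{
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray byte

        if (i + extra >= bytes.size()) {  // sequence cut off at the end
            out.push_back(0xFFFD);
            ++i;
            continue;
        }

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }

        assert(cp <= 0x1FFFFF);  // at most 21 payload bits by construction
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

Decoding the bytes 6D C4 99 C5 BC 63 7A 79 7A 6E 61 (the UTF-8 form of mężczyzna) should give nine code points, with U+0119 (ę) and U+017C (ż) in the second and third positions. If the wide buffer holds 0x0065 and 0x007A at those positions instead, the letters were already replaced during conversion and not at display time.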
What am I doing wrong?
EDIT:
When I use:
MultiByteToWideChar (CP_UTF8 , 0, inString, -1, ubuf, (int) size);
the special characters are reduced to their "best match" letters, meaning my result is: mezczyzna
I am using Windows 7 with English selected as the language.