Read txt file in C++ (Chinese)

Posted 2019-04-12 02:54

Question:

I'm trying to write a function that checks whether a Chinese word entered by the user appears in a txt file. The code below does not work, and I would like to know what the problem is. Please help.

setlocale(LC_ALL, "Chinese-simplified");
locale::global(locale("Chinese_China"));
SetConsoleOutputCP(936);
SetConsoleCP(936);

bool exist = FALSE;

cout << "\n\n <Find the keyword whether it is in that image or not> \n ";
cout << "Enter word to search for: ";
wstring search;
wcin >> search; //There is a problem to enter chinese.

wfstream file_text("./a.txt");
wstring line;
wstring::size_type pos;

while (getline(file_text, line))
{
    pos = line.find(search);
    if (pos != wstring::npos) // string::npos is returned if string is not found
    {
        cout << "Found!" << endl;
        exist = true;
        break;
    }
}

When I use this code, the result is as follows.


Answer 1:

Try locale::global(locale("Chinese_China.936")); or locale::global(locale(""));. For the LC_ALL argument of setlocale, try "chinese-simplified" or "chs".
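
Since several candidate names are suggested here, one way to try them in turn is sketched below. The helper name first_available_locale is hypothetical and not part of the answer. The sketch relies only on the fact that constructing a std::locale from a name the runtime does not know throws std::runtime_error. Names such as "Chinese_China.936" are Windows CRT names, so on other platforms the helper would simply fall through to the next candidate.

```cpp
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): returns the first locale in
// `names` that this runtime can construct, or the classic "C" locale if
// none of the names is available.
std::locale first_available_locale(const std::vector<std::string>& names) {
    for (const std::string& name : names) {
        try {
            return std::locale(name.c_str());
        } catch (const std::runtime_error&) {
            // The runtime does not know this name; try the next candidate.
        }
    }
    return std::locale::classic();
}
```

A call such as first_available_locale({"Chinese_China.936", ""}) would try the explicit code page name first and then the user's default locale.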



Answer 2:

If you're interested in more details, please see stod-does-not-work-correctly-with-boostlocale for a fuller description of how locales work.

In a nutshell the more interesting part for you:

  1. Every standard stream (stringstream, fstream, cin, cout) holds its own locale object, which is a copy of the global C++ locale at the moment the stream object is created. Because std::cin is created long before your code in main runs, it most probably has the classic "C" locale, no matter what you do afterwards.
  2. You can make sure that a stream object has the desired locale by invoking imbue on it: stream.imbue(std::locale(your_favorite_locale)).

I would like to add the following:

  1. It is almost never a good idea to set the global locale. It might break other parts of the program or third-party libraries; you never know.

  2. std::setlocale and locale::global do slightly different things: for a named locale, locale::global sets not only the global C++ locale but also the C locale (the one std::setlocale sets, not to be confused with the classic "C" locale). So if you want the C++ locale set to Chinese_China and the C locale set to Chinese-simplified, call them in the opposite order:

First locale::global(locale("Chinese_China"));

And then setlocale(LC_ALL, "Chinese-simplified");
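
The ordering can be wrapped in a small function, sketched here under the assumption that both names are supplied by the caller. The name set_locales_in_order is hypothetical. On Windows the arguments would be the names from the question, and on other platforms those names are likely unavailable, in which case the function reports failure.

```cpp
#include <clocale>
#include <locale>
#include <stdexcept>

// Hypothetical helper: sets the global C++ locale first, then the C locale,
// so that the second call is the one that decides the C locale.
// Returns false if either name is not available on this runtime.
bool set_locales_in_order(const char* cpp_name, const char* c_name) {
    try {
        // For a named locale this also updates the C locale,
        // as if by std::setlocale(LC_ALL, cpp_name).
        std::locale::global(std::locale(cpp_name));
    } catch (const std::runtime_error&) {
        return false;
    }
    // Now override only the C locale; the C++ global locale is unaffected.
    return std::setlocale(LC_ALL, c_name) != nullptr;
}
```

For example, set_locales_in_order("Chinese_China", "Chinese-simplified") corresponds to the two calls above.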



Answer 3:

If Vladislav's answer does not solve this, take a look at the answer to "stl - Shift-JIS decoding fails using wifstrem in Visual C++ 2013 - Stack Overflow":

const int oldMbcp = _getmbcp();
_setmbcp(936);
const std::locale locale("Chinese_China.936");
_setmbcp(oldMbcp);

There appears to be a bug in Visual Studio's implementation of locales. See also c++ - double byte character sequence conversion issue in Visual Studio 2015 - Stack Overflow.



Tags: c++ locale