I would like to convert a string variable to wstring due to some german characters that cause problem when doing a substr over the variable. The start position is falsified when any these special characters is present before it. (For instance: for "ä" size() returns 2 instead of 1)
I know that the following conversion works:
wstring ws = L"ä";
Since, I am trying to convert a variable, I would like to know if there is an alternative way for it such as
wstring wstr = L"%s"+str //this is syntaxically wrong, but wanted sth alike
Beside that, I have already tried the following example to convert string to wstring:
string foo("ä");
wstring_convert<codecvt_utf8<wchar_t>> converter;
wstring wfoo = converter.from_bytes(foo.data());
cout << foo.size() << endl;
cout << wfoo.size() << endl;
, but I get errors like
‘wstring_convert’ was not declared in this scope
I am using ubuntu 14.04 and my main.cpp is compiled with cmake. Thanks for your help!
The solution from "hahakubile" worked for me:
std::wstring s2ws(const std::string& s) {
std::string curLocale = setlocale(LC_ALL, "");
const char* _Source = s.c_str();
size_t _Dsize = mbstowcs(NULL, _Source, 0) + 1;
wchar_t *_Dest = new wchar_t[_Dsize];
wmemset(_Dest, 0, _Dsize);
mbstowcs(_Dest,_Source,_Dsize);
std::wstring result = _Dest;
delete []_Dest;
setlocale(LC_ALL, curLocale.c_str());
return result;
}
But the return value is not 100% correct:
string s = "101446012MaßnStörfall PAt #Maßnahme Störfall 00810000100121000102000020100000000000000";
wstring ws2 = s2ws(s);
cout << ws2.size() << endl; // returns 110 which is correct
wcout << ws2.substr(29,40) << endl; // returns #Ma�nahme St�rfall with symbols
I am wondering why it replaced german characters with symbols.
Thanks again!
If you are using Windows/Visual Studio and need to convert a string to wstring you should use:
#include <AtlBase.h>
#include <atlconv.h>
...
string s = "some string";
CA2W ca2w(s.c_str());
wstring w = ca2w;
printf("%s = %ls", s.c_str(), w.c_str());
Same procedure for converting a wstring to string (sometimes you will need to specify a codepage):
#include <AtlBase.h>
#include <atlconv.h>
...
wstring w = L"some wstring";
CW2A cw2a(w.c_str());
string s = cw2a;
printf("%s = %ls", s.c_str(), w.c_str());
You could specify a codepage and even UTF8 (that's pretty nice when working with JNI/Java).
CA2W ca2w(str, CP_UTF8);
If you want to know more about codepages there is an interesting article on Joel on Software: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets.
These CA2W (Convert Ansi to Wide=unicode) macros are part of ATL and MFC String Conversion Macros, samples included.
Sometimes you will need to disable the security warning #4995', I don't know of other workaround (to me it happen when I compiled for WindowsXp in VS2012).
#pragma warning(push)
#pragma warning(disable: 4995)
#include <AtlBase.h>
#include <atlconv.h>
#pragma warning(pop)
Edit:
Well, according to this article the article by Joel appears to be: "while entertaining, it is pretty light on actual technical details". Article: What Every Programmer Absolutely, Positively Needs To Know About Encoding And Character Sets To Work With Text.
The main point is that
string foo("ä")
Is already an error. Start from here and read all answers. And beware, one is very wrong :)