I am trying to compare std::strings in a locale-dependent manner.
For ordinary C-style strings, I've found strcoll, which does exactly what I want after calling std::setlocale:
#include <clocale>   // std::setlocale
#include <cstring>   // std::strcoll
#include <iostream>
// Returns true if a sorts before b under the current C locale.
bool cmp(const char* a, const char* b)
{
    return std::strcoll(a, b) < 0;
}
int main()
{
    const char* s1 = "z", *s2 = "å", *s3 = "ä", *s4 = "ö";
    std::cout << (cmp(s1,s2) && cmp(s2,s3) && cmp(s3,s4)) << "\n"; //Outputs 0
    std::setlocale(LC_ALL, "sv_SE.UTF-8");
    std::cout << (cmp(s1,s2) && cmp(s2,s3) && cmp(s3,s4)) << "\n"; //Outputs 1, like it should
    return 0;
}
However, I'd like to have this behaviour for std::string as well. I could just overload operator< to do something like
bool operator<(const std::string& a, const std::string& b)
{
    return strcoll(a.c_str(), b.c_str()) < 0;
}
but then I'd have to worry about code using std::less and std::string::compare, so it doesn't feel right.
Is there a way to make this kind of collation work for strings in a seamless manner?
The C++ library provides the collate facet to do locale-specific collation.
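As a minimal sketch of what using the facet could look like: collate_less below is a hypothetical helper name (not from the question), and it assumes you pass the locale you care about, defaulting to the current global C++ locale. Note that std::locale() reflects what was set with std::locale::global, not what was set with std::setlocale alone.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: returns true if a sorts before b under the
// collation rules of loc (defaults to a copy of the global C++ locale).
bool collate_less(const std::string& a, const std::string& b,
                  const std::locale& loc = std::locale())
{
    // Fetch the collate facet for char from the given locale.
    const std::collate<char>& coll = std::use_facet<std::collate<char>>(loc);
    // compare() takes [begin, end) pointer ranges and returns -1, 0 or 1.
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size()) < 0;
}
```

With a Swedish locale installed, something like collate_less("z", "å", std::locale("sv_SE.UTF-8")) would be the call to try; whether that locale name exists depends on the system.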
After a bit of searching around I realized that one way to do it could be to overload the std::basic_string template to make a new, localized string class. There are probably a gazillion bugs in this, but as a proof of concept:
However, it doesn't seem to work if you base it on char instead of wchar_t, and I have no idea why...
In C++ you need to use the standard collate facet. Check it out.
operator() of std::locale is just what you are looking for. To get the current global locale, just use the default constructor.
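To illustrate, a std::locale object can be passed directly as the comparator to standard algorithms, since its operator() compares two strings using the locale's collate facet. The sketch below uses a hypothetical helper name, sorted_by_locale, and defaults to std::locale(), which is a copy of the global C++ locale as set by std::locale::global.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: returns the words sorted in the collation order
// of loc (defaults to a copy of the global C++ locale).
std::vector<std::string> sorted_by_locale(std::vector<std::string> words,
                                          const std::locale& loc = std::locale())
{
    // std::locale::operator() acts as a "less than" predicate on strings.
    std::sort(words.begin(), words.end(), loc);
    return words;
}
```

The same idea should carry over to ordered containers, for example std::set<std::string, std::locale>, which avoids having to overload operator< for std::string.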