Given string foo, I've written answers on how to use cctype's tolower to convert the characters to lowercase:
transform(cbegin(foo), cend(foo), begin(foo), static_cast<int (*)(int)>(tolower));
But I've begun to consider locale's tolower, which could be used like this:
use_facet<ctype<char>>(cout.getloc()).tolower(data(foo), next(data(foo), foo.size()));
- Is there a reason to prefer one of these over the other?
- Does their functionality differ at all?
- I mean other than the fact that tolower accepts and returns an int, which I assume is just some antiquated C stuff?
In the first case (cctype) the locale is set implicitly:
http://en.cppreference.com/w/cpp/string/byte/tolower
In the second (locale's) case you have to explicitly set the locale:
http://www.cplusplus.com/reference/locale/tolower/
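To make the implicit/explicit distinction concrete, here is a minimal sketch. The helper names lower_global and lower_with are hypothetical and only for illustration. The first helper relies on whatever C locale is currently installed (the "C" locale unless setlocale has been called), while the second takes the locale as an argument on every call.

```cpp
#include <cctype>
#include <locale>
#include <string>

// <cctype> version: the locale is implicit. std::tolower(int) consults the
// current C locale, which is "C" unless std::setlocale has changed it.
std::string lower_global(std::string s) {
    for (char& c : s) {
        // Go through unsigned char so negative char values are never passed.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}

// <locale> version: the locale is an explicit argument of each call.
std::string lower_with(std::string s, const std::locale& loc) {
    for (char& c : s) {
        c = std::tolower(c, loc);
    }
    return s;
}
```

Calling lower_with(text, std::locale::classic()) pins the behavior to the classic "C" locale regardless of what the rest of the program has done with setlocale.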
Unfortunately, both are equally bad. Although std::string pretends to be a utf-8 encoded string, none of its methods/functions (including tolower) are really utf-8 aware. So while tolower / tolower + locale may work with characters which are single byte (= ASCII), they will fail for every other set of languages. On Linux, I'd use the ICU library. On Windows, I'd use the CharUpper function.
It should be noted that the language designers were aware of cctype's tolower when locale's tolower was created. It improved in 2 primary ways.
First, the locale version allowed the use of the facet ctype, even a user-modified one, without requiring the shuffling in of a new LC_CTYPE via setlocale and the restoration of the previous LC_CTYPE.
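As a rough sketch of that first point, a user-modified ctype facet can be installed into a local std::locale object and used directly, with no call to setlocale at all. The facet custom_ctype, its '#' to '_' mapping, and the helper lower_custom are all made up for illustration.

```cpp
#include <locale>
#include <string>

// Hypothetical user-modified facet: behaves like the default ctype<char>,
// except that lowercasing also maps '#' to '_'.
struct custom_ctype : std::ctype<char> {
protected:
    char do_tolower(char c) const override {
        return c == '#' ? '_' : std::ctype<char>::do_tolower(c);
    }
    const char* do_tolower(char* first, const char* last) const override {
        for (; first != last; ++first) {
            *first = do_tolower(*first);
        }
        return last;
    }
};

std::string lower_custom(std::string s) {
    // The locale object takes ownership of the facet; the global locale
    // and LC_CTYPE are left untouched.
    const std::locale loc(std::locale::classic(), new custom_ctype);
    char* first = &s[0];  // mutable pointer, also valid before C++17
    std::use_facet<std::ctype<char>>(loc).tolower(first, first + s.size());
    return s;
}
```

Because the facet lives in a local locale object, other code that relies on the global LC_CTYPE is unaffected.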
Second, whether plain char is signed or unsigned is implementation-defined, which creates the potential for undefined behavior with the cctype version of tolower if its argument is not representable as unsigned char and does not equal EOF.
So the cctype version of tolower requires an additional static_cast on both input and output: the char is converted to unsigned char before the call, and the int result is converted back to char.
Since the locale version operates directly on chars, there is no need for a type conversion.
So if you don't need to perform the conversion in a different facet ctype, it simply becomes a style question of whether you prefer the transform with a lambda required by the cctype version, or the locale version's use_facet call.
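The two styles can be sketched side by side as follows. The function names lower_cctype and lower_facet are hypothetical, and defaulting the locale parameter to std::locale::classic() is an assumption made here so the example is deterministic; the question used cout.getloc() instead.

```cpp
#include <algorithm>
#include <cctype>
#include <locale>
#include <string>

// <cctype> style: the lambda parameter converts each char to unsigned char
// on the way in, and the int result is cast back to char on the way out.
std::string lower_cctype(std::string foo) {
    std::transform(foo.cbegin(), foo.cend(), foo.begin(),
                   [](const unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return foo;
}

// <locale> style: the ctype<char> facet works on the char range in place,
// so no conversions are needed.
std::string lower_facet(std::string foo,
                        const std::locale& loc = std::locale::classic()) {
    char* first = &foo[0];  // mutable pointer, also valid before C++17
    std::use_facet<std::ctype<char>>(loc).tolower(first, first + foo.size());
    return foo;
}
```

For plain ASCII input in the default "C" locale, both functions should produce the same result, so the choice between them comes down to style unless a different facet is needed.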