std::tolower and Visual Studio 2013

Posted 2020-03-03 05:31

I'm trying to understand how to use std::tolower...

#include <iostream>
#include <string>
#include <algorithm>
#include <locale>

int main()
{
    std::string test = "Hello World";
    std::locale loc;
    for (auto &c : test)
    {
        c = std::tolower(c, loc);
    }

    std::transform(test.begin(), test.end(), test.begin(), ::tolower); // 1) OK
    std::transform(test.begin(), test.end(), test.begin(), std::tolower); // 2) Cryptic compile error
    std::transform(test.begin(), test.end(), test.begin(), static_cast<int(*)(int)>(std::tolower)); // 3) Cryptic compile error. Seems OK with other compilers though

    return 0;
}

So:

  1. Why does the ::tolower version work?
  2. Why does std::tolower not work in std::transform?
  3. What is static_cast<int(*)(int)>(std::tolower) really trying to do? Why does it work with GCC and not with Visual Studio 2013?
  4. How can I use std::tolower in std::transform with Visual Studio 2013, then?

2 Answers
Viruses.
#2 · 2020-03-03 05:55

First off, note that none of these approaches does the right thing in a portable way! The problem is that char may be signed (and typically is), but tolower(int) only accepts values that are representable as unsigned char, or EOF. So what you really want is to call std::tolower() using something like this:

std::transform(test.begin(), test.end(), test.begin(),
               [](unsigned char c) { return std::tolower(c); });

(or, of course, a corresponding function object if you are stuck with C++03). Calling std::tolower() (or ::tolower(), for that matter) with a negative value other than EOF results in undefined behavior. Of course, this only matters on platforms where char is signed, which, however, seems to be the typical choice.
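To make the lambda approach above concrete, here is a minimal sketch wrapped in a helper. The names to_lower_copy and check_to_lower_copy are made up for illustration and are not from the question; the check assumes the default "C" locale.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not from the original post): returns a
// lower-cased copy of s using the one-argument std::tolower.
std::string to_lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       // c is in 0..UCHAR_MAX here, so the call is well defined.
                       return static_cast<char>(std::tolower(c));
                   });
    return s;
}

// Quick check, assuming the default "C" locale.
void check_to_lower_copy()
{
    assert(to_lower_copy("Hello World") == "hello world");
}
```

Because the lambda calls std::tolower with an argument, normal overload resolution picks the int(int) version and no cast is needed. The static_cast<char> on the result is only there to avoid a narrowing warning.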

To answer your questions:

  1. When including <cctype>, you typically get the various functions and types from the standard C library both in namespace std and in the global namespace. Thus, using ::tolower normally works but isn't guaranteed to work.
  2. When including <locale>, there are two versions of std::tolower available: int tolower(int), and a template that, for char, has the signature char(char, std::locale const&). Given just the name std::tolower, the compiler generally has no way to decide which one to use.
  3. Since std::tolower is ambiguous, static_cast<int(*)(int)>(std::tolower) disambiguates which version to use. Why the static_cast<...>() fails with VC++, I don't know.
  4. You shouldn't use std::tolower() directly on a sequence of chars anyway, as it can result in undefined behavior. Use a function object that calls std::tolower internally on an unsigned char.

It is worth noting that a function object is typically faster than a function pointer here, because it is trivial to inline a call to a function object but not a call through a function pointer. Compilers are getting better at inlining calls through function pointers when the target function is actually known, but contemporary compilers certainly don't always do so, even when all the context is available.
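For anyone stuck with C++03, the function object mentioned above could look roughly like this. It is only a sketch: ToLowerChar, lower_with_functor and check_lower_with_functor are illustrative names, not anything from the standard library or the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical C++03-style function object (the name is illustrative).
struct ToLowerChar
{
    // Taking unsigned char keeps the value passed to tolower(int) non-negative.
    char operator()(unsigned char c) const
    {
        return static_cast<char>(std::tolower(c));
    }
};

// Hypothetical helper: returns a lower-cased copy of s.
std::string lower_with_functor(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), ToLowerChar());
    return s;
}

// Quick check, assuming the default "C" locale.
void check_lower_with_functor()
{
    assert(lower_with_functor("Hello World") == "hello world");
}
```

Since the call operator is part of the type ToLowerChar, std::transform is instantiated for that exact type, which is what makes inlining straightforward compared with a call through an int(*)(int) pointer.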

Aperson
#3 · 2020-03-03 06:10

std::tolower is overloaded in C++. It is declared in <cctype> as

int tolower(int);

and also in <locale> as

template<class CharT> CharT tolower(CharT, const locale&);

so when you say "std::tolower" you get an ambiguous reference to an overloaded function.

1. Why does the ::tolower version work?

When you include <cctype>, the one-argument overload is declared in namespace std and might also be declared in the global namespace, depending on the compiler. If you include <ctype.h>, then it's guaranteed to be declared in the global namespace, and ::tolower will work (although note the points in the other answer about when it's not safe). The two-argument overload from <locale> is never declared in the global namespace, so ::tolower never refers to the two-argument overload.
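As a small sketch of the <ctype.h> route: the helper names below are made up, and the code assumes every character is in the range 0 to 127, since a plain char is passed straight to tolower(int).

```cpp
#include <algorithm>
#include <cassert>
#include <ctype.h>  // C header: declares tolower in the global namespace
#include <string>

// Hypothetical helper (name is illustrative). Assumes s holds only
// characters with values 0..127; other values may be negative as char.
std::string lower_with_global(std::string s)
{
    // ::tolower names a single function here, so no cast is needed.
    std::transform(s.begin(), s.end(), s.begin(), ::tolower);
    return s;
}

// Quick check, assuming the default "C" locale.
void check_lower_with_global()
{
    assert(lower_with_global("Hello World") == "hello world");
}
```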

2. Why does std::tolower not work in std::transform?

See above, it's an overloaded name.

3. What is static_cast<int(*)(int)>(std::tolower) really trying to do?

It tells the compiler you want the int std::tolower(int) overload, not any other overload of std::tolower.

Why does it work with GCC and not with Visual Studio 2013?

Probably because you didn't include <cctype>, or (less likely) it could be a Visual Studio bug.
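Here is a sketch of the cast with <cctype> included explicitly, so that the int(int) overload is certainly declared. I have not tested this with Visual Studio 2013, so treat it as an illustration of the idea rather than a confirmed fix. The helper names are made up, and the code assumes only characters in the range 0 to 127.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>   // declares int std::tolower(int)
#include <locale>   // declares the two-argument std::tolower template
#include <string>

// Hypothetical helper (name is illustrative). Assumes every char in s
// is in 0..127, so passing a plain char to tolower(int) is safe.
std::string lower_with_cast(std::string s)
{
    // The target type of the cast selects the int(int) overload;
    // the template from <locale> cannot match int (*)(int).
    std::transform(s.begin(), s.end(), s.begin(),
                   static_cast<int (*)(int)>(std::tolower));
    return s;
}

// Quick check, assuming the default "C" locale.
void check_lower_with_cast()
{
    assert(lower_with_cast("Hello World") == "hello world");
}
```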

4. How can I use std::tolower in std::transform with Visual Studio 2013, then?

If you know you only have characters with values between 0 and 127 then you can include <ctype.h> and use ::tolower (because the two-argument version is not declared in the global namespace, only in namespace std) or disambiguate which overload you want with the static cast. An alternative to the cast is to use a local variable:

typedef int (*tolower_type)(int);
tolower_type tl = &std::tolower;
std::transform(b, e, b, tl);

A safer and portable alternative is to use a custom function object (or lambda expression) to call the desired overload safely:

std::transform(b, e, b, [](unsigned char i) { return std::tolower(i); });

This uses std::tolower with an argument, so the compiler can do overload resolution to tell which overload you want to call. The parameter is unsigned char to ensure we never pass a char with a negative value to tolower(int), because that has undefined behaviour.

See http://gcc.gnu.org/onlinedocs/libstdc++/manual/strings.html#strings.string.simple for more details.
