Mystical restriction on std::binary_search

2019-03-17 06:53发布

问题:

Problem description:
Consider some structure having an std::string name member. For clearness let's suppose that it's a struct Human, representing information about people. Besides the name it can also have many other data members.
Let there be a container std::vector<Human> vec, where the objects are already sorted by name. Also for clearness suppose that all the names are unique.
The problem is: having some string nameToFind find out if there exists an element in the array having such name.

Solution and my progress:
The obvious and natural solution seems to perform a binary search using the std::binary_search function. But there is a problem: the type of the element being searched (std::string) is different from the type of the elements in the container (Human), and std::binary_search needs a rule to compare these elements. I tried to solve this in three ways, described below. First two are provided just to illustrate the evolution of my solution and the problems which I came across. My main question refers to the third one.

Attempt 1: convert std::string to Human.

Write a comparing function:

bool compareHumansByNames( const Human& lhs, const Human& rhs )
{
   return lhs.name < rhs.name;
}

Then add a constructor which constructs a Human object from std::string:

struct Human
{
   Human( const std::string& s );
   //... other methods

   std::string name;
   //... other members
};

and use the binary_search in following form:

std::binary_search( vec.begin(), vec.end(), nameToFind, compareHumansByNames );

Seems working, but turns up two big problems:
First, how to initialize other data members but Human::name, especially in the case when they don't have a default constructor ? setting magic values may lead to creation of an object which is semantically illegal.
Second, we have to declare this constructor as non explicit to allow implicit conversions during the algorithm. The bad consequences of this are well known.
Also, such a temporary Human object will be constructed at each iteration, which can turn out to be quite expensive.

Attempt 2: convert Human to std::string.

We can try to add an operator string () to the Human class which returns it's name, and then use the comparsion for two std::strings. However, this approach is also inconvenient by the following reasons:

First, the code will not compile at once because of the problem discussed here. We will have to work a bit more to make the compiler use the appropriate operator <.
Second, what does mean "convert a Human to string" ? Existence of such conversion can lead to semantically wrong usage of class Human, which is undesirable.

Attempt 3: compare without conversions.

The best solution I got so far is to create a

struct Comparator
{
   bool operator() ( const Human& lhs, const std::string& rhs )
   {
      return lhs.name < rhs;
   }
   bool operator() ( const std::string& lhs, const Human& rhs )
   {
      return lhs < rhs.name;
   }
};

and use binary search as

binary_search( vec.begin(), vec.end(), nameToFind, Comparator() );

This compiles and executes correctly, everything seems to be ok, but here is where the interesting part begins:

Have a look at http://www.sgi.com/tech/stl/binary_search.html. It's said here that "ForwardIterator's value type is the same type as T.". Quite confusing restriction, and my last solution breaks it. Let's see what does the C++ standard say about it:


25.3.3.4 binary_search

template<class ForwardIterator, class T>
bool binary_search(ForwardIterator first, ForwardIterator last,
const T& value);

template<class ForwardIterator, class T, class Compare>
bool binary_search(ForwardIterator first, ForwardIterator last,
const T& value, Compare comp);

Requires: Type T is LessThanComparable (20.1.2).


Nothing is explicitly said about ForwardIterator's type. But, in definition of LessThanComparable given in 20.1.2 it is said about comparsion of two elements of the same type. And here is what I do not understand. Does it indeed mean that the type of the object being searched and the type of the container's objects must be the same, and my solution breaks this restriction ? Or it does not refer to the case when the comp comparator is used, and only is about the case when the default operator < is used for comparsion ? In first case, I'm confused about how to use std::binary_search to solve this without coming across the problems mentioned above.

Thanks in advance for help and finding time to read my question.

Note: I understand that writing a binary search by hand takes no time and will solve the problem instantly, but to avoid re-inventing a wheel I want to use the std::binary_search. Also it's very interesting to me to find out about existence of such restriction according to standard.

回答1:

If your goal is to find if there is a Human with a given name, then the following should work for sure:

const std::string& get_name(const Human& h)
{
    return h.name;
}

...

bool result = std::binary_search(
    boost::make_transform_iterator(v.begin(), &get_name),
    boost::make_transform_iterator(v.end(), &get_name),
    name_to_check_against);


回答2:

[Complete rewrite; disregard the comments]

The wording has been changed from C++03 to C++0x. In the latter, there is no more requirement for T to be less-than comparable, presumably to alleviate this unnecessary restriction.

The new standard only requires that comp(e, value) implies !comp(value, e). So as long as your comparator implements both directions, you should be able to legally search a string as the value with a comparator functor that implements both asymmetric comparisons (i.e. your "Attempt 3").



回答3:

I think what the standard is saying here is that the expression fucntor(a, b) needs to be a valid strict weak ordering, no matter if the algorithm decides to do something like functor(*begin, *(begin + 1)). Therefore, I think your comparator would need to supply an overload of operator()(Human, Human) in order to be conforming.

That said, I think this is one of those things not explicitly allowed by the standard, but for which few or no implementations exist which take advantage of the latitude offered by the standard.



回答4:

I don't see it requiring anywhere in the standard that the types of the values passed to the comparison function (or to the < operator) by binary_search must be the same. So, formally I think it is perfectly fine to use a comparator that works with two differently types values.