Avoiding copies when using structs in STL sets and

2019-07-04 21:25发布

Imagine you have a struct with a bunch of members, and you want to use a particular value referenced via one of its members as the key in a set, like so:

class ComplexClass {
 public:
  const string& name() const;
  // tons of other stuff
};
struct MyStruct {
  ComplexClass* c;
  MoreStuff* x;
};
struct CmpMyStruct {
  bool operator()(const MyStruct& lhs, const MyStruct& rhs) {
    return lhs.c->name() < rhs.c->name();
  }
};
typedef set<MyStruct, CmpMyStruct> MySet;
MySet my_set;

This works just fine, however now I'd like to do a lookup by a string name, but my_set.find() now takes of course a 'const MyStruct&'. If the name wasn't taken out of that ComplexClass but was instead a member of MyStruct, I could just quickly fake up an instance of MyStruct and use that:

MyStruct tmp_for_lookup;
tmp_for_lookup.name = "name_to_search";  // Doesn't work of course
MySet::iterator iter =  my_set.find(tmp_for_lookup);

However, as said, that's not how it works, the name is in ComplexClass, so I'd have to at least put a mock of that in there or something.

So what I'd actually want is that the STL set wouldn't compare MyStructs but rather first "project" the key out of the MyStruct (which has type string), and then do its operations, including find(), on that. I started digging into the implementation of set/map in gcc to see how they solved that problem for the map, and was saddened to see that they actually solved it in the internal _Rb_tree, but didn't expose it, since it's not part of the standard. From gcc's stl_tree.h:

template<typename _Key, typename _Val, typename _KeyOfValue,
         typename _Compare, typename _Alloc = allocator<_Val> >
  class _Rb_tree
  {
....
template<typename _Key, typename _Val, typename _KeyOfValue,
         typename _Compare, typename _Alloc>
  typename _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::iterator
  _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::
  find(const _Key& __k)

And then in the stl_map.h:

    typedef _Rb_tree<key_type, value_type, _Select1st<value_type>,
             key_compare, _Pair_alloc_type> _Rep_type;

Note how it uses '_Select1st' to project the key out of the value_type, so find() can actually just work with the key. On the other hand the stl_set.h just uses the Identity in that case, as expected.

So I was wondering, is there a way that I'm currently missing for how I could achieve the same beauty & efficiency with the normal STL sets/maps (i.e. I absolutely don't want to use the GCC-specific _Rb_tree directly), such that I could really just do

MySet::iterator iter = my_set.find("somestring");

Note that I specifically don't want to change my_set to be a map from strings to MyStructs, i.e. I do not want to copy the string (or a reference to it) out of the ComplexClass just so I could do map<string, MyStruct> or map<const string&, MyStruct> instead.

This is almost more of a thought-exercise at this point, but seemed interesting :)

标签: c++ stl set
5条回答
\"骚年 ilove
2楼-- · 2019-07-04 21:52

AFAIK, there's no way with the C++ standard library's std::set only, but you can use std::find_if with a predicate

struct cmp_for_lookup {
    std::string search_for;
    cmp_for_lookup(const std::string &s) : search_for(s) {}
    bool operator()(const MyStruct &elem) { return elem.c->name() == search_for; }
};

std::find_if(my_set.begin(), my_set.end(), cmp_for_lookup("name_to_search"));
查看更多
等我变得足够好
3楼-- · 2019-07-04 21:53

now I'd like to do a lookup by a string name, but my_set.find() now takes of course a 'const MyStruct&'.

This is a well known deficiency of the interface of std::set.

boost::multi_index with ordered_unique index provides the interface of std::set with extra find() functions that take a comparable key, rather than a whole set element.

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>

struct ComplexClass {
    std::string name() const;

    struct KeyName {
        typedef std::string result_type;
        std::string const& operator()(ComplexClass const& c) const {
            return c.name();
        }
    };
};

namespace mi = boost::multi_index;

typedef mi::multi_index_container<
      ComplexClass
    , mi::indexed_by<
            mi::ordered_unique<ComplexClass::KeyName>
          >
    > ComplexClassSet;

int main() {
    ComplexClassSet s;
    // fill the set
    // ...
    // now search by name
    ComplexClassSet::iterator found = s.find("abc");
    if(found != s.end()) {
        // found an element whose name() == "abc"
    }
}

See http://www.boost.org/doc/libs/1_52_0/libs/multi_index/doc/tutorial/key_extraction.html for more details.

查看更多
欢心
4楼-- · 2019-07-04 21:54

Obviously, since boost is out of the question, and std::set is not intended to provide multiple indexes, you're bound to build your own extra index.

std::string name_of_struct(const MyStruct* s){
   return s->c->name();
}


...
std::map<std::string, MyStruct*> indexByName;
std::transform( my_set.begin(), my_set.end(), 
                std::inserter(indexByName,indexByName.begin()),
                &name_of_struct );

 ...
 MyStruct* theFoundStruct=indexByName("theKey");

(note: the idea will work, but the compiler will still complain since this is top-of-my-head).

查看更多
孤傲高冷的网名
5楼-- · 2019-07-04 21:56

Remember that the keys in an associative container (set or map) are const, because they are sorted as they are inserted and cannot be re-sorted later. So any solution that actually references the inserted object will be vulnerable to the possibility that you change the key member after inserting.

So, the most general solution is to use a map and copy the key member.

If the keys won't change, then you can use pointers as keys. Lookup using a temporary will require getting a pointer to temporary, but that's OK as find and operator[] won't keep a copy of their argument.

#include <map>
#include <string>

/* Comparison functor for maps that don't own keys. */
struct referent_compare {
    template< typename t >
    bool operator () ( t *lhs, t *rhs )
        { return * lhs < * rhs; }
};

/* Convenience function to get a const pointer to a temporary. */
template< typename t >
t const *temp_ptr( t const &o ) { return &o; }

struct s {
    std::string name;
    long birthdate;
    double score;
};

/* Maps that don't own keys.
   Generalizing this to a template adapter left as an exercise. */
std::map< std::string const *, s, referent_compare > byname;
std::map< double const *, s, referent_compare > byscore;

int main() {
    s bob = { "bob", 12000000, 96.3 };
    byname.insert( std::make_pair( & bob.name, bob ) );
    byscore.insert( std::make_pair( & bob.score, bob ) );

    byname[ temp_ptr< std::string >( "bob" ) ].score = 33.1;
}

Of course, another solution is to use a set of pointers and define the comparison function to access the member. Pointer-to-members can be used to generalize that so the key member is merely a template parameter to the comparison functor.

http://ideone.com/GB2jfn

查看更多
Explosion°爆炸
6楼-- · 2019-07-04 22:11

If you can handle the overhead of a virtual call on comparisson, you could use a trick like this:

class ComplexClass {
 public:
  const string& name() const;
  // tons of other stuff
};
struct MyStruct {
  ComplexClass* c;
  MoreStuff* x;
  virtual const string& key() const { return c->name(); } /* change */
};
struct CmpMyStruct {
  bool operator()(const MyStruct& lhs, const MyStruct& rhs) {
    return lhs.key() < rhs.key(); /* change */
  }
};
typedef set<MyStruct, CmpMyStruct> MySet;
MySet my_set;

Then, for lookup, add the following struct:

struct MyLookupStruct : MyStruct {
  string search_key;
  explicit MyLookupStruct(const string& key) : search_key(key) {}
  virtual const string& key() const { return search_key; }
};
/* .... */
MySet::iterator iter =  my_set.find(MyLookupStruct("name to find"));

This depends on std::set<>::find not making a copy of the argument, which seems a reasonable assumption (but to my knowledge not explicitly guaranteed).

查看更多
登录 后发表回答