Is there a flat unsorted map/set implementation?

2019-06-23 17:26发布

问题:

There is the boost.container flat_map and others, and the Loki AssocVector and many others like these which keep the elements sorted.

Is there a modern (c++11 move-enabled, etc.) implementation of an unsorted vector adapted as a map/set?

The idea is to use it for very small maps/sets (less than 20 elements) and with simple keys (for which hashing wouldn't always make sense)

回答1:

Something like this?

template<class Key, class Value, template<class...>class Storage=std::vector>
struct flat_map {
  struct kv {
    Key k;
    Value v;
    template<class K, class V>
    kv( K&& kin, V&& vin ):k(std::forward<K>(kin)), v(std::forward<V>(vin)){}
  };
  using storage_t = Storage<kv>;
  storage_t storage;

  // TODO: adl upgrade
  using iterator=decltype(std::begin(std::declval<storage_t&>()));
  using const_iterator=decltype(std::begin(std::declval<const storage_t&>()));
  // boilerplate:
  iterator begin() {
    using std::begin;
    return begin(storage);
  }
  const_iterator begin() const {
    using std::begin;
    return begin(storage);
  }
  const_iterator cbegin() const {
    using std::begin;
    return begin(storage);
  }
  iterator end() {
    using std::end;
    return end(storage);
  }
  const_iterator end() const {
    using std::end;
    return end(storage);
  }
  const_iterator cend() const {
    using std::end;
    return end(storage);
  }
  size_t size() const {
    return storage.size();
  }
  bool empty() const {
    return storage.empty();
  }
  // these only have to be valid if called:
  void reserve(size_t n) {
    storage.reserve(n);
  }
  size_t capacity() const {
    return storage.capacity();
  }
  // map-like interface:
  // TODO: SFINAE check for type of key
  template<class K>
  Value& operator[](K&& k){
    auto it = find(k);
    if (it != end()) return it->v;
    storage.emplace_back( std::forward<K>(k), Value{} );
    return storage.back().v;
  }
private: // C++14, but you can just inject the lambda at point of use in 11:
  template<class K>
  auto key_match( K& k ) {
    return [&k](kv const& kv){
      return kv.k == k;
    };
  }
public:
  template<class K>
  iterator find(K&& k) {
    return std::find_if( begin(), end(), key_match(k) );
  }
  template<class K>
  const_iterator find(K&& k) const {
    return const_cast<flat_map*>(this)->find(k);
  }
  // iterator-less query functions:
  template<class K>
  Value* get(K&& k) {
    auto it = find(std::forward<K>(k));
    if (it==end()) return nullptr;
    return std::addressof(it->v);
  }
  template<class K>
  Value const* get(K&& k) const {
    return const_cast<flat_map*>(this)->get(std::forward<K>(k));
  }
  // key-based erase: (SFINAE should be is_comparible, but that doesn't exist)
  template<class K, class=std::enable_if_t<std::is_converible<K, Key>{}>>
  bool erase(K&& k) {
    auto it = std::remove(
      storage.begin(), storage.end(), key_match(std::forward<K>(k))
    );
    if (it == storage.end()) return false;
    storage.erase( it, storage.end() );
    return true;
  }
  // classic erase, for iterating:
  iterator erase(const_iterator it) {
    return storage.erase(it);
  }
  template<class K2, class V2,
    class=std::enable_if_t<
      std::is_convertible< K2, Key >{}&&
      std::is_convertible< V2, Value >{}
    >
  >
  void set( K2&& kin, V2&& vin ) {
    auto it = find(kin);
    if (it != end()){
      it->second = std::forward<V2>(vin);
      return;
    } else {
      storage.emplace_back( std::forward<K2>(kin), std::forward<V2>(vin) );
    }
  }
};

I left the container type as a template argument, so you can use a SBO vector-like structure if you choose.

In theory, I should expose a template parameter for replacing equals on the keys. I did, however, make the key-search functions transparent.



回答2:

If the sets are sure to be small then you can just use a std::vector (or std::deque) and do lookup using linear searches. An O(n) linear search over a small vector can be faster than an O(log(n)) search over a more complicated structure such as a red-black tree.

So you could just put elements in a vector without sorting them. You would still need to do some shuffling if you remove elements, but that will always be true for a flat vector-like structure whether it's sorted or not, unless you only ever remove the back element. Why is it important that it's flat anyway?



回答3:

There's std::unordered_set and std::unordered_map but as far as I know they are not implemented using vectors.

A possible option is to write your own hash vector and hash the key using std::hash<Key> and then index the resulting number modulo the length of the vector, but then you'll have to figure out a way to handle collisions and all the resulting problems manually. Not sure I recommended that.

An alternative would be to pass a custom allocator to std::unordered_set and std::unordered_map which perform the allocation on a vector (for example by owning an internal vector), as suggested by @BeyelerStudios.



回答4:

Evgeny Panasyuk is correct, I believe what you want is an Open Address Hash Map.
This fits exactly your requirement, only 1 flat buffer, no allocation of nodes, no pointers to follow, and unsorted.

Otherwise you also have flat_map/AssocVectorbut they are sorted, unlike your requirement.

For OAHM, I have an implementation of a STL-like generic one here:
https://sourceforge.net/projects/cgenericopenaddresshashmap/

Also you might want to take a look the benchmark page of flat_map:
boost::flat_map and its performance compared to map and unordered_map
The OAHM is performing very close to the flat_map in all tests, except iteration.