I am very confused by the name 'unordered_map'. The name suggests that the keys are not ordered at all. But I always thought they are ordered by their hash value. Or is that wrong (because the name implies that they are not ordered)?
Or to put it different: Is this
typedef map<K, V, HashComp<K> > HashMap;
with
template<typename T>
struct HashComp {
bool operator<(const T& v1, const T& v2) const {
return hash<T>()(v1) < hash<T>()(v2);
}
};
the same as
typedef unordered_map<K, V> HashMap;
? (OK, not exactly, STL will complain here because there may be keys k1,k2 and neither k1 < k2 nor k2 < k1. You would need to use multimap
and overwrite the equal-check.)
Or again differently: When I iterate through them, can I assume that the key-list is ordered by their hash value?
In answer to your edited question, no those two snippets are not equivalent at all.
std::map
stores nodes in a tree structure,unordered_map
stores them in a hashtable*.Keys are not stored in order of their "hash value" because they're not stored in any order at all. They are instead stored in "buckets" where each bucket corresponds to a range of hash values. Basically, the implementation goes like this:
Obviously that's a serious simplification and real implementation would be much more advanced (for example, supporting resizing the
buckets
array, maybe using a tree structure instead of linked list for the buckets, and so on), but that should give an idea of how you can't get back the values in any particular order. See wikipedia for more information.* Technically, the internal implementation of
std::map
andunordered_map
are implementation-defined, but the standard requires certain Big-O complexity for operations that implies those internal implementationsAs the name unordered_map suggests, no ordering is specified by the C++0x standard. An unordered_map's apparent ordering will be dependent on whatever is convenient for the actual implementation.
You are right,
unordered_map
is actually hash ordered. Note that most current implementations (pre TR1) call ithash_map
.The IBM C/C++ compiler documentation remarks that if you have an optimal hash function, the number of operations performed during lookup, insertion, and removal of an arbitrary element does not depend on the number of elements in the sequence, so this mean that the order is not so unordered...
Now, what does it mean that it is hash ordered? As an hash should be unpredictable, by definition you can't take any assumption about the order of the elements in the map. This is the reason why it has been renamed in TR1: the old name suggested an order. Now we know that an order is actually used, but you can disregard it as it is unpredictable.
If you want an analogy, look at the RDBMS of your choice.
If you don't specify an ORDER BY clause when performing a query, the results are returned "unordered" - that is, in whatever order the database feels like. The order is not specified, and the system is free to "order" them however it likes in order to get the best performance.
"Unordered" doesn't mean that there isn't a linear sequence somewhere in the implementation. It means "you can't assume anything about the order of these elements".
For example, people often assume that entries will come out of a hash map in the same order they were put in. But they don't, because the entries are unordered.
As for "ordered by their hash value": hash values are generally taken from the full range of integers, but hash maps don't have 2**32 slots in them. The hash value's range will be reduced to the number of slots by taking it modulo the number of slots. Further, as you add entries to a hash map, it might change size to accommodate the new values. This can cause all the previous entries to be re-placed, changing their order.
In an unordered data structure, you can't assume anything about the order of the entries.