Is the order of items in a hash_map/unordered_map

2020-02-15 07:25发布

问题:

Is it guaranteed that when a hash_map/unordered_map is loaded with the same items, they will have the same order when iterated? Basically I have a hashmap which I load from a file, and from which I periodically feed a limited number of items to a routine, after which I free the hashmap. After the items are consumed, I re-load the same file to the hashmap and want to get the next batch of items after the point where I stopped the previous time. The point at which I stop would be identified by the key.

回答1:

Technically no, they are not guaranteed to be in any particular order.

In practice however, given that you use deterministic hash function, what you want to do should be fine.

consider

std::string name;
std::string value;

std::unordered_map <std::string, std::string> map1;
std::unordered_map <std::string, std::string> map2;

while (read_pair (name, value))
{
    map1[name] = value;
    map2[name] = value;
}

you can reasonably expect that name-value pairs in map1 and map2 go in the same order.



回答2:

No, you cannot safely do this. Firstly, it's not guaranteed by the standard, but even if you ignore the standard and look at real implementations, it is a bad idea.

Most hash table structures are not history independent. That is: the state of the hash table depends not only on what items it contains, but also what order they were inserted.

Here is a concrete example:

#include <unordered_map>
#include <string>
#include <iostream>

static const char* const NAMES[] = {
    "joe",
    "bob",
    "alexander",
    "warren",
    "paul",
    "michael",
    "george",
    "david",
    "peter"
};
static const int NAME_COUNT = sizeof(NAMES)/sizeof(NAMES[0]);

static void print_umap(const std::unordered_map<std::string, int>& m) {
    for (const auto& item : m) {
        std::cout << "  " << item.first << "\n";
    }
}

int main(void) {
    std::unordered_map<std::string, int> a;
    std::unordered_map<std::string, int> b;
    std::unordered_map<std::string, int> c;

    for (int i = 0; i < NAME_COUNT; ++i) {
        a[NAMES[i]] = 0;
        b[NAMES[NAME_COUNT - 1 - i]] = 0;
    }

    for (const auto& item : a) {
        c[item.first] = 0;
    }

    std::cout << "a:\n";
    print_umap(a);
    std::cout << "\n\nb:\n";
    print_umap(b);
    std::cout << "\n\nc:\n";
    print_umap(c);
    return 0;
}

When I build this using clang and the libc++ implementation of the C++ standard library, I get the following output:

a:
  peter
  george
  michael
  david
  paul
  bob
  warren
  alexander
  joe


b:
  joe
  alexander
  bob
  warren
  david
  paul
  michael
  george
  peter


c:
  joe
  alexander
  warren
  bob
  paul
  david
  michael
  george
  peter

Note that the order is different in every case. This is not at all unusual for hash tables.