How expensive is comparing two unordered sets for

2019-06-16 06:44发布

问题:

Given two std::sets, one can simply iterate through both sets simultaneously and compare the elements, resulting in linear complexity. This doesn't work for std::unordered_sets, because the elements may be stored in any order. So how expensive is a == b for std::unordered_set?

回答1:

Complexity of operator== and operator!=:

Linear complexity in the average case. N2 in the worst case, where N is the size of the container.

More details in the standard §23.2.5, point 11:

For unordered_set and unordered_map, the complexity of operator== (i.e., the number of calls to the == operator of the value_type, to the predicate returned by key_equal(), and to the hasher returned by hash_function()) is proportional to N in the average case and to N2 in the worst case, where N is a.size().



回答2:

The very worst case is O(n²).

But unordered sets are in fact ordered by hash. So it is possible to compare the hashes (if this fails the sets cannot be equal) and then verify that same hashes (linear) have real same values (O(n²) for different values with the same hash) behind.

In the best case this is O(n).

Normally the complexity tends to O(n) if the hash function is "good" (different objects -> always different hash) and to O(n²) if the hash function is "bad" (everything always has the same hash value)