I'll start by illustrating a simple use case:
Consider a social security ID database, modelled in C++ as a std::unordered_map whose key is a person's social security ID and whose value is a std::string holding that person's full name (e.g., std::unordered_map<int, std::string> DB;).
Suppose also that there is a request to print this database sorted in ascending order by the person's ID (i.e., the std::unordered_map's key).
Naively, one would think to use std::sort to sort the std::unordered_map according to the requested criterion and then print it, as in the example code below:
std::sort(DB.begin(), DB.end());
for(auto p : DB) std::cout << "ID(" << p.first
<< ") - "
<< p.second
<< std::endl;
However, this does not work, because using std::sort on a range of either a std::unordered_map or a std::unordered_set raises a compiler error.
Questions:
- Why can't the STL's unordered containers be sorted by std::sort?
- Is there a legitimate and efficient way to sort either a std::unordered_map or a std::unordered_set?
Because unordered containers are already "sorted", albeit not directly by their keys, but (typically) by hash_function(key) % bucket_count() (also accessible as bucket(key)). This "sort" order isn't cosmetic: it's the whole basis on which hash tables are able to find elements quickly. If std::sort were allowed to re-order the elements by key instead, then the container would no longer be able to function as a hash table: elements couldn't be reliably found or erased, insertions might put duplicates in the container, etc.
In the general case, only by first copying the elements to a sortable or sorted container such as std::vector or std::set (the former will usually be faster, but benchmark both if you really care).
In your case with std::unordered_map<int, std::string> DB;, I'd suggest copying only the int keys to a vector for sorting, then during iteration looking up each key in the unordered_map: that will avoid a lot of string copying.
(It is sometimes possible to orchestrate an unordered container with ordering by key (e.g. hash function returns key, container presized so max bucket index >= max key value) but anyone considering such abuse would be better off using a vector.)
Unordered containers store their data internally by hash, so it's not possible to reorder them after the hash has been generated.
In order to sort the data you can use an additional non-hashed container (e.g. map or set) and either keep it alongside the unordered version (so you can use the ordered one to sort the data and the unordered one for fast per-item access) or you can do something like
I recommend not doing the above often (unordered containers have slow sequential access):
https://stackoverflow.com/a/6212709/1938163
Sorting only makes sense for sequence containers, which are containers whose elements are determined by the order in which they were added to the container. The dynamic sequence containers in the standard library are vector, deque, list and forward_list.
Maps and sets, on the other hand, are associative containers, in which elements are identified by their value. Thus it makes no sense to ask for an "ordering", since the container elements aren't arranged in any kind of sequence. (It's true that an ordered map can be iterated in a comparison order on the key, but that order emerges from the container; it is not provided by the user.)