Will a Python dict with integers as keys be natura

Posted 2019-01-27 13:46

Question:

If I create a Python dict which uses integers as keys, can I safely assume that iterating over the dict will retrieve items in order according to key value?

i.e. will

my_dict = {}
for x in range(0, 100):
    my_dict[x] = str(x)

for item in my_dict.items():
    print(item)

always result in printing the list in order of key value?

Answer 1:

In short, no. I'm betting you noticed that dictionaries use the hashes of keys as indexes into an array and, since ints hash to their own values, inferred that items would end up in key order when the keys are integers. While the first two parts of that statement are true, the inference is not, even as an undocumented side effect. The array indexes are derived from the hashes of the keys, but are not the complete hashes. This means that even with integer keys, entries can land out of order: two keys can collide at the same location, or their hash-derived indexes can simply be "out of order" relative to the keys themselves.

Basically, think of the index into the dict's internal storage array as some number of low-order bits of the key's hash. Just because one number is larger than another doesn't mean that a value built from its truncated low-order bits will be larger, or even different.
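
To make the truncation concrete, here is a minimal sketch of a toy model. It assumes a table of 8 slots where the slot is simply the low 3 bits of the key's hash. The names TABLE_SIZE and slot_for are made up for illustration, and a real dict also handles collisions and resizing, which this sketch ignores.

```python
# Toy model only: slot index = low-order bits of the key's hash.
TABLE_SIZE = 8  # hypothetical table size; a power of two so masking works


def slot_for(key, table_size=TABLE_SIZE):
    """Return the slot a key would map to in this simplified model."""
    return hash(key) & (table_size - 1)


keys = [1, 8, 9, 17]
slots = [slot_for(k) for k in keys]

for k, s in zip(keys, slots):
    print(k, "->", s)
```

In this model the key 8 maps to a lower slot than the key 1, and the keys 1, 9 and 17 all map to the same slot, so walking the slots in order would not visit the keys in ascending order.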



Answer 2:

No, Python dictionaries do not have an inherent key ordering, regardless of the key values. If you need ordering, stick to arrays or lists or, better yet, check out pandas, which offers dictionary-like lookup by key as well as many other powerful features (http://pandas.pydata.org/pandas-docs/stable/10min.html).



Answer 3:

No, you cannot. Always sort if you want to iterate in an ordered fashion.
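
A minimal sketch of what "always sort" can look like, using only the built-in sorted(). The sample keys are arbitrary and chosen to be inserted out of order.

```python
my_dict = {}
for x in (42, 7, 19, 3):
    my_dict[x] = str(x)

# sorted() over a dict yields its keys in ascending order.
for key in sorted(my_dict):
    print(key, my_dict[key])

# Alternatively, sort the (key, value) pairs; tuples compare by key first.
ordered_items = sorted(my_dict.items())
```

Either form leaves the dict itself untouched and only orders the iteration.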



Answer 4:

I don't think so. You have to use collections.OrderedDict to guarantee an ordering. Note, however, that it keeps the entries in the order they were added; it does not sort them by key.
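
A short sketch of that distinction, with arbitrary sample keys. Rebuilding the OrderedDict from sorted items is one possible way to get key order, offered here as an illustration rather than the only approach.

```python
from collections import OrderedDict

od = OrderedDict()
for x in (42, 7, 19, 3):
    od[x] = str(x)

# Iteration follows the order in which keys were added.
insertion_order = list(od)

# To get key order, build a new OrderedDict from the sorted items.
by_key = OrderedDict(sorted(od.items()))
sorted_order = list(by_key)

print(insertion_order)
print(sorted_order)
```

Keys added to by_key afterwards would again be appended at the end, so it stays sorted only until the next insertion.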



Answer 5:

Python dictionaries are not ordered in any meaningful way; they are hash tables.

Python comes with collections.OrderedDict, but this preserves the order of insertion, not the order of the keys.

Here are two dictionary-like modules that sort by keys:

https://pypi.python.org/pypi/treap/

https://pypi.python.org/pypi/red-black-tree-mod/

Some say that treaps are faster on average than red-black trees, but that red-black trees have a lower standard deviation in operation times. Others question this, though in my tests the former proved true.

Both treaps and red-black trees do almost everything in O(log n) time, but they keep their keys in order at all times. Python dictionaries are O(1) for most operations. However, getting all keys in order is O(n) for treaps and red-black trees, while it's O(n log n) for dictionaries, because the keys have to be sorted first.

When should you use which?

  1. If you're sorting in a loop, you're probably better off with a treap or red-black tree.
  2. If you're sorting once at the end of your program or something, you're probably better off with list_ = list(dict_); list_.sort()
  3. If you're preserving the order of your inputs, like from a config file or something, you're probably best off with OrderedDict.
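
For the first case, the API of the two tree modules is not shown here, so the sketch below substitutes a plain sorted list maintained with the standard library's bisect module. This is a stand-in, not a treap or red-black tree: finding the position is O(log n), but the insertion itself is O(n) because list elements get shifted. The names values, sorted_keys and insert are illustrative.

```python
import bisect

values = {}       # ordinary dict for key -> value lookups
sorted_keys = []  # kept in ascending order at all times


def insert(key, value):
    """Add or update a key while keeping sorted_keys ordered."""
    if key not in values:
        bisect.insort(sorted_keys, key)
    values[key] = value


snapshots = []
for x in (42, 7, 19, 3):
    insert(x, str(x))
    # The keys are already in order after every step; no re-sort is needed.
    snapshots.append(list(sorted_keys))

print(snapshots[-1])
```

The point is the same as with the tree-based modules: when ordered keys are needed on every pass of a loop, keeping them ordered incrementally avoids calling sort() each time.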

HTH