When working with dictionaries in Python, this page says that the time complexity of iterating through the elements of a dictionary is O(n), where n is the largest size the dictionary has ever been.
However, I don't think there is an obvious way to iterate through the elements of a hash table. Can I assume good performance from dict.iteritems() when iterating through the elements of the dictionary, without too much overhead?
Since dictionaries are used a lot in Python, I assume that this is implemented in some smart way. Still, I need to make sure.
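For what it's worth, here is the kind of rough timing check I have in mind (written in Python 3 syntax, where items() replaces iteritems(); the dict sizes are arbitrary):

    import timeit

    # Time a full pass over dicts of increasing size; if iteration is O(n),
    # the cost per element should stay roughly constant as n grows.
    for n in (10000, 100000, 1000000):
        d = dict((i, i) for i in range(n))
        t = timeit.timeit(lambda: sum(1 for _ in d.items()), number=10)
        print("n=%8d: %.1f ns per element" % (n, t / 10 / n * 1e9))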
If you look at the notes on Python's dictionary source code, I think the relevant points are the following:
Those methods (iteration and key listing) loop over every potential entry
How many potential entries will there be, as a function of the largest size (the largest number of keys ever stored in that dictionary)? Look at the following two sections in the same document:
Maximum dictionary load in PyDict_SetItem. Currently set to 2/3
Growth rate upon hitting maximum load. Currently set to *2.
This would suggest that the fill of the table (the fraction of slots actually in use) stays somewhere between 1/3 and 2/3: it is capped at 2/3, and right after a *2 resize it drops back to 1/3 (or to 1/6, if the growth rate were set to *4). So basically you're going to be checking up to about 3 (or 6, with *4 growth) potential entries for every key.
Of course, whether it's 3 entries per key or 1000, it's still O(n), but 3 seems like a pretty acceptable constant factor.
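If you want to see that bound come out of the two numbers quoted above, here is a toy simulation of the stated policy (resize when the load would exceed 2/3, grow by *2); it is only the arithmetic, not the real CPython resizing code:

    # Toy model of the quoted policy: `slots` doubles whenever the fill
    # would exceed 2/3.  This is just the arithmetic, not CPython itself.
    slots = 8  # CPython's minimum table size
    for keys in range(1, 1000001):
        if keys * 3 > slots * 2:  # load would exceed the 2/3 maximum
            slots *= 2            # growth rate of *2
            print("%8d keys -> %8d slots (%.2f slots per key)"
                  % (keys, slots, slots / float(keys)))

The ratio printed right after each resize settles just under 3, and between resizes it drifts back down toward 1.5 as the table fills up again, which is the 1/3 to 2/3 fill range expressed in slots per key.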
Incidentally, here is the rest of the source and documentation, including that of the DictObject:
http://svn.python.org/projects/python/trunk/Objects/