In Tim Peters' answer to "Are there any reasons not to use an ordered dictionary", he says:

    OrderedDict is a subclass of dict.

    It's not a lot slower, but at least doubles the memory over using a plain dict.

Now, while going through a particular question, I tried some sample checks using IPython, and both of them contradict this reasoning:
- both dict and OrderedDict are of the same size
- operating on an OrderedDict easily takes around 7-8 times more time than operating on a dict (hence a lot slower)
Can someone explain to me where I'm going wrong in my reasoning?
Create a large dict and OrderedDict and compare sizes:
import sys
import random
from collections import OrderedDict

test_dict = {}
test_ordered_dict = OrderedDict()

for key in range(10000):
    test_dict[key] = random.random()
    test_ordered_dict[key] = random.random()

sys.getsizeof(test_dict)
786712
sys.getsizeof(test_ordered_dict)
786712
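One thing worth keeping in mind when reading the numbers above: sys.getsizeof is shallow. It reports only what the object's own __sizeof__ method computes, not the memory held by the objects the container references. A small sketch (the variable names here are mine, not from the original post):

```python
import sys

# Two dicts with a single key each; only the referenced values differ in size.
small = {0: "x"}
big = {0: "x" * 10**6}

# getsizeof counts the dict's own structure, not the values it points to,
# so both dicts report the same number of bytes.
print(sys.getsizeof(small) == sys.getsizeof(big))
```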
Check the time taken for the insertions using %timeit:
import sys
import random
from collections import OrderedDict

def operate_on_dict(r):
    test_dict = {}
    for key in range(r):
        test_dict[key] = random.random()

def operate_on_ordered_dict(r):
    test_ordered_dict = OrderedDict()
    for key in range(r):
        test_ordered_dict[key] = random.random()

%timeit for x in range(100): operate_on_ordered_dict(100)
100 loops, best of 3: 9.24 ms per loop
%timeit for x in range(100): operate_on_dict(100)
1000 loops, best of 3: 1.23 ms per loop
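%timeit is an IPython magic, so it won't work in a plain interpreter. Roughly the same measurement can be made with the standard timeit module (this is a sketch of mine; the timings it prints will not match the numbers quoted above):

```python
import random
import timeit
from collections import OrderedDict

def operate_on_dict(r):
    test_dict = {}
    for key in range(r):
        test_dict[key] = random.random()

def operate_on_ordered_dict(r):
    test_ordered_dict = OrderedDict()
    for key in range(r):
        test_ordered_dict[key] = random.random()

# timeit.timeit calls the function `number` times and returns total seconds
t_dict = timeit.timeit(lambda: operate_on_dict(100), number=1000)
t_ordered = timeit.timeit(lambda: operate_on_ordered_dict(100), number=1000)
print(t_dict, t_ordered)
```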
I think the problem with size is due to the fact that there's no __sizeof__ method defined in the Python 2.X implementation of OrderedDict, so it simply falls back to dict's __sizeof__ method.

To prove this, I've created a class A here which extends list, and also added an additional method foo to check whether that affects the size. But the same size is still returned by sys.getsizeof:
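The snippet for A is not reproduced here; a minimal reconstruction matching the description above (extend list, add a foo method, compare reported sizes) might look like:

```python
import sys

class A(list):
    # an extra method, added only to see whether it changes the reported size
    def foo(self):
        return "foo"

plain = list(range(100))
subclassed = A(range(100))

# The answer above reports that both print the same size under Python 2.X,
# since A inherits list's __sizeof__; exact numbers vary by version/platform.
print(sys.getsizeof(plain))
print(sys.getsizeof(subclassed))
```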
Of course A is going to be slow, because its methods run in Python while list's methods run in pure C.

And this seems to be fixed in Python 3, where there's a well-defined __sizeof__ method now:
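As a rough check on a Python 3 interpreter (my own sketch, not from the original post): OrderedDict there provides its own __sizeof__, which accounts for the extra ordering bookkeeping, so the two containers no longer report the same size:

```python
import sys
from collections import OrderedDict

d = {i: None for i in range(10000)}
od = OrderedDict((i, None) for i in range(10000))

# Under Python 3, OrderedDict.__sizeof__ includes the linked-list nodes
# that maintain insertion order, so it reports more memory than a plain dict.
print(sys.getsizeof(d))
print(sys.getsizeof(od))
```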