Lets say I have the following code:
import collections
d = collections.OrderedDict()
d['foo'] = 'python'
d['bar'] = 'spam'
Is there a way I can access the items in a numbered manner, like:
d(0) #foo's Output
d(1) #bar's Output
Lets say I have the following code:
import collections
d = collections.OrderedDict()
d['foo'] = 'python'
d['bar'] = 'spam'
Is there a way I can access the items in a numbered manner, like:
d(0) #foo's Output
d(1) #bar's Output
If its an OrderedDict()
you can easily access the elements by indexing by getting the tuples of (key,value) pairs as follows
>>> import collections
>>> d = collections.OrderedDict()
>>> d['foo'] = 'python'
>>> d['bar'] = 'spam'
>>> d.items()
[('foo', 'python'), ('bar', 'spam')]
>>> d.items()[0]
('foo', 'python')
>>> d.items()[1]
('bar', 'spam')
Note for Python 3.X
dict.items
would return an iterable dict view object rather than a list. We need to wrap the call onto a list in order to make the indexing possible
>>> items = list(d.items())
>>> items
[('foo', 'python'), ('bar', 'spam')]
>>> items[0]
('foo', 'python')
>>> items[1]
('bar', 'spam')
Do you have to use an OrderedDict or do you specifically want a map-like type that's ordered in some way with fast positional indexing? If the latter, then consider one of Python's many sorted dict types (which orders key-value pairs based on key sort order). Some implementations also support fast indexing. For example, the sortedcontainers project has a SortedDict type for just this purpose.
>>> from sortedcontainers import SortedDict
>>> sd = SortedDict()
>>> sd['foo'] = 'python'
>>> sd['bar'] = 'spam'
>>> print sd.iloc[0] # Note that 'bar' comes before 'foo' in sort order.
'bar'
>>> # If you want the value, then simple do a key lookup:
>>> print sd[sd.iloc[1]]
'python'
Here is a special case if you want the first entry (or close to it) in an OrderedDict, without creating a list:
>>> from collections import OrderedDict
>>>
>>> d = OrderedDict()
>>> d["foo"] = "one"
>>> d["bar"] = "two"
>>> d["baz"] = "three"
>>>
>>> d.iteritems().next()
('foo', 'one')
(The first time you say "next()", it really means "first.")
In my informal test in Python 2.7, iteritems().next()
with a small OrderedDict is only a tiny bit faster than items()[0]
. With an OrderedDict of 10,000 entries, iteritems().next()
was about 200 times faster than items()[0]
.
BUT if you save the items() list once and then use the list a lot, that could be faster. Or if you repeatedly { create an iteritems() iterator and step through it to to the position you want }, that could be slower.
It is dramatically more efficient to use IndexedOrderedDict from the indexed
package.
Following Niklas's comment, I have done a benchmark on OrderedDict and IndexedOrderedDict with 1000 entries.
In [1]: from numpy import *
In [2]: from indexed import IndexedOrderedDict
In [3]: id=IndexedOrderedDict(zip(arange(1000),random.random(1000)))
In [4]: timeit id.keys()[56]
1000000 loops, best of 3: 969 ns per loop
In [8]: from collections import OrderedDict
In [9]: od=OrderedDict(zip(arange(1000),random.random(1000)))
In [10]: timeit od.keys()[56]
10000 loops, best of 3: 104 µs per loop
IndexedOrderedDict is ~100 times faster in indexing elements at specific position in this specific case.
This community wiki attempts to collect existing answers.
Python 2.7
In python 2, the keys()
, values()
, and items()
functions of OrderedDict
return lists. Using values
as an example, the simplest way is
d.values()[0] # "python"
d.values()[1] # "spam"
For large collections where you only care about a single index, you can avoid creating the full list using the generator versions, iterkeys
, itervalues
and iteritems
:
import itertools
next(itertools.islice(d.itervalues(), 0, 1)) # "python"
next(itertools.islice(d.itervalues(), 1, 2)) # "spam"
The indexed.py package provides IndexedOrderedDict
, which is designed for this use case and will be the fastest option.
from indexed import IndexedOrderedDict
d = IndexedOrderedDict({'foo':'python','bar':'spam'})
d.values()[0] # "python"
d.values()[1] # "spam"
Using itervalues can be considerably faster for large dictionaries with random access:
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 1000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i:i+1]'
1000 loops, best of 3: 259 usec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 10000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i:i+1]'
100 loops, best of 3: 2.3 msec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 100000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i:i+1]'
10 loops, best of 3: 24.5 msec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 1000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
10000 loops, best of 3: 118 usec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 10000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
1000 loops, best of 3: 1.26 msec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 100000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
100 loops, best of 3: 10.9 msec per loop
$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 1000; d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.19 usec per loop
$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 10000; d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.24 usec per loop
$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 100000; d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.61 usec per loop
+--------+-----------+----------------+---------+
| size | list (ms) | generator (ms) | indexed |
+--------+-----------+----------------+---------+
| 1000 | .259 | .118 | .00219 |
| 10000 | 2.3 | 1.26 | .00224 |
| 100000 | 24.5 | 10.9 | .00261 |
+--------+-----------+----------------+---------+
Python 3.6
Python 3 has the same two basic options (list vs generator), but the dict methods return generators by default.
List method:
list(d.values())[0] # "python"
list(d.values())[1] # "spam"
Generator method:
import itertools
next(itertools.islice(d.values(), 0, 1)) # "python"
next(itertools.islice(d.values(), 1, 2)) # "spam"
Python 3 dictionaries are an order of magnitude faster than python 2 and have similar speedups for using generators.
+--------+-----------+----------------+---------+
| size | list (ms) | generator (ms) | indexed |
+--------+-----------+----------------+---------+
| 1000 | .0316 | .0165 | .00262 |
| 10000 | .288 | .166 | .00294 |
| 100000 | 3.53 | 1.48 | .00332 |
+--------+-----------+----------------+---------+
It's a new era and with Python 3.6.1 dictionaries now retain their order. These semantics aren't explicit because that would require BDFL approval. But Raymond Hettinger is the next best thing (and funnier) and he makes a pretty strong case that dictionaries will be ordered for a very long time.
So now it's easy to create slices of a dictionary:
test_dict = {
'first': 1,
'second': 2,
'third': 3,
'fourth': 4
}
list(test_dict.items())[:2]
Note: Dictonary insertion-order preservation is now official in Python 3.7.