Given a dictionary of lists, such as
d = {'1':[11,12], '2':[21,21]}
Which is more pythonic or otherwise preferable:
    for k in d:
        for x in d[k]:
            # whatever with k, x
or
    for k, dk in d.iteritems():
        for x in dk:
            # whatever with k, x
or is there something else to consider?
EDIT: In case a list is useful (e.g., because standard dicts don't preserve order), this might be appropriate, although it's much slower.
    d2 = d.items()
    for k, dk in d2:
        for x in dk:
            # whatever with k, x
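If the point of building a list is a predictable traversal order, sorting the items is one option. This is only a minimal sketch, assuming Python 3 and keys that can be compared with each other; the `visited` list is a name I made up so the result can be inspected.

```python
d = {'2': [21, 21], '1': [11, 12]}

# sorted() returns a new list of (key, list) pairs ordered by key,
# so the traversal order no longer depends on the dict itself.
visited = []
for k, dk in sorted(d.items()):
    for x in dk:
        visited.append((k, x))  # stand-in for "whatever with k, x"

print(visited)
```

Sorting costs extra time and memory for the intermediate list, which fits the "much slower" caveat above.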
Here's a speed test, why not:
    import random
    numEntries = 1000000
    d = dict(zip(range(numEntries), [random.sample(range(0, 100), 2) for x in range(numEntries)]))

    def m1(d):
        for k in d:
            for x in d[k]:
                pass

    def m2(d):
        for k, dk in d.iteritems():
            for x in dk:
                pass

    import cProfile

    cProfile.run('m1(d)')

    print

    cProfile.run('m2(d)')

    # Ran 3 trials:
    # m1: 0.205, 0.194, 0.193: average 0.197 s
    # m2: 0.176, 0.166, 0.173: average 0.172 s
    # Method 1 takes 15% more time than method 2
cProfile example output:
             3 function calls in 0.194 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.194    0.194 <string>:1(<module>)
            1    0.194    0.194    0.194    0.194 stackoverflow.py:7(m1)
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

             4 function calls in 0.179 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.179    0.179 <string>:1(<module>)
            1    0.179    0.179    0.179    0.179 stackoverflow.py:12(m2)
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
            1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
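For anyone repeating this comparison on Python 3, here is a minimal sketch that uses `timeit` rather than `cProfile`. It assumes Python 3, where `dict.iteritems()` no longer exists and `d.items()` returns a lazy view. The names `m1` and `m2` mirror the functions above; the smaller `num_entries`, the counters, and the `time_both` helper are my own additions for illustration. Timings depend on the machine and interpreter, so no numbers are claimed here.

```python
import random
import timeit

def m1(d):
    """Iterate keys, then index back into the dict for each list."""
    count = 0
    for k in d:
        for x in d[k]:
            count += 1  # counter added only so the result can be checked
    return count

def m2(d):
    """Iterate (key, list) pairs directly; items() is the Python 3 spelling."""
    count = 0
    for k, dk in d.items():
        for x in dk:
            count += 1
    return count

def time_both(d, repeat=3, number=1):
    """Hypothetical helper: best-of-`repeat` wall time in seconds per method."""
    return {
        f.__name__: min(timeit.repeat(lambda: f(d), repeat=repeat, number=number))
        for f in (m1, m2)
    }

num_entries = 10000  # smaller than the original 1,000,000 to keep the run short
d = {i: random.sample(range(100), 2) for i in range(num_entries)}

print(time_both(d))
```

Taking the minimum over several repeats is a common way to reduce noise from other processes, but it is still worth running the comparison more than once before drawing conclusions.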
I considered a couple of methods:
    import itertools

    COLORED_THINGS = {'blue': ['sky', 'jeans', 'powerline insert mode'],
                      'yellow': ['sun', 'banana', 'phone book/monitor stand'],
                      'red': ['blood', 'tomato', 'test failure']}

    def forloops():
        """ Nested for loops. """
        for color, things in COLORED_THINGS.items():
            for thing in things:
                pass

    def iterator():
        """ Use itertools and list comprehension to construct iterator. """
        for color, thing in (
            itertools.chain.from_iterable(
                [itertools.product((k,), v) for k, v in COLORED_THINGS.items()])):
            pass

    def iterator_gen():
        """ Use itertools and generator to construct iterator. """
        for color, thing in (
            itertools.chain.from_iterable(
                (itertools.product((k,), v) for k, v in COLORED_THINGS.items()))):
            pass
I used ipython and memory_profiler to test performance:
>>> %timeit forloops()
1000000 loops, best of 3: 1.31 µs per loop
>>> %timeit iterator()
100000 loops, best of 3: 3.58 µs per loop
>>> %timeit iterator_gen()
100000 loops, best of 3: 3.91 µs per loop
>>> %memit -r 1000 forloops()
peak memory: 35.79 MiB, increment: 0.02 MiB
>>> %memit -r 1000 iterator()
peak memory: 35.79 MiB, increment: 0.00 MiB
>>> %memit -r 1000 iterator_gen()
peak memory: 35.79 MiB, increment: 0.00 MiB
As you can see, the method had no observable impact on peak memory usage, but nested for
loops were unbeatable for speed (not to mention readability).
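As a quick sanity check that the itertools version visits the same (color, thing) pairs as the nested loops, the sketch below collects the pairs instead of discarding them. It assumes Python 3; `pairs_nested` and `pairs_chained` are names I made up for illustration, and the dictionary is a trimmed copy of `COLORED_THINGS`.

```python
import itertools

COLORED_THINGS = {'blue': ['sky', 'jeans'],
                  'yellow': ['sun', 'banana'],
                  'red': ['blood', 'tomato']}

def pairs_nested(mapping):
    """Nested for loops, written as a generator of (key, item) pairs."""
    for key, items in mapping.items():
        for item in items:
            yield key, item

def pairs_chained(mapping):
    """product() pairs each key with its items; chain.from_iterable flattens."""
    return itertools.chain.from_iterable(
        itertools.product((key,), items) for key, items in mapping.items())

nested = list(pairs_nested(COLORED_THINGS))
chained = list(pairs_chained(COLORED_THINGS))

print(nested == chained)  # True: both walk the same dict in the same order
```

Since both forms produce identical pairs, the choice between them comes down to the speed and readability points made above.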
Here's the nested list comprehension approach:
r = [[i for i in d[x]] for x in d.keys()]
print r
[[11, 12], [21, 21]]
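If a flat list is wanted rather than a list of lists, a comprehension with two `for` clauses does the same job as the nested loops. This is a sketch assuming Python 3 (`print` as a function, `d.items()` in place of `iteritems()`); `nested`, `flat`, and `pairs` are illustrative names.

```python
d = {'1': [11, 12], '2': [21, 21]}

# Nested comprehension: one inner list per key (same shape as the values).
nested = [[x for x in dk] for dk in d.values()]

# Flat comprehension: the for clauses read in the same order as nested loops.
flat = [x for dk in d.values() for x in dk]

# Keep the key alongside each element.
pairs = [(k, x) for k, dk in d.items() for x in dk]

print(nested)
print(flat)
print(pairs)
```

Note that a comprehension builds the whole list in memory, so it suits cases where the collected result is actually needed, not loops run only for their side effects.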
My results from Brionius's code:
             3 function calls in 0.173 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.173    0.173 <string>:1(<module>)
            1    0.173    0.173    0.173    0.173 speed.py:5(m1)
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

             4 function calls in 0.185 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.185    0.185 <string>:1(<module>)
            1    0.185    0.185    0.185    0.185 speed.py:10(m2)
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
            1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}