I have a list of dicts like this:
sales_per_store_per_day = [
{'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
{'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
{'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
{'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
]
How could I reduce this list to have a sum of products for each store, ignoring the date? The result for the above input would be:
sales_per_store = [
{'store':'a', 'product1':40, 'product2':8, 'product3':32},
{'store':'b', 'product1':60, 'product2':10, 'product3':34}
]
Use a collections.defaultdict() to track info per store, and collections.Counter() to ease summing of the numbers:
from collections import defaultdict, Counter
by_store = defaultdict(Counter)
for info in sales_per_store_per_day:
    counts = Counter({k: v for k, v in info.items() if k not in ('store', 'date')})
    by_store[info['store']] += counts
sales_per_store = [dict(v, store=k) for k, v in by_store.items()]
counts is a Counter() instance built from the product entries in the info dictionary; I'm assuming that every key except store and date holds a product count. A dict comprehension produces a copy with those two keys removed. by_store[info['store']] looks up the current totals for the given store (defaulting to a new, empty Counter() object), and += adds that row's counts to them.
The last line then produces your desired output: new dictionaries with a 'store' key and per-product counts. You may want to just keep the dictionary mapping from store to Counter objects instead.
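One caveat worth knowing: adding Counter objects with + or += discards any key whose total is zero or negative, so a product that a store never sold would vanish from the result. If that matters for your data, Counter.update() adds the counts without dropping anything. A minimal sketch of that variant follows; the sum_by_store name, the skip parameter and the sample rows are made up for illustration and are not part of the question.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: store 'a' never sells product2.
sales = [
    {'date': '2014-06-01', 'store': 'a', 'product1': 10, 'product2': 0},
    {'date': '2014-06-02', 'store': 'a', 'product1': 30, 'product2': 0},
]

def sum_by_store(rows, skip=('store', 'date')):
    """Sum every non-skipped key per store, keeping zero totals."""
    by_store = defaultdict(Counter)
    for info in rows:
        counts = {k: v for k, v in info.items() if k not in skip}
        # update() adds counts in place and keeps keys whose total is 0,
        # whereas += would remove them.
        by_store[info['store']].update(counts)
    return [dict(v, store=k) for k, v in by_store.items()]

totals = sum_by_store(sales)
print(totals)
```

With the += version, the 'product2' key would be missing from the dictionary for store 'a'.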
Demo:
>>> from collections import defaultdict, Counter
>>> sales_per_store_per_day = [
... {'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
... {'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
... {'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
... {'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
... ]
>>> by_store = defaultdict(Counter)
>>> for info in sales_per_store_per_day:
...     counts = Counter({k: v for k, v in info.items() if k not in ('store', 'date')})
...     by_store[info['store']] += counts
...
>>> [dict(v, store=k) for k, v in by_store.items()]
[{'store': 'a', 'product3': 32, 'product2': 8, 'product1': 40}, {'store': 'b', 'product3': 34, 'product2': 10, 'product1': 60}]
Version without collections - maybe more readable for beginners.
sales_per_store_per_day = [
{'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
{'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
{'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
{'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
]
results = {}
for x in sales_per_store_per_day:
    # default value: start every store with zeroed product totals
    if x['store'] not in results:
        results[x['store']] = {'store': x['store'], 'product1': 0, 'product2': 0, 'product3': 0}
    # add this row's sales to the store's running totals
    results[x['store']]['product1'] += x['product1']
    results[x['store']]['product2'] += x['product2']
    results[x['store']]['product3'] += x['product3']
print(results)
# list() so the result is a real list on Python 3 as well
sales_per_store = list(results.values())
print(sales_per_store)
# results
{
'a': {'product3': 32, 'product1': 40, 'store': 'a', 'product2': 8},
'b': {'product3': 34, 'product1': 60, 'store': 'b', 'product2': 10}
}
# sales_per_store
[
{'product3': 32, 'product1': 40, 'store': 'a', 'product2': 8},
{'product3': 34, 'product1': 60, 'store': 'b', 'product2': 10}
]
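The version above hardcodes the three product names. If the set of products can change, the same plain-dict approach can pick up whatever keys each row contains. This is only a sketch under the same assumption as the other answer (every key other than 'store' and 'date' is a numeric product count); the sum_sales_by_store name and the skip parameter are made up for illustration.

```python
sales_per_store_per_day = [
    {'date': '2014-06-01', 'store': 'a', 'product1': 10, 'product2': 3, 'product3': 15},
    {'date': '2014-06-01', 'store': 'b', 'product1': 20, 'product2': 4, 'product3': 16},
    {'date': '2014-06-02', 'store': 'a', 'product1': 30, 'product2': 5, 'product3': 17},
    {'date': '2014-06-02', 'store': 'b', 'product1': 40, 'product2': 6, 'product3': 18},
]

def sum_sales_by_store(rows, skip=('store', 'date')):
    """Sum every non-skipped key per store using plain dicts only."""
    results = {}
    for row in rows:
        # Fetch the store's running totals, creating them on first sight.
        totals = results.setdefault(row['store'], {'store': row['store']})
        for key, value in row.items():
            if key not in skip:
                # get() supplies 0 the first time a product appears.
                totals[key] = totals.get(key, 0) + value
    return list(results.values())

sales_per_store = sum_sales_by_store(sales_per_store_per_day)
print(sales_per_store)
```

A product that appears in only some rows is simply summed over the rows where it exists.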