How to reduce/aggregate a list of dicts per multip

Posted 2020-07-23 05:09

I have a list of dicts like this:

sales_per_store_per_day = [
   {'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
   {'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
   {'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
   {'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
]

How could I reduce this list to have a sum of products for each store, ignoring the date? The result for the above input would be:

sales_per_store = [
   {'store':'a', 'product1':40, 'product2':8, 'product3':32},
   {'store':'b', 'product1':60, 'product2':10, 'product3':34}
]

2 answers
Lonely孤独者°
#2 · 2020-07-23 05:34

A version without collections, which may be more readable for beginners.

sales_per_store_per_day = [
   {'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
   {'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
   {'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
   {'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
]

results = {}

for x in sales_per_store_per_day:

    # default value
    if x['store'] not in results:
        results[x['store']] = {'store': x['store'], 'product1':0, 'product2':0, 'product3':0}

    results[x['store']]['product1'] += x['product1']
    results[x['store']]['product2'] += x['product2']
    results[x['store']]['product3'] += x['product3']

print(results)

# list() gives a real list rather than a dict view in Python 3
sales_per_store = list(results.values())

print(sales_per_store)


# results
{
  'a': {'product3': 32, 'product1': 40, 'store': 'a', 'product2': 8}, 
  'b': {'product3': 34, 'product1': 60, 'store': 'b', 'product2': 10}
}

# sales_per_store
[
  {'product3': 32, 'product1': 40, 'store': 'a', 'product2': 8}, 
  {'product3': 34, 'product1': 60, 'store': 'b', 'product2': 10}
]
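The loop above names each product explicitly. If the set of products can change, the same plain-dict idea can be sketched without hardcoding them. This is only an illustration: the helper name sum_per_store and its group_key and skip_keys parameters are made up here, and it assumes every field other than the store and date is a number to be summed.

```python
def sum_per_store(rows, group_key='store', skip_keys=('date',)):
    """Sum every remaining field per store (hypothetical helper)."""
    results = {}
    for row in rows:
        store = row[group_key]
        # Create the per-store totals dict the first time a store is seen.
        totals = results.setdefault(store, {group_key: store})
        for key, value in row.items():
            if key == group_key or key in skip_keys:
                continue
            # Assumption: everything left is a numeric product count.
            totals[key] = totals.get(key, 0) + value
    return list(results.values())


sales_per_store_per_day = [
    {'date': '2014-06-01', 'store': 'a', 'product1': 10, 'product2': 3, 'product3': 15},
    {'date': '2014-06-01', 'store': 'b', 'product1': 20, 'product2': 4, 'product3': 16},
    {'date': '2014-06-02', 'store': 'a', 'product1': 30, 'product2': 5, 'product3': 17},
    {'date': '2014-06-02', 'store': 'b', 'product1': 40, 'product2': 6, 'product3': 18},
]

sales_per_store = sum_per_store(sales_per_store_per_day)
print(sales_per_store)
```

Passing a different skip_keys tuple would exclude other non-numeric fields in the same way.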
做个烂人
#3 · 2020-07-23 05:49

Use a collections.defaultdict() to track info per store, and collections.Counter() to ease summing of the numbers:

from collections import defaultdict, Counter

by_store = defaultdict(Counter)

for info in sales_per_store_per_day:
    counts = Counter({k: v for k, v in info.items() if k not in ('store', 'date')})
    by_store[info['store']] += counts

sales_per_store = [dict(v, store=k) for k, v in by_store.items()]

counts is a Counter() instance built from the product entries in the info dictionary; I'm assuming that every key except store and date is a product count. A dict comprehension produces a copy with those two keys removed. by_store[info['store']] looks up the running totals for the given store (defaulting to a new, empty Counter() object), and += adds that day's counts to them.

The last line then produces your desired output: new dictionaries with 'store' and per-product counts. You may prefer to just keep the dictionary mapping from store to Counter objects.
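As a rough sketch of why keeping the mapping can be handy: by_store supports direct lookup by store, and each value is still a Counter, so a product that was never sold reads as 0 and most_common() is available. The product name 'product9' below is invented for the example.

```python
from collections import Counter, defaultdict

sales_per_store_per_day = [
    {'date': '2014-06-01', 'store': 'a', 'product1': 10, 'product2': 3, 'product3': 15},
    {'date': '2014-06-01', 'store': 'b', 'product1': 20, 'product2': 4, 'product3': 16},
    {'date': '2014-06-02', 'store': 'a', 'product1': 30, 'product2': 5, 'product3': 17},
    {'date': '2014-06-02', 'store': 'b', 'product1': 40, 'product2': 6, 'product3': 18},
]

by_store = defaultdict(Counter)
for info in sales_per_store_per_day:
    by_store[info['store']] += Counter(
        {k: v for k, v in info.items() if k not in ('store', 'date')}
    )

# Total for one product in one store.
product1_in_a = by_store['a']['product1']

# A Counter returns 0 for a key it has never seen ('product9' is made up).
unknown_in_a = by_store['a']['product9']

# Best-selling product of store 'b' as a list with one (name, total) pair.
best_in_b = by_store['b'].most_common(1)

print(product1_in_a, unknown_in_a, best_in_b)
```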

Demo:

>>> from collections import defaultdict, Counter
>>> sales_per_store_per_day = [
...    {'date':'2014-06-01', 'store':'a', 'product1':10, 'product2':3, 'product3':15},
...    {'date':'2014-06-01', 'store':'b', 'product1':20, 'product2':4, 'product3':16},
...    {'date':'2014-06-02', 'store':'a', 'product1':30, 'product2':5, 'product3':17},
...    {'date':'2014-06-02', 'store':'b', 'product1':40, 'product2':6, 'product3':18},
... ]
>>> by_store = defaultdict(Counter)
>>> for info in sales_per_store_per_day:
...     counts = Counter({k: v for k, v in info.items() if k not in ('store', 'date')})
...     by_store[info['store']] += counts
... 
>>> [dict(v, store=k) for k, v in by_store.items()]
[{'store': 'a', 'product3': 32, 'product2': 8, 'product1': 40}, {'store': 'b', 'product3': 34, 'product2': 10, 'product1': 60}]
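One caveat worth knowing: adding Counters with += keeps only positive counts, so zero or negative entries are discarded after each addition. That makes no difference for the data in the question, but if the figures can be zero or negative (returns, for example), Counter.update() adds the values without discarding anything. The rows below are made-up data, used only to show the difference.

```python
from collections import Counter, defaultdict

# Hypothetical rows: product2 never sells, product3 has a return of 2 on day one.
rows = [
    {'date': '2014-06-01', 'store': 'a', 'product1': 10, 'product2': 0, 'product3': -2},
    {'date': '2014-06-02', 'store': 'a', 'product1': 5, 'product2': 0, 'product3': 5},
]


def product_counts(info):
    """Copy of a row without the 'store' and 'date' keys."""
    return {k: v for k, v in info.items() if k not in ('store', 'date')}


with_add = defaultdict(Counter)
with_update = defaultdict(Counter)
for info in rows:
    # += drops any entry whose running total is zero or negative.
    with_add[info['store']] += Counter(product_counts(info))
    # update() adds the counts and keeps zero and negative totals.
    with_update[info['store']].update(product_counts(info))

print(dict(with_add['a']))     # product2 is missing, product3 lost the -2
print(dict(with_update['a']))  # all three products, product3 totals 3
```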