Question:
Suppose I have the following list:
[{'name': 'Amy', 'count': 1}, {'name': 'Amy', 'count': 2}, {'name': 'Peter', 'count': 1}]
How could I group it by name and sum the counts to get the following output:
[{'name': 'Amy', 'count': 3}, {'name': 'Peter', 'count': 1}]
Thanks.
Answer 1:
You can use a collections.Counter:
from collections import Counter
l = [
{'name': 'Amy', 'count': 1},
{'name': 'Amy', 'count': 2},
{'name': 'Peter', 'count': 1}
]
c = Counter()
for v in l:
    c[v['name']] += v['count']
Result:
>>> c
Counter({'Amy': 3, 'Peter': 1})
>>> [{'name': name, 'count': count} for name, count in c.items()]
[{'count': 3, 'name': 'Amy'}, {'count': 1, 'name': 'Peter'}]
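The loop above can be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the function name sum_counts and its key/value parameters are made up for illustration, and it assumes Python 3.7+, where a Counter (a dict subclass) keeps keys in first-insertion order.

```python
from collections import Counter

def sum_counts(records, key='name', value='count'):
    """Group dicts by `key` and sum their `value` fields (hypothetical helper)."""
    totals = Counter()
    for record in records:
        totals[record[key]] += record[value]
    # Rebuild the list-of-dicts shape used in the question.
    return [{key: k, value: total} for k, total in totals.items()]

l = [
    {'name': 'Amy', 'count': 1},
    {'name': 'Amy', 'count': 2},
    {'name': 'Peter', 'count': 1},
]
result = sum_counts(l)
print(result)
```

Passing the field names as parameters keeps the helper usable for other keys, at the cost of a slightly longer signature.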
Answer 2:
Alternatively, you can use the pandas groupby function:
import pandas as pd
df = pd.DataFrame([{'name': 'Amy', 'count': 1},
                   {'name': 'Amy', 'count': 2},
                   {'name': 'Peter', 'count': 1}])
df.groupby("name").sum()
       count
name
Amy        3
Peter      1
Answer 3:
You could pivot the list using a defaultdict, as explained in the docs:
>>> import collections
>>> l = [{'name': 'Amy', 'count': 1},
...      {'name': 'Amy', 'count': 2},
...      {'name': 'Peter', 'count': 1}]
>>> # Pivot operation
>>> pivot = collections.defaultdict(list)
>>> for item in l:
...     pivot[item['name']].append(item['count'])
...
>>> pivot
defaultdict(<class 'list'>, {'Peter': [1], 'Amy': [1, 2]})
After that, you simply have to rebuild the desired output using a list comprehension:
>>> [{'name':k, 'count':sum(values)} for k, values in pivot.items()]
[{'name': 'Peter', 'count': 1}, {'name': 'Amy', 'count': 3}]
I must admit this is not necessarily the most efficient approach, but given your data structure, I guess the pivot operation would be useful in several other scenarios, not necessarily involving sums.
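As an illustration of such a scenario, the same pivot can feed aggregates other than a sum. The sketch below is an assumption about what "other scenarios" might look like, not something from the original answer; the stats name and the choice of max and mean are purely illustrative.

```python
from collections import defaultdict

l = [{'name': 'Amy', 'count': 1},
     {'name': 'Amy', 'count': 2},
     {'name': 'Peter', 'count': 1}]

# Pivot: collect every count under its name.
pivot = defaultdict(list)
for item in l:
    pivot[item['name']].append(item['count'])

# Once the values are grouped, any aggregate can be applied to each list.
stats = {name: {'max': max(values), 'mean': sum(values) / len(values)}
         for name, values in pivot.items()}
print(stats)
```

Keeping the full list per name costs extra memory, which is the trade-off for being able to compute aggregates that need all the values at once.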
Answer 4:
I wanted to suggest that you could use a defaultdict, as Sylvain Leroux has in his answer.
However, it is not necessary to collect the counts into a list; you can sum them as you go using a defaultdict(int):
from collections import defaultdict
l = [{'name': 'Amy', 'count': 1}, {'name': 'Amy', 'count': 2}, {'name': 'Peter', 'count': 1}]
counts = defaultdict(int)
for d in l:
    counts[d['name']] += d['count']
counts = [{'name': k, 'count': v} for k, v in counts.items()]
>>> print(counts)
[{'name': 'Amy', 'count': 3}, {'name': 'Peter', 'count': 1}]
This should be more efficient than building lists and summing them.
itertools.groupby is another option, but it requires sorting the list by the name key up front, which might be less efficient on longer lists.
Answer 5:
import itertools as it
import operator as op
l = [{'name': 'Amy', 'count': 1}, {'name': 'Amy', 'count': 2}, {'name': 'Peter', 'count': 1}]
First, sort the list by the 'name' key of each dict:
sl = sorted(l,key=op.itemgetter('name'))
Pass the sorted list to groupby, using the 'name' key of the dict as the grouping key. It yields tuples of a key and an iterator over the list items that share that 'name', e.g. ('Amy', <itertools._grouper object at 0xb5fdac2c>).
For 'Amy', that iterator yields, one per iteration, every element of the list whose 'name' value is 'Amy'.
To get the total of the 'count' key, call sum on the 'count' fields of the group, like sum(map(op.itemgetter('count'), g)).
To build the list of dicts, call dict with the first element of each tuple returned by groupby as the value for the 'name' key and the result of sum as the value for the 'count' key:
[ dict(name=k,count=sum(map(op.itemgetter('count'),g)))
for k,g in it.groupby(sl, key=op.itemgetter('name'))]
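The sort step matters because groupby only merges consecutive items with equal keys. The sketch below shows the difference; the unsorted_l input and the group_sum helper are hypothetical names introduced for this illustration only.

```python
import itertools as it
import operator as op

# Hypothetical input: the same records, but the two 'Amy' entries are not adjacent.
unsorted_l = [{'name': 'Amy', 'count': 1},
              {'name': 'Peter', 'count': 1},
              {'name': 'Amy', 'count': 2}]

def group_sum(records):
    """Sum 'count' per run of consecutive equal 'name' values (illustrative helper)."""
    get_name = op.itemgetter('name')
    get_count = op.itemgetter('count')
    return [dict(name=k, count=sum(map(get_count, g)))
            for k, g in it.groupby(records, key=get_name)]

# Without sorting, 'Amy' shows up as two separate groups.
without_sort = group_sum(unsorted_l)
# After sorting by name, each name forms a single group.
with_sort = group_sum(sorted(unsorted_l, key=op.itemgetter('name')))
print(without_sort)
print(with_sort)
```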