Python - Group by and sum a list of tuples

Given the following list:

[
    ('A', '', Decimal('4.0000000000'), 1330, datetime.datetime(2012, 6, 8, 0, 0)),
    ('B', '', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 6, 4, 0, 0)),
    ('AA', 'C', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 5, 31, 0, 0)),
    ('B', '', Decimal('7.0000000000'), 1330, datetime.datetime(2012, 5, 24, 0, 0)),
    ('A', '', Decimal('21.0000000000'), 1330, datetime.datetime(2012, 5, 14, 0, 0))
]

I would like to group these by the first, second, fourth and fifth columns in the tuple and sum the 3rd. For this example I'll name the columns as col1, col2, col3, col4, col5.

In SQL I would do something like this:

select col1, col2, sum(col3), col4, col5 from my_table
group by col1, col2, col4, col5

Is there a "cool" way to do this or is it all a manual loop?

Tags: python group-by list-comprehension

3 Answers

三岁会撩人

Reply #2 · 2019-01-18 05:54

If you find yourself doing this a lot with large datasets, you might want to look at the pandas library, which has lots of nice facilities for doing this kind of thing.

Explosion°爆炸

Reply #3 · 2019-01-18 05:55

You want itertools.groupby.

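Note that groupby only merges adjacent items with equal keys, so the input must be sorted by the same key beforehand:

import itertools

# data is the list from the question; col3 (index 2) is the value being summed.
keyfunc = lambda t: (t[0], t[1], t[3], t[4])
data.sort(key=keyfunc)
for key, rows in itertools.groupby(data, keyfunc):
    print(key, sum(r[2] for r in rows))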
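
If sorting the list is undesirable, a plain dictionary can accumulate the sums in a single pass instead. This is a different technique from itertools.groupby, shown here only as a sketch; the function name group_and_sum is made up for illustration, and the data is the sample list from the question.

```python
import datetime
from collections import defaultdict
from decimal import Decimal

data = [
    ('A', '', Decimal('4.0000000000'), 1330, datetime.datetime(2012, 6, 8, 0, 0)),
    ('B', '', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 6, 4, 0, 0)),
    ('AA', 'C', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 5, 31, 0, 0)),
    ('B', '', Decimal('7.0000000000'), 1330, datetime.datetime(2012, 5, 24, 0, 0)),
    ('A', '', Decimal('21.0000000000'), 1330, datetime.datetime(2012, 5, 14, 0, 0)),
]


def group_and_sum(rows):
    """Sum col3 per (col1, col2, col4, col5) key without sorting (hypothetical helper)."""
    totals = defaultdict(Decimal)  # Decimal() is Decimal('0')
    for col1, col2, col3, col4, col5 in rows:
        totals[(col1, col2, col4, col5)] += col3
    # Put the summed value back in the third position.
    return [(k[0], k[1], total, k[2], k[3]) for k, total in totals.items()]


result = group_and_sum(data)
for row in result:
    print(row)
```

Groups come out in first-seen order rather than sorted order, so sort the result afterwards if the ordering matters.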

太酷不给撩

Reply #4 · 2019-01-18 06:08

>>> import itertools, operator
>>> [(x[0:2] + (sum(z[2] for z in y),) + x[2:5]) for (x, y) in
...  itertools.groupby(sorted(L, key=operator.itemgetter(0, 1, 3, 4)),
...  key=operator.itemgetter(0, 1, 3, 4))]
[
  ('A', '', Decimal('21.0000000000'), 1330, datetime.datetime(2012, 5, 14, 0, 0)),
  ('A', '', Decimal('4.0000000000'), 1330, datetime.datetime(2012, 6, 8, 0, 0)),
  ('AA', 'C', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 5, 31, 0, 0)),
  ('B', '', Decimal('7.0000000000'), 1330, datetime.datetime(2012, 5, 24, 0, 0)),
  ('B', '', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 6, 4, 0, 0))
]

(NOTE: output reformatted)
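
Every row in the sample list has a different date in col5, so each group above holds exactly one row and the output is just the input re-sorted. To see rows actually being merged, here is a sketch of the same expression grouped on col1, col2 and col4 only. Dropping col5 from the key is an assumption made purely for illustration and is not what the question asked for.

```python
import datetime
import itertools
import operator
from decimal import Decimal

L = [
    ('A', '', Decimal('4.0000000000'), 1330, datetime.datetime(2012, 6, 8, 0, 0)),
    ('B', '', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 6, 4, 0, 0)),
    ('AA', 'C', Decimal('31.0000000000'), 1330, datetime.datetime(2012, 5, 31, 0, 0)),
    ('B', '', Decimal('7.0000000000'), 1330, datetime.datetime(2012, 5, 24, 0, 0)),
    ('A', '', Decimal('21.0000000000'), 1330, datetime.datetime(2012, 5, 14, 0, 0)),
]

# Illustrative key: col1, col2, col4 (col5 deliberately left out).
key = operator.itemgetter(0, 1, 3)

summed = [
    k[:2] + (sum(r[2] for r in rows),) + k[2:]
    for k, rows in itertools.groupby(sorted(L, key=key), key=key)
]

for row in summed:
    print(row)
# The 'A' rows sum to 25, the single 'AA' row stays 31, the 'B' rows sum to 38.
```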
