How to reference groupby index when using apply, t

Posted 2019-05-30 16:30

To be concrete, say we have two DataFrames:

df1:

    date    A
0   12/1/14 3
1   12/1/14 1
2   12/3/14 2
3   12/3/14 3
4   12/3/14 4
5   12/6/14 5

df2:

Now I want to group df1 by date, take the sum of A in each group, and then normalize it by the value of B in df2 for the corresponding date. Something like this:

df1.groupby('date').agg(lambda x: np.sum(x)/df2.loc[x.date,'B'])

The problem is that none of aggregate, apply, or transform can reference the group's index. Any idea how to work around this?

Tags: python pandas group-by dataframe aggregate

2 answers

Animai°情兽

Reply #2 · 2019-05-30 17:23

> df_grouped = df1.groupby('date').sum()
> print(df_grouped['A'] / df2['B'].astype(float))
date
12/1/14    0.40
12/2/14     NaN
12/3/14    0.90
12/4/14     NaN
12/5/14     NaN
12/6/14    0.25
dtype: float64
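
The division above works because pandas aligns the two Series on their index, which means df2 has to be indexed by date. Here is a minimal self-contained sketch of that idea. The contents of df2 are not shown in the question, so the B values and the names sums, b and ratio below are assumptions made purely for illustration.

```python
import pandas as pd

# df1 as given in the question.
df1 = pd.DataFrame({
    "date": ["12/1/14", "12/1/14", "12/3/14", "12/3/14", "12/3/14", "12/6/14"],
    "A": [3, 1, 2, 3, 4, 5],
})

# ASSUMPTION: df2 is not shown in the question; these B values are
# hypothetical and only serve to make the example runnable.
df2 = pd.DataFrame({
    "date": ["12/1/14", "12/2/14", "12/3/14", "12/4/14", "12/5/14", "12/6/14"],
    "B": [10, 10, 10, 10, 10, 20],
})

# Sum A per date; the result is a Series indexed by date.
sums = df1.groupby("date")["A"].sum()

# Index B by date so both operands share the same kind of index.
b = df2.set_index("date")["B"].astype(float)

# Division aligns on the date index; dates missing from df1 become NaN.
ratio = sums / b

# Keep only the dates that actually occur in df1.
ratio_present = ratio.dropna()
print(ratio_present)
```

If only the dates present in df1 are wanted, dropping the NaN rows as in the last step is one option; reindexing b to sums.index before dividing would be another.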


Bombasti

Reply #3 · 2019-05-30 17:25

Grouping by an index level (rather than by a column) keeps the group key in each group's index, so inside apply it is accessible through the .index property.

So, in your case, assuming that date is NOT part of the index in either DataFrame, this should work:

def f(x):
    return x.sum() / df2.set_index('date').loc[x.index[0], 'B']

df1.set_index('date').groupby(level='date').apply(f)

This produces:

               A
date            
2014-01-12  0.40
2014-03-12  0.90
2014-06-12  0.25

If date is already the index of df2, just use df2.loc[x.index[0], 'B'] in the code above.

If date is already the index of df1, change the last line to df1.groupby(level='date').apply(f).
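
As a possible alternative to reading x.index[0], pandas also sets the .name attribute of each group passed to apply to that group's key. Below is a small sketch of that approach. The contents of df2 are not shown in the question, so its B values, along with the names b and result, are assumptions for illustration only.

```python
import pandas as pd

# df1 as given in the question.
df1 = pd.DataFrame({
    "date": ["12/1/14", "12/1/14", "12/3/14", "12/3/14", "12/3/14", "12/6/14"],
    "A": [3, 1, 2, 3, 4, 5],
})

# ASSUMPTION: df2 is not shown in the question; these B values are
# hypothetical and only serve to make the example runnable.
df2 = pd.DataFrame({
    "date": ["12/1/14", "12/2/14", "12/3/14", "12/4/14", "12/5/14", "12/6/14"],
    "B": [10, 10, 10, 10, 10, 20],
})

# Lookup table: B indexed by date.
b = df2.set_index("date")["B"].astype(float)

# Inside apply, each group of A is a Series whose .name is the group key
# (here, the date string), so it can be used to look up B for that date.
result = df1.groupby("date")["A"].apply(lambda s: s.sum() / b.loc[s.name])
print(result)
```

This does not require moving date into the index of df1 first, at the cost of one Python-level function call per group.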
