Calculating numeric differences per group in pandas

Posted 2019-08-12 16:20

My DataFrame has the following structure:

patient_id  |  timestamp  |  measurement
A           |  2014-10-10 |  5.7
A           |  2014-10-11 |  6.3
B           |  2014-10-11 |  6.1
B           |  2014-10-10 |  4.1

I would like to calculate a delta (difference) between consecutive measurements of each patient.

The result should look like:

patient_id  |  timestamp  |  measurement  |    delta
A           |  2014-10-10 |  5.7          |     NaN
A           |  2014-10-11 |  6.3          |     0.6
B           |  2014-10-11 |  6.1          |     2.0
B           |  2014-10-10 |  4.1          |     NaN

How can this be done most elegantly in pandas?

Tags: python python-2.7 pandas time-series dataframe

1 answer

做自己的国王

#2 · 2019-08-12 16:50

Group by 'patient_id', select the 'measurement' column, then call transform and pass it the method diff. transform returns a Series whose index is aligned to the original df, so it can be assigned straight back as a new column:

In [4]:

df['delta'] = df.groupby('patient_id')['measurement'].transform(pd.Series.diff)
df
Out[4]:
  patient_id   timestamp  measurement  delta
0          A  2014-10-10          5.7    NaN
1          A  2014-10-11          6.3    0.6
2          B  2014-10-10          4.1    NaN
3          B  2014-10-11          6.1    2.0
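
As a self-contained sketch of the same idea (the frame is rebuilt here from the question's sample data, in the question's row order, and the names are only illustrative), the groupby object also exposes diff directly, which should be equivalent to transform(pd.Series.diff). Note that the differences follow the row order within each group, so with B's rows in reverse chronological order the delta for B comes out negative:

```python
import pandas as pd

# Sample data copied from the question, in the question's row order
# (B's rows are NOT in chronological order).
df = pd.DataFrame({
    "patient_id": ["A", "A", "B", "B"],
    "timestamp": pd.to_datetime(
        ["2014-10-10", "2014-10-11", "2014-10-11", "2014-10-10"]
    ),
    "measurement": [5.7, 6.3, 6.1, 4.1],
})

# Per-group difference between each row and the previous row of the
# same patient; the first row of every group has no predecessor -> NaN.
df["delta"] = df.groupby("patient_id")["measurement"].diff()

# delta is NaN, ~0.6, NaN, ~-2.0 (up to floating-point rounding)
print(df)
```

If the deltas are meant to be chronological, the rows have to be sorted by timestamp within each patient before diffing.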

EDIT

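If the deltas should follow a particular order within each patient (for example chronological), sort the df before grouping. Because the result of transform keeps the original index, assigning it back still lines each delta up with its original row: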

In [10]:

df['delta'] = df.sort_values(by=['patient_id', 'timestamp']).groupby('patient_id')['measurement'].transform(pd.Series.diff)
df
Out[10]:
  patient_id   timestamp  measurement  delta
0          A  2014-10-10          5.7    NaN
1          A  2014-10-11          6.3    0.6
2          B  2014-10-11          6.1    2.0
3          B  2014-10-10          4.1    NaN
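
A minimal runnable sketch of the sorted variant, assuming the question's sample data and a pandas version that provides sort_values (the intermediate name delta_sorted is only illustrative). The sorted result carries the original index labels, so the column assignment puts each delta back on its original row even though df itself is left unsorted:

```python
import pandas as pd

# Sample data copied from the question; B's rows are out of order.
df = pd.DataFrame({
    "patient_id": ["A", "A", "B", "B"],
    "timestamp": pd.to_datetime(
        ["2014-10-10", "2014-10-11", "2014-10-11", "2014-10-10"]
    ),
    "measurement": [5.7, 6.3, 6.1, 4.1],
})

# Sort a copy chronologically within each patient, then diff per group.
# The resulting Series is still labelled with the original index 0..3.
delta_sorted = (
    df.sort_values(["patient_id", "timestamp"])
      .groupby("patient_id")["measurement"]
      .diff()
)

# Assignment aligns on index labels, not on position, so df keeps its
# original row order while the deltas are chronological.
df["delta"] = delta_sorted

# delta is NaN, ~0.6, ~2.0, NaN (up to floating-point rounding)
print(df)
```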
