Replace missing values with n-1

Posted 2020-04-11 18:14

For example, I have:

df = pd.DataFrame({0: [420, np.nan, 455, np.nan, np.nan, np.nan]})

df

       0
0  420.0
1    NaN
2  455.0
3    NaN
4    NaN
5    NaN

Then, using:

df[0].isnull().astype(int)

0    0
1    1
2    0
3    1
4    1
5    1
Name: 0, dtype: int64

I get

df[0].ffill() - df[0].isnull().astype(int)

0    420.0
1    419.0
2    455.0
3    454.0
4    454.0
5    454.0
Name: 0, dtype: float64

I am looking to get the offsets 0, 1, 0, 1, 2, 3 instead, so that in the end:

df[0] = 420, 419, 455, 454, 453, 452

Tags: python pandas
2 Answers
Ridiculous、
Reply #2 · 2020-04-11 18:40

You can also do this with cumsum:

s=df[0].isnull().astype(int).groupby(df[0].notnull().cumsum()).cumsum()
s
Out[430]: 
0    0
1    1
2    0
3    1
4    2
5    3
Name: 0, dtype: int32

# Final result: df[0].ffill() - s
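For reference, the cumsum approach above can be put together as a self-contained sketch. The variable names group_id, offset and result are introduced here for illustration and are not from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [420, np.nan, 455, np.nan, np.nan, np.nan]})

# Group id: increases by one at every non-null value, so each run of NaNs
# shares an id with the valid value that precedes it.
group_id = df[0].notnull().cumsum()

# Within each group, keep a running count of the missing values seen so far.
offset = df[0].isnull().astype(int).groupby(group_id).cumsum()

# Forward-fill the last valid value, then subtract the per-run offset.
result = df[0].ffill() - offset
print(result.tolist())  # [420.0, 419.0, 455.0, 454.0, 453.0, 452.0]
```

The grouping key restarts the count at every valid value, which is what turns the flat 0/1 indicator from the question into 0, 1, 0, 1, 2, 3.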
贪生不怕死
Reply #3 · 2020-04-11 19:00

Use groupby with cumcount:

df[0].ffill() - df.groupby(df[0].notna().cumsum()).cumcount()

0    420.0
1    419.0
2    455.0
3    454.0
4    453.0
5    452.0
dtype: float64

Details

Define groups:
df[0].notna().cumsum()

0    1
1    1
2    2
3    2
4    2
5    2
Name: 0, dtype: int64

Use the groups in groupby with cumcount:
df.groupby(df[0].notna().cumsum()).cumcount()

0    0
1    1
2    0
3    1
4    2
5    3
dtype: int64
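
The two steps above could be wrapped in a small helper that works on any Series. This is only a sketch; the function name fill_decreasing is hypothetical and not part of pandas or of the answer.

```python
import numpy as np
import pandas as pd


def fill_decreasing(s: pd.Series) -> pd.Series:
    """Forward-fill s, subtracting 1 for each consecutive missing value."""
    # A new group starts at every non-missing value.
    groups = s.notna().cumsum()
    # Position of each row inside its group: 0 for the valid value,
    # then 1, 2, 3, ... for the missing values that follow it.
    steps = s.groupby(groups).cumcount()
    return s.ffill() - steps


s = pd.Series([420, np.nan, 455, np.nan, np.nan, np.nan])
filled = fill_decreasing(s)
print(filled.tolist())  # [420.0, 419.0, 455.0, 454.0, 453.0, 452.0]
```

Note that missing values at the very start of the Series have nothing to forward-fill from, so they stay NaN under this sketch.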