Get index where value changes in pandas dataframe

2019-04-10 11:26发布

I am trying to expand my pandas skills. I have a pandas dataframe that looks like this :

df

      Group 1     Group 2            Product ID
0   Products      International      X11
1   Products      International      X11
2   Products      Domestic           X11
3   Products      Domestic           X23
4   Services      Professional       X23
5   Services      Professional       X23
6   Services      Analytics          X25

I am trying to use some pandas functionality to get the index where the values of Group 1 and Group 2 change. I understand that I will probably have to go column by column, and append these indices into different lists.

I have referenced this question Find index where elements change value pandas dataframe which was the closest similar question that I can find.

I am trying to get an output like this:

 Group 1 changes = [0,4]
 Group 2 changes = [0,2,4,6]

Is there any sort of built in functionality that pandas has that can quickly reference if two values in a column are the same, and then grab that index?

All of my data is sorted by group, so shouldn't run into any problems if the solution does involve iterating row by row.

Any help is much appreciated!

2条回答
仙女界的扛把子
2楼-- · 2019-04-10 12:04

This is one non-pandas solution. I like it because it is intuitive and requires no understanding of the large pandas library.

changes = {}

for col in df.columns:
    changes[col] = [0] + list(idx for idx, (i, j) in enumerate(zip(df[col], df[col][1:]), 1) if i != j)

# {'Group 1': [0, 4], 'Group 2': [0, 2, 4, 6], 'Product ID': [0, 3, 6]}
查看更多
仙女界的扛把子
3楼-- · 2019-04-10 12:07

Use

In [91]: df.ne(df.shift()).apply(lambda x: x.index[x].tolist())
Out[91]:
Group 1             [0, 4]
Group 2       [0, 2, 4, 6]
Product ID       [0, 3, 6]
dtype: object

In [92]: df.ne(df.shift()).filter(like='Group').apply(lambda x: x.index[x].tolist())
Out[92]:
Group 1          [0, 4]
Group 2    [0, 2, 4, 6]
dtype: object

Also for dict,

In [107]: {k: s.index[s].tolist() for k, s in df.ne(df.shift()).filter(like='Group').items()}
Out[107]: {'Group 1': [0L, 4L], 'Group 2': [0L, 2L, 4L, 6L]}
查看更多
登录 后发表回答