I would like to count the number of matches after a groupby in a pandas dataframe.
claim event material1 material2
A X M1 M2
A X M2 M3
A X M3 M0
A X M4 M4
A Y M5 M5
A Y M6 M0
B Z M7 M0
B Z M8 M0
First, I group by the pair (claim, event), and for each of these groups I want to count the number of matches between the columns material1 and material2.
For the group by, I have grouped = df.groupby(['claim', 'event'])
but then I don't know how to compare the two new columns.
It should return the following dataframe:
claim event matches
A X 3
A Y 1
B Z 0
Do you have any idea how to do that?
Use isin to compare the columns, then groupby by the claim and event columns and aggregate with sum. Last, cast to int and use reset_index to turn the MultiIndex levels back into columns.
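A minimal sketch of the steps described above, rebuilt from the sample data in the question (the variable names mask and df1 are only illustrative). Note that isin here tests each material1 value against the whole material2 column, not only against the rows of the same group; for this sample data that gives the expected counts.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "claim": list("AAAAAABB"),
    "event": list("XXXXYYZZ"),
    "material1": ["M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8"],
    "material2": ["M2", "M3", "M0", "M4", "M5", "M0", "M0", "M0"],
})

# True where the material1 value occurs anywhere in the material2 column
mask = df["material1"].isin(df["material2"])

df1 = (
    mask.groupby([df["claim"], df["event"]])  # group the boolean mask by both keys
        .sum()                                # count the True values per group
        .astype(int)                          # make sure the counts are integers
        .reset_index(name="matches")          # MultiIndex levels back to columns
)
print(df1)
```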
Solution with assign to a new column:
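A sketch of how the assign variant could look, again using the sample data from the question. The new column name matches is taken from the expected output; everything else is an assumption about the shape of the code.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "claim": list("AAAAAABB"),
    "event": list("XXXXYYZZ"),
    "material1": ["M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8"],
    "material2": ["M2", "M3", "M0", "M4", "M5", "M0", "M0", "M0"],
})

df1 = (
    # Add a 0/1 helper column: 1 where material1 occurs in the material2 column
    df.assign(matches=df["material1"].isin(df["material2"]).astype(int))
      .groupby(["claim", "event"])["matches"]
      .sum()            # add up the 0/1 flags per (claim, event)
      .reset_index()    # group keys back to regular columns
)
print(df1)
```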
Solutions by @Wen, thank you:
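One way a per-group approach with groupby and apply could look. This is a sketch and not necessarily @Wen's exact code. Unlike the isin call on the whole column, it compares material1 only against the material2 values of the same (claim, event) group, which is closer to what the question asks for.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "claim": list("AAAAAABB"),
    "event": list("XXXXYYZZ"),
    "material1": ["M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8"],
    "material2": ["M2", "M3", "M0", "M4", "M5", "M0", "M0", "M0"],
})

df1 = (
    df.groupby(["claim", "event"])[["material1", "material2"]]
      # For each group, count material1 values found in that group's material2
      .apply(lambda g: g["material1"].isin(g["material2"]).sum())
      .reset_index(name="matches")
)
print(df1)
```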
I think it should be slower on larger DataFrames:
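A rough way to check this yourself, as a sketch only: the helper names vectorized and per_group, the 500 copies, and the suffixed claim values are all made up for the benchmark, and the timings will depend on your machine and pandas version.

```python
import timeit

import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "claim": list("AAAAAABB"),
    "event": list("XXXXYYZZ"),
    "material1": ["M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8"],
    "material2": ["M2", "M3", "M0", "M4", "M5", "M0", "M0", "M0"],
})

# Hypothetical larger frame: 500 copies, each with its own claim suffix,
# so the number of groups grows along with the number of rows
big = pd.concat(
    [df.assign(claim=df["claim"] + str(i)) for i in range(500)],
    ignore_index=True,
)

def vectorized(d):
    # isin on the whole column, then a grouped sum
    return (d["material1"].isin(d["material2"])
             .groupby([d["claim"], d["event"]])
             .sum()
             .astype(int)
             .reset_index(name="matches"))

def per_group(d):
    # isin evaluated separately inside every group via apply
    return (d.groupby(["claim", "event"])[["material1", "material2"]]
             .apply(lambda g: g["material1"].isin(g["material2"]).sum())
             .reset_index(name="matches"))

t_vectorized = timeit.timeit(lambda: vectorized(big), number=3)
t_per_group = timeit.timeit(lambda: per_group(big), number=3)
print(f"vectorized: {t_vectorized:.3f}s, per-group apply: {t_per_group:.3f}s")
```

Keep in mind that the two functions only agree when a material never matches across different groups, as is the case in this sample data.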