I have the following df:
Col1 Col2
test Something
test2 Something
test3 Something
test Something
test2 Something
test5 Something
I want to get:
Col1 Col2 Occur
test Something 2
test2 Something 2
test3 Something 1
test Something 2
test2 Something 2
test5 Something 1
I've tried to use:
df["Occur"] = df["Col1"].value_counts()
But it didn't help: the Occur column is full of NaN.
Use groupby on 'Col1' and then apply transform on 'Col2' to return a Series whose index is aligned to the original df, so you can add it as a column:
In [3]:
df['Occur'] = df.groupby('Col1')['Col2'].transform(pd.Series.value_counts)
df
Out[3]:
Col1 Col2 Occur
0 test Something 2
1 test2 Something 2
2 test3 Something 1
3 test Something 2
4 test2 Something 2
5 test5 Something 1
You can also use GroupBy + transform with size:
df['Occur'] = df.groupby('Col1')['Col1'].transform('size')
print(df)
Col1 Col2 Occur
0 test Something 2
1 test2 Something 2
2 test3 Something 1
3 test Something 2
4 test2 Something 2
5 test5 Something 1
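One detail worth checking when choosing between the two transform variants is how missing values are counted. The sketch below is only an illustration: the NaN in Col2 and the column name Col2NonNull are assumptions added for the example and are not part of the original data. It shows that 'size' counts every row in a group, while 'count' on Col2 counts only the non-null Col2 values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: same Col1 values as above, but one Col2 value is missing.
df = pd.DataFrame({
    "Col1": ["test", "test2", "test3", "test", "test2", "test5"],
    "Col2": ["Something", "Something", "Something", np.nan, "Something", "Something"],
})

# 'size' counts rows per group, including rows where Col2 is NaN.
df["Occur"] = df.groupby("Col1")["Col1"].transform("size")

# 'count' on Col2 counts only the non-null Col2 values per group.
# (Col2NonNull is a made-up column name for this illustration.)
df["Col2NonNull"] = df.groupby("Col1")["Col2"].transform("count")

print(df)
```

If the goal is the number of occurrences of each Col1 value, 'size' is probably the safer choice, since it does not depend on the contents of any other column.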
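You can also count with a plain Python list (simple, but it rescans the whole list for every row, so it gets slow on large frames):
df['Occur'] = df['Col1'].apply(df['Col1'].tolist().count)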
Then print(df) gives:
Col1 Col2 Occur
0 test Something 2
1 test2 Something 2
2 test3 Something 1
3 test Something 2
4 test2 Something 2
5 test5 Something 1
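As for why the original attempt produced NaN: value_counts() returns a Series indexed by the unique values of Col1, while df is indexed by row labels 0..5, and column assignment aligns on index labels. A minimal sketch of that explanation follows, along with one possible fix using map; the variable names counts and misaligned are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["test", "test2", "test3", "test", "test2", "test5"],
    "Col2": ["Something"] * 6,
})

# value_counts() returns a Series indexed by the unique Col1 values
# ('test', 'test2', ...), not by the row labels 0..5 of df.
counts = df["Col1"].value_counts()

# Aligning counts to df's row labels finds no matching labels,
# which is why the assigned column came out as all NaN.
misaligned = counts.reindex(df.index)

# map() looks up each row's Col1 value in counts, giving one count per row.
df["Occur"] = df["Col1"].map(counts)

print(df)
```

This gives the same Occur column as the groupby-based answers, without building an intermediate Python list.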