DataFrame:
B = pd.DataFrame({'b':['II','II','II','II','II','I','I','I'],
'MOST_FREQUENT':['1', '2', '2', '1', '1','1','2','2']})
I need to get the most frequent value in a column MOST_FREQUENT
for each group:
pd.DataFrame({'b':['I','II'],
'MOST_FREQUENT':['2','1']})
The only clue I found is mode(), but it is not applicable to DataFrameGroupBy.
EDIT: I need a solution that works with pandas' .agg() function.
Trying to squeeze a little more performance out of pandas, we can use groupby with size to get the counts, then use idxmax to find the index values of the largest sub-groups. These indices will be the values we're looking for.
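The code that originally went with this answer is not shown here, so the following is only a minimal sketch of the groupby + size + idxmax idea. It assumes the frame B from the question; the names counts, top_labels and result are made up for illustration. On ties, idxmax keeps the first label it encounters.

```python
import pandas as pd

B = pd.DataFrame({
    'b': ['II', 'II', 'II', 'II', 'II', 'I', 'I', 'I'],
    'MOST_FREQUENT': ['1', '2', '2', '1', '1', '1', '2', '2'],
})

# Count every (b, MOST_FREQUENT) pair; the result is a Series
# indexed by a two-level MultiIndex.
counts = B.groupby(['b', 'MOST_FREQUENT']).size()

# For each b, idxmax returns the (b, MOST_FREQUENT) label of the largest count.
top_labels = counts.groupby(level='b').idxmax()

# Unpack the label tuples back into a two-column frame.
result = pd.DataFrame(top_labels.tolist(), columns=['b', 'MOST_FREQUENT'])
print(result)
```

The most frequent value ends up in the index label rather than in the Series values, which is why the last step unpacks the tuples instead of reading the counts.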
You can use apply.
Another solution is to use SeriesGroupBy.value_counts and return the first index value, because value_counts sorts values by count in descending order.
EDIT: You can also use most_common.
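The snippets for these three suggestions did not survive, so here is a rough sketch of how each might look on the question's frame B. It assumes that most_common refers to collections.Counter.most_common and that apply was meant to call Series.mode on each group; the variable names are hypothetical. The last two are written as plain functions of a Series, so they can be passed to .agg() as the question's EDIT asks.

```python
from collections import Counter

import pandas as pd

B = pd.DataFrame({
    'b': ['II', 'II', 'II', 'II', 'II', 'I', 'I', 'I'],
    'MOST_FREQUENT': ['1', '2', '2', '1', '1', '1', '2', '2'],
})

grouped = B.groupby('b')['MOST_FREQUENT']

# apply + mode: mode() returns every tied value in sorted order, so take the first.
via_apply = grouped.apply(lambda s: s.mode().iloc[0])

# value_counts sorts by count, descending, so the first index label
# is the most frequent value in the group.
via_value_counts = grouped.agg(lambda s: s.value_counts().index[0])

# Counter.most_common(1) returns a one-element list of (value, count) pairs.
via_counter = grouped.agg(lambda s: Counter(s).most_common(1)[0][0])

# Turn the group index back into a column to match the desired output shape.
print(via_counter.reset_index())
```

All three give the same answer on this data; they can differ when a group has ties, since each one breaks ties in its own way.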