I have the following (simplified) dataframe:
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'Y': [10, 20, 30, 40, 50, -10, -20, -30, -40, -50],
                   'Z': [20, 18, 16, 14, 12, 10, 8, 6, 4, 2]},
                  index=list('ABCDEFGHIJ'))
Which gives the following:
    X   Y   Z
A   1  10  20
B   2  20  18
C   3  30  16
D   4  40  14
E   5  50  12
F   6 -10  10
G   7 -20   8
H   8 -30   6
I   9 -40   4
J  10 -50   2
I want to create a new dataframe that contains, for each column, the index labels of the n smallest values.
Desired output (say, 3 smallest values):
   X  Y  Z
0  A  J  J
1  B  I  I
2  C  H  H
What is the best way to do this?
A faster NumPy solution uses numpy.argsort.
Timings:
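A minimal sketch of the argsort approach together with a small timing harness, assuming the df from the question. The helper name n_smallest_index_argsort is hypothetical, and no timing figures are quoted because they depend on the machine and on the size of the data:

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "X": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "Y": [10, 20, 30, 40, 50, -10, -20, -30, -40, -50],
        "Z": [20, 18, 16, 14, 12, 10, 8, 6, 4, 2],
    },
    index=list("ABCDEFGHIJ"),
)


def n_smallest_index_argsort(frame, n):
    """Hypothetical helper: index labels of the n smallest values per column."""
    # argsort along axis 0 yields, for every column, the row positions
    # ordered from smallest to largest value; keep the first n rows.
    order = np.argsort(frame.to_numpy(), axis=0, kind="stable")[:n]
    # Map the row positions back to index labels in one vectorised step.
    labels = frame.index.to_numpy()[order]
    return pd.DataFrame(labels, columns=frame.columns)


result = n_smallest_index_argsort(df, 3)
print(result)

# Time the helper on this machine; the number of repetitions is arbitrary.
runs = 1000
elapsed = timeit.timeit(lambda: n_smallest_index_argsort(df, 3), number=runs)
print(f"argsort: {elapsed / runs * 1e6:.1f} us per call")
```

On a 10-row frame the difference between approaches is unlikely to matter, so it is worth timing on data of a realistic size before choosing one.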
You can use apply with nsmallest:
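One way this might look, assuming the df from the question and n = 3. Wrapping the labels in a new Series is an assumption made here so that every column gets a fresh 0..n-1 index and the columns line up:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "X": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "Y": [10, 20, 30, 40, 50, -10, -20, -30, -40, -50],
        "Z": [20, 18, 16, 14, 12, 10, 8, 6, 4, 2],
    },
    index=list("ABCDEFGHIJ"),
)

n = 3

# For each column, nsmallest(n) returns the n smallest values in ascending
# order, still carrying their original index labels. Taking .index keeps
# only those labels; pd.Series(...) renumbers them 0..n-1.
result = df.apply(lambda col: pd.Series(col.nsmallest(n).index))
print(result)
```

Without the pd.Series wrapper, each column would keep its own set of labels as the index and the result would not align row by row.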
First, sort each column of the input dataframe and collect its index labels in sorted order. Then build a dataframe from those label lists and return the top n rows of the result.
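The steps above can be sketched as follows, assuming the df from the question. The function name n_smallest_index_sorted is hypothetical:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "X": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "Y": [10, 20, 30, 40, 50, -10, -20, -30, -40, -50],
        "Z": [20, 18, 16, 14, 12, 10, 8, 6, 4, 2],
    },
    index=list("ABCDEFGHIJ"),
)


def n_smallest_index_sorted(frame, n):
    """Hypothetical helper: sort each column, keep the labels, take the top n."""
    # Sort every column on its own (ascending) and record the index labels
    # in that sorted order.
    sorted_labels = {
        col: frame[col].sort_values().index.tolist() for col in frame.columns
    }
    # Each list has one label per row, so the lists form a rectangular frame;
    # head(n) keeps the labels of the n smallest values per column.
    return pd.DataFrame(sorted_labels).head(n)


result = n_smallest_index_sorted(df, 3)
print(result)
```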
Returns:
   X  Y  Z
0  A  J  J
1  B  I  I
2  C  H  H