I would like to obtain the n-th smallest or the n-th largest value from the numerical columns of a pandas DataFrame.
Example:
df = pd.DataFrame({'a': [3.0, 2.0, 4.0, 1.0], 'b': [1.0, 4.0, 2.0, 3.0]})
a b
0 3.0 1.0
1 2.0 4.0
2 4.0 2.0
3 1.0 3.0
The third largest value in column a is 2, and the second smallest value in column b is also 2.
You can use nlargest/nsmallest:
df
a b
0 3.0 1.0
1 2.0 4.0
2 4.0 2.0
3 1.0 3.0
df.a.nlargest(3).iloc[-1]
2.0
Or, to keep the index label along with the value,
df.a.nlargest(3).iloc[[-1]]
1 2.0
Name: a, dtype: float64
And, as for b:
df.b.nsmallest(2).iloc[-1]
2.0
Or,
df.b.nsmallest(2).iloc[[-1]]
2 2.0
Name: b, dtype: float64
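If you need this more than once, the two calls above can be wrapped in small helpers. This is only a sketch: nth_largest and nth_smallest are names I made up, not pandas functions, and the length check is my own addition so that an n larger than the number of non-NaN values fails loudly instead of quietly returning the wrong element.

```python
import pandas as pd

df = pd.DataFrame({'a': [3.0, 2.0, 4.0, 1.0], 'b': [1.0, 4.0, 2.0, 3.0]})


def nth_largest(series, n):
    """Hypothetical helper: n-th largest value of a Series (n starts at 1)."""
    if n < 1 or n > series.count():
        raise ValueError("n must be between 1 and the number of non-NaN values")
    # nlargest returns the n largest values in descending order,
    # so the last of them is the n-th largest.
    return series.nlargest(n).iloc[-1]


def nth_smallest(series, n):
    """Hypothetical helper: n-th smallest value of a Series (n starts at 1)."""
    if n < 1 or n > series.count():
        raise ValueError("n must be between 1 and the number of non-NaN values")
    # nsmallest returns the n smallest values in ascending order,
    # so the last of them is the n-th smallest.
    return series.nsmallest(n).iloc[-1]


third_largest_a = nth_largest(df['a'], 3)
second_smallest_b = nth_smallest(df['b'], 2)
print(third_largest_a, second_smallest_b)
```

Both helpers take a single column, so you still call them once per column.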
Quick observation here: this cannot be done for both columns in a single vectorised call. You are essentially performing two completely different operations (nlargest on a, nsmallest on b), so each column has to be handled separately.
df =
a b
0 3.0 1.0
1 2.0 4.0
2 4.0 2.0
3 1.0 3.0
df.nlargest(3, 'a')['a'].iloc[-1]
= 2.0
df.nsmallest(2, 'b')['b'].iloc[-1]
= 2.0
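Since each column needs its own call, one way to keep it tidy is to loop over the columns explicitly. A rough sketch, assuming a mapping from column name to the method and rank you want (the requests dict and the variable names are made up for illustration); when every column uses the same rank, DataFrame.apply does the per-column loop for you.

```python
import pandas as pd

df = pd.DataFrame({'a': [3.0, 2.0, 4.0, 1.0], 'b': [1.0, 4.0, 2.0, 3.0]})

# Hypothetical mapping: column -> (Series method name, rank n).
requests = {'a': ('nlargest', 3), 'b': ('nsmallest', 2)}

# Look up nlargest/nsmallest on each column and take the last of the n values.
per_column = {
    col: getattr(df[col], method)(n).iloc[-1]
    for col, (method, n) in requests.items()
}
print(per_column)

# Same rank for every column: apply calls the function once per column
# and collects the scalars into a Series indexed by column name.
n = 2
second_largest = df.apply(lambda col: col.nlargest(n).iloc[-1])
second_smallest = df.apply(lambda col: col.nsmallest(n).iloc[-1])
print(second_largest)
print(second_smallest)
```

This is still a Python-level loop over columns, not a single vectorised operation, which is fine for a handful of columns.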