I want to find the minimum of a list of tuples, comparing by a given column. For example, I have some data arranged as a list of 2-tuples:
data = [ (1, 7.57), (2, 2.1), (3, 1.2), (4, 2.1), (5, 0.01),
(6, 0.5), (7, 0.2), (8, 0.6)]
How can I find the minimum of the dataset by comparing only the second number in each tuple?
i.e.
data[0][1] = 7.57
data[1][1] = 2.1
The result I want is (5, 0.01). However,
min( data )
returns (1, 7.57), which I accept is correct when comparing on index 0, but I want the minimum by index 1.
In [2]: min(data, key = lambda t: t[1])
Out[2]: (5, 0.01)
or:
In [3]: import operator
In [4]: min(data, key=operator.itemgetter(1))
Out[4]: (5, 0.01)
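As a small sketch building on the same key idea, the snippet below looks up the position of the minimum rather than the tuple itself, and shows how ties and an empty list behave. The variable names (min_index, largest, first_of_tie, empty_result) are illustrative only, and the default= argument assumes Python 3.4 or later.

```python
from operator import itemgetter

data = [(1, 7.57), (2, 2.1), (3, 1.2), (4, 2.1), (5, 0.01),
        (6, 0.5), (7, 0.2), (8, 0.6)]

# Index of the tuple whose second element is smallest.
min_index = min(range(len(data)), key=lambda i: data[i][1])

# The same key works with max().
largest = max(data, key=itemgetter(1))

# When several items share the minimal key, min() returns the first one it meets.
first_of_tie = min([(2, 2.1), (4, 2.1)], key=itemgetter(1))

# default= avoids a ValueError on an empty list (assumes Python 3.4+).
empty_result = min([], key=itemgetter(1), default=None)
```

Here data[min_index] is the same tuple that min(data, key=itemgetter(1)) returns.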
Even though Lev's answer is correct, I wanted to add the sort method as well, in case someone is interested in the first n minima.
One thing to consider is that the min operation's runtime is O(N), whereas the sort's is O(N log N).
data = [ (1, 7.57), (2, 2.1), (3, 1.2), (4, 2.1), (5, 0.01), (6, 0.5), (7, 0.2), (8, 0.6)]
data.sort(key=lambda x:x[1])
print(data)
# Output: [(5, 0.01), (7, 0.2), (6, 0.5), (8, 0.6), (3, 1.2), (2, 2.1), (4, 2.1), (1, 7.57)]
https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt
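If only the first n minima are needed, a possible alternative to sorting the whole list is heapq.nsmallest from the standard library, which accepts the same kind of key function. This is a sketch rather than a benchmark; n = 3 and the name smallest_n are arbitrary choices for illustration.

```python
import heapq
from operator import itemgetter

data = [(1, 7.57), (2, 2.1), (3, 1.2), (4, 2.1), (5, 0.01),
        (6, 0.5), (7, 0.2), (8, 0.6)]

n = 3  # how many of the smallest items to keep (arbitrary for this example)

# The n tuples with the smallest second element, in ascending order.
smallest_n = heapq.nsmallest(n, data, key=itemgetter(1))

# Equivalent result via a full sort; sorted() leaves the original list untouched.
smallest_n_sorted = sorted(data, key=itemgetter(1))[:n]

print(smallest_n)
```

For small n relative to the list length this avoids a full sort; for n close to the length of the list, sorted() followed by a slice is usually the simpler choice.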
If you're willing to drink the NumPy Kool-Aid, you can use the commands below to get the tuple in the list where the item is minimum.
The ingredients that make this work are NumPy's array slicing, advanced indexing, and argsort features.
import numpy as np
# Create a Python list of tuples and convert it to a NumPy ndarray of floats
data = np.array([ (1, 7.57), (2, 2.1), (3, 1.2),
(4, 2.1), (5, 0.01), (6, 0.5), (7, 0.2), (8, 0.6)])
print("data is")
print(data)
# Generate sortIndices from the second column
sortIndices = np.argsort(data[:,1])
print("sortIndices using index 1 is:" )
print(sortIndices)
print("The column at index 1 is:")
print(data[:,1])
print("Index 1 put into order using column 1")
print(data[sortIndices,1])
print("The tuples put into order using column 1")
print(data[sortIndices,:])
print("The tuple with minimum value at index 1")
print(data[sortIndices[0],:])
print("The tuple with maximum value at index 1")
print(data[sortIndices[-1],:])
Which prints:
data is
[[ 1. 7.57]
[ 2. 2.1 ]
[ 3. 1.2 ]
[ 4. 2.1 ]
[ 5. 0.01]
[ 6. 0.5 ]
[ 7. 0.2 ]
[ 8. 0.6 ]]
sortIndices using index 1 is:
[4 6 5 7 2 1 3 0]
The column at index 1 is:
[ 7.57 2.1 1.2 2.1 0.01 0.5 0.2 0.6 ]
Index 1 put into order using column 1
[ 0.01 0.2 0.5 0.6 1.2 2.1 2.1 7.57]
The tuples put into order using column 1
[[ 5. 0.01]
[ 7. 0.2 ]
[ 6. 0.5 ]
[ 8. 0.6 ]
[ 3. 1.2 ]
[ 2. 2.1 ]
[ 4. 2.1 ]
[ 1. 7.57]]
The tuple with minimum value at index 1
[ 5. 0.01]
The tuple with maximum value at index 1
[ 1. 7.57]
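When only the minimum or maximum row is wanted, a full argsort is not strictly necessary. As a sketch, np.argmin and np.argmax return the position of the smallest and largest value in a column directly. The names min_row and max_row are illustrative, and as above the conversion to an ndarray turns the first column into floats.

```python
import numpy as np

data = np.array([(1, 7.57), (2, 2.1), (3, 1.2), (4, 2.1),
                 (5, 0.01), (6, 0.5), (7, 0.2), (8, 0.6)])

# Row whose value in column 1 is smallest, found in a single pass.
min_row = data[np.argmin(data[:, 1])]

# Row whose value in column 1 is largest.
max_row = data[np.argmax(data[:, 1])]

print(min_row)
print(max_row)
```

argsort remains the better fit when the whole ordering is needed, as in the example above.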