Avoid lexicographic ordering of numerical values w

Posted 2019-08-02 03:43

Question:

I have a script that pulls random numbers from a set of values. It broke today because min() and max() compare the file names as strings, i.e. in lexicographic order (so 200 is considered greater than 10000). How can I avoid lexicographic ordering here? Using key=len is on the right track but not quite right, and I couldn't find any other key that would help.

data_set = ['1600.csv', '2405.csv', '6800.csv', '10000.csv', '21005.csv']

First try:

highest_value = os.path.splitext(max(data_set))[0]
lowest_value = os.path.splitext(min(data_set))[0]

returns: lowest_value = 10000 highest_value = 6800

Second try:

highest_value = os.path.splitext(max(data_set,key=len))[0]
lowest_value = os.path.splitext(min(data_set,key=len))[0]

returns: lowest_value = 1600 highest_value = 10000

Thanks.

Answer 1:

You can use key to order by the numeric part of the file name:

data_set = ['1600.csv', '2405.csv', '6800.csv', '10000.csv', '21005.csv']

highest = max(data_set, key=lambda x: int(x.split('.')[0]))
lowest = min(data_set, key=lambda x: int(x.split('.')[0]))

print(highest) # >> 21005.csv
print(lowest)  # >> 1600.csv
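
Since the question ultimately wants the numbers rather than the file names, another option is to convert every name to an integer first and then take min() and max() of those. This is only a minimal sketch; the `values` list is a name introduced here for illustration, and it assumes every name is digits followed by an extension:

```python
import os

data_set = ['1600.csv', '2405.csv', '6800.csv', '10000.csv', '21005.csv']

# Strings compare character by character, so '1' < '6' decides this one.
print('10000.csv' < '6800.csv')  # True

# Assumption: each stem (the part before the extension) is a plain integer.
values = [int(os.path.splitext(name)[0]) for name in data_set]

lowest_value = min(values)
highest_value = max(values)

print(lowest_value)   # 1600
print(highest_value)  # 21005
```

Converting once up front avoids stripping the extension again after min() and max() return.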


Answer 2:

You were close. Rather than using len as the key, pass the result of splitext to the int function and use that as the key instead:

>>> from os.path import splitext
>>> data_set = ['1600.csv', '2405.csv', '6800.csv', '10000.csv', '21005.csv']
>>> def convert_to_int(file_name):
...     return int(splitext(file_name)[0])
...

>>> min(data_set, key=convert_to_int)
'1600.csv'
>>> max(data_set, key=convert_to_int)
'21005.csv'

Of course, this solution assumes that the part of each file name before the extension consists solely of digits; for any other name, int() raises a ValueError.
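
If some names do not fit that assumption, one possible workaround is to pull out the first run of digits with a regular expression and skip names that have none. This is a hedged sketch, not part of the original answer: `first_number` is a hypothetical helper, and the file names below are made up for illustration.

```python
import re

def first_number(file_name):
    # Hypothetical helper: the first run of digits as an int,
    # or None when the name contains no digits at all.
    match = re.search(r'\d+', file_name)
    return int(match.group()) if match else None

# Made-up names: a prefix before the number, plus one name without digits.
data_set = ['run_1600.csv', 'run_10000.csv', 'notes.txt', 'run_2405.csv']

# Drop names without a number so min() and max() never compare None.
numbered = [name for name in data_set if first_number(name) is not None]

lowest = min(numbered, key=first_number)
highest = max(numbered, key=first_number)

print(lowest)   # run_1600.csv
print(highest)  # run_10000.csv
```

Whether to skip non-matching names or raise an error for them depends on how strict the script should be.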