I am trying to find the MIN and MAX values for each row of a CSV file and append them to the next positions in the list, positions 5 and 6. I have managed to calculate the average, append it at the fourth position, and output the rows from highest to lowest. However, I am struggling to work out how to find the MAX and MIN values of each row so I can do the same with them, highest to lowest. The original CSV is formatted as Fred,56,78,99, with each user on a new line.
Any help would be appreciated.
import csv
import operator

sample = open("sampleData.txt", "r")
csv1 = csv.reader(sample, delimiter=',')
sort = sorted(csv1, key=operator.itemgetter(0))

for i in range(0, len(sort)):
    sort[i].append((int(sort[i][1]) + int(sort[i][2]) + int(sort[i][3])) / int(len(sort[i]) - 1))

sort = list(reversed(sorted(sort, key=operator.itemgetter(4))))

for i in range(0, len(sort)):
    print(sort[i][0], round(sort[i][4]))
import csv

sample = open("sampleData.txt", "r")
csv1 = csv.reader(sample, delimiter=',')
sorted_list = []
for line in csv1:
    print('-- ORIG:', line)
    # Convert the scores to integers, sorted highest to lowest
    tmp = sorted([int(i) for i in line[1:]], reverse=True)  # e.g. [99, 78, 57]
    # Average (rounded to 2 places), minimum, maximum
    stat_list = [round(sum(tmp) / float(len(tmp)), 2), min(tmp), max(tmp)]
    sorted_list.append([line[0]] + tmp + stat_list)
sample.close()

for s in sorted_list:
    print('** NEW: ', s)  # e.g. ['Fred', 99, 78, 57, 78.0, 57, 99]
You can use/modify the quick & dirty solution above. Note:
- The result is a list of lists with the numeric strings converted to integers.
- The float() is needed for computing the average in Python 2, where / on two integers does integer division; applying it to the denominator alone is enough for the whole result to be a float. In Python 3, / already returns a float, so the float() is harmless but optional.
- List comprehensions are a concise and efficient alternative to explicit for loops.
- The numpy module has a built-in mean() function (among many others) that is useful and fast, especially for large arrays.
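The notes on the list comprehension and on mean() can be illustrated with a short sketch. It assumes Python 3 and uses the standard library's statistics.mean as a stand-in for numpy's mean(), so nothing has to be installed; the helper name row_stats is made up for this example.

```python
from statistics import mean


def row_stats(values):
    """Return [average, min, max] for one row of integer scores."""
    # mean() plays the role of sum(values) / float(len(values))
    return [round(mean(values), 2), min(values), max(values)]


# The list comprehension converts the CSV strings to integers in one step.
scores = [int(s) for s in ["96", "4", "105"]]
print(row_stats(scores))
```

With numpy installed, numpy.mean(scores) could be swapped in for mean(scores); for three values per row the difference in speed is unlikely to matter.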
OUTPUT (space added)
-- ORIG: ['Fred', '57', '78', '99']
-- ORIG: ['Wilma', '96', '4', '105']
-- ORIG: ['Bar', '23', '88', '65']

** NEW:  ['Fred', 99, 78, 57, 78.0, 57, 99]
** NEW:  ['Wilma', 105, 96, 4, 68.33, 4, 105]
** NEW:  ['Bar', 88, 65, 23, 58.67, 23, 88]
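To get the "highest to lowest" ordering the question asks for, rows like the ones above can be sorted on the MAX or MIN column. This is a minimal sketch, assuming Python 3 and rows laid out as name, three scores, average, min, max (so the average is at index 4, the min at index 5 and the max at index 6). The in-memory sample data and the names add_stats, rows, by_max and by_min are illustrative only; in the real script the data would come from sampleData.txt.

```python
import csv
import io
from operator import itemgetter

# Stand-in for sampleData.txt so the sketch runs on its own.
sample = io.StringIO("Fred,57,78,99\nWilma,96,4,105\nBar,23,88,65\n")


def add_stats(row):
    """Return [name, scores..., average, min, max] for one CSV row."""
    scores = [int(s) for s in row[1:]]
    average = round(sum(scores) / len(scores), 2)
    return [row[0]] + scores + [average, min(scores), max(scores)]


rows = [add_stats(row) for row in csv.reader(sample)]

# Index 4 holds the average, index 5 the min, index 6 the max.
by_max = sorted(rows, key=itemgetter(6), reverse=True)
by_min = sorted(rows, key=itemgetter(5), reverse=True)

for row in by_max:
    print(row[0], row[6])
```

Passing reverse=True to sorted() gives the same ordering as list(reversed(sorted(...))) in the question's code for distinct keys, without building the intermediate list.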