In my code I obtain two lists from different sources, but I know they are in the same order. The first list ("names") contains key strings, while the second ("result_values") is a series of floats. I need to make the pairs unique, but I can't simply use a dictionary, as only the last value inserted would be kept; instead, I need to take the average (arithmetic mean) of the values that share a duplicate key.
Example of the desired result:
names = ["pears", "apples", "pears", "bananas", "pears"]
result_values = [2, 1, 4, 8, 6] # ints here but it's the same conceptually
combined_result = average_duplicates(names, result_values)
print combined_result
{"pears": 4, "apples": 1, "bananas": 8}
My only ideas involve multiple iterations and so far have been ugly... is there an elegant solution to this problem?
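One possible approach, as a minimal sketch: group the values per key in a dictionary of lists, then average each list. The helper name avg() and the use of collections.defaultdict are assumptions made for illustration, not something the question prescribes.

```python
from collections import defaultdict


def avg(series):
    # Plain arithmetic mean. On Python 2 this truncates when every value is an int.
    return sum(series) / len(series)


def average_duplicates(names, values):
    # Collect every value seen for each key in a single pass over both lists.
    grouped = defaultdict(list)
    for name, value in zip(names, values):
        grouped[name].append(value)
    # Reduce each list of values to its mean.
    return {name: avg(series) for name, series in grouped.items()}


names = ["pears", "apples", "pears", "bananas", "pears"]
result_values = [2, 1, 4, 8, 6]
print(average_duplicates(names, result_values))
```

This keeps all values in memory until the end, which is fine for small inputs.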
You can have avg() divide by float(len(series)) if you want a floating-point average.
You could calculate the mean using a cumulative moving average, so that you only iterate through the lists once:
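A rough sketch of the cumulative moving average idea, assuming the update rule new_mean = mean + (x - mean) / n, where n counts the values seen so far for that key. The dictionary names means and counts are illustrative only.

```python
def average_duplicates(names, values):
    means = {}   # key -> running mean of the values seen so far
    counts = {}  # key -> number of values seen so far
    for name, value in zip(names, values):
        n = counts.get(name, 0) + 1
        mean = means.get(name, 0.0)
        # Cumulative moving average update: shift the mean towards the new value.
        means[name] = mean + (value - mean) / n
        counts[name] = n
    return means
```

Only one number per key is stored instead of the whole list of values, at the cost of some floating-point rounding on long series.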
I would use a dictionary anyway.
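A minimal sketch of what that might look like, assuming each dictionary entry holds a running [sum, count] pair rather than a single value; the layout of the entry is an assumption for illustration.

```python
def average_duplicates(names, values):
    totals = {}  # key -> [sum of values, number of values]
    for name, value in zip(names, values):
        entry = totals.setdefault(name, [0, 0])
        entry[0] += value
        entry[1] += 1
    # float() keeps the division from truncating when all values are ints.
    return {name: total / float(count) for name, (total, count) in totals.items()}
```

Because the dictionary accumulates rather than overwrites, duplicate keys no longer lose their earlier values.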
If you're concerned with large lists, then I would replace zip with izip from itertools (on Python 2; in Python 3 the built-in zip is already lazy).
I think what you're looking for is itertools.groupby:
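A hedged sketch of the groupby route, assuming the pairs are sorted by key first, since groupby only merges adjacent items with equal keys. Using operator.itemgetter as the key function is a choice made here for illustration.

```python
from itertools import groupby
from operator import itemgetter


def average_duplicates(names, values):
    # Sort the (name, value) pairs by name so equal names become adjacent.
    pairs = sorted(zip(names, values), key=itemgetter(0))
    result = {}
    for name, group in groupby(pairs, key=itemgetter(0)):
        series = [value for _, value in group]
        result[name] = sum(series) / float(len(series))
    return result
```

The sort makes this O(n log n), so it is mainly attractive when the data is already sorted or the grouping logic is reused elsewhere.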
See also zip and sorted.