I'm trying to solve an algorithm problem. Consider the following list:
l = [100, 20, 50, 70, 45]
In this problem I have to find the average of the elements up to index i:
i = 0
100
i = 1
(100 + 20) // 2 = 60
i = 2
(100 + 20 + 50) // 3 = 56
...
the final result should be stored in a list:
[100, 60, 56, 60, 57]
This is my code so far:

```python
from functools import reduce

def meanScores(l):
    def av(x):
        return reduce(lambda a, b: a + b, x) // len(x)
    return [av(l[:i]) for i in range(1, len(l) + 1)]
```
It works fine; the problem is that when I submitted it, I got a time limit exceeded error. I think the problem is the for loop, since it takes a lot of time when `len(l)` is more than ten thousand. Previously I used `sum()` to do the averaging, but that took a lot of time too; when I turned that `sum()` into `reduce(lambda a, b: a + b, x) // len(x)` the algorithm got faster (it solved more test cases). I think that if I use another function (like a lambda) instead of a for loop, the problem will be solved. Do you think there is a way? Thank you for your time.
What you want to do is iterate only once over the list. To reuse the previous results, I take the previous sum (which is the previous average * i), add the new number, and divide it by i + 1.
It can be implemented with linear time complexity, i.e. O(n). A sample follows, just to give you an idea of how you can do it.
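The sample code didn't survive in this copy, so here is a minimal sketch of the running-sum idea (the name `mean_scores` is my choice). One caveat: with floor division, reconstructing the previous sum as `prev_avg * i` loses precision (e.g. 56 * 3 = 168, not the true sum 170), so this sketch keeps the exact running sum instead:

```python
def mean_scores(l):
    averages = []
    total = 0  # exact running sum of the elements seen so far
    for i, value in enumerate(l):
        total += value                  # reuse the previous sum: one addition per element
        averages.append(total // (i + 1))
    return averages

print(mean_scores([100, 20, 50, 70, 45]))  # [100, 60, 56, 60, 57]
```

Because each element is added exactly once, the whole thing is O(n) instead of the quadratic cost of re-summing every prefix.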
As complete solutions have already spoiled the game, here is a working one in O(n).
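The working solution itself was stripped from this copy; one O(n) possibility (my reconstruction) builds the prefix sums in a single pass with `itertools.accumulate`:

```python
from itertools import accumulate

def mean_scores(l):
    # accumulate yields the running sums, e.g. 100, 120, 170, 240, 285,
    # so each prefix average is one division away
    return [s // (i + 1) for i, s in enumerate(accumulate(l))]

print(mean_scores([100, 20, 50, 70, 45]))  # [100, 60, 56, 60, 57]
```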
Note that you can very often avoid manipulating indices yourself in Python. Here, we can use `enumerate` with a start value of 1 to keep track of the number of values we have summed. Some tests:
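The original snippet and its tests were not preserved in this copy; a sketch consistent with the description above (the name `mean_scores` is mine):

```python
def mean_scores(l):
    result = []
    total = 0
    # start=1 makes n the count of values summed so far,
    # so there is no manual index bookkeeping
    for n, value in enumerate(l, start=1):
        total += value
        result.append(total // n)
    return result

# Some tests:
print(mean_scores([100, 20, 50, 70, 45]))  # [100, 60, 56, 60, 57]
print(mean_scores([7]))                    # [7]
print(mean_scores([]))                     # []
```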
Some ideas:

- In your `av` function, it reduces the whole list. Since you're calling `av` in your list comprehension, you're calling `av` more times than you need. You should be calculating your list of sums (using `av`) only once, and deriving your sums as you iterate through `l`.
- For each `i`, you shouldn't be reducing the whole list. You should be slicing the first list `l[:i]`, then running your `reduce()` against the shortened list.

You can try to use
`numpy.cumsum` and get the average by dividing each element of the cumsum list by its index + 1. It should be pretty fast, O(n), since `numpy.cumsum` is already very optimized. If you still want it faster, you could multithread it.

That is because you sum the whole array each time, so the cost is quadratic, but it can be linear if you reuse the previous result each time. Think about it.
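The `numpy.cumsum` suggestion can be sketched as follows (floor division to match the expected output; the name `mean_scores` is my own):

```python
import numpy as np

def mean_scores(l):
    arr = np.array(l)
    # prefix sums in optimized C code, then elementwise division by 1..n
    return (np.cumsum(arr) // np.arange(1, len(arr) + 1)).tolist()

print(mean_scores([100, 20, 50, 70, 45]))  # [100, 60, 56, 60, 57]
```

`.tolist()` converts the NumPy integers back to plain Python ints, so the result matches the list in the question.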