I want to get a running total from a list of numbers.
For demo purposes, I start with a sequential list of numbers using range
a = range(20)
runningTotal = []
for n in range(len(a)):
new = runningTotal[n-1] + a[n] if n > 0 else a[n]
runningTotal.append(new)
# This one is a syntax error
# runningTotal = [a[n] for n in range(len(a)) if n == 0 else runningTotal[n-1] + a[n]]
for i in zip(a, runningTotal):
print "{0:>3}{1:>5}".format(*i)
yields
0 0
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
10 55
11 66
12 78
13 91
14 105
15 120
16 136
17 153
18 171
19 190
As you can see, I initialize an empty list []
, then append()
in each loop iteration. Is there a more elegant way to this, like a list comprehension?
I'm not sure about 'elegant', but I think the following is much simpler and more intuitive (at the cost of an extra variable):
The functional way to do the same thing is:
...but that's much less readable/maintainable, etc.
@Omnifarous suggests this should be improved to:
...but I still find that less immediately comprehensible than my initial suggestion.
Remember the words of Kernighan: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
I would use a coroutine for this:
Another one-liner, in linear time and space.
I'm stressing linear space here, because most of the one-liners I saw in the other proposed answers --- those based on the pattern
list + [sum]
or usingchain
iterators --- generate O(n) lists or generators and stress the garbage collector so much that they perform very poorly, in comparison to this.I wanted to do the same thing to generate cumulative frequencies that I could use bisect_left over - this is the way I've generated the list;
When we take the sum of a list, we designate an accumulator (
memo
) and then walk through the list, applying the binary function "x+y" to each element and the accumulator. Procedurally, this looks like:This is a common pattern, and useful for things other than taking sums — we can generalize it to any binary function, which we'll supply as a parameter, and also let the caller specify an initial value. This gives us a function known as
reduce
,foldl
, orinject
[1]:In Python 2,
reduce
was a built-in function, but in Python 3 it's been moved to thefunctools
module:We can do all kinds of cool stuff with
reduce
depending on the function we supply as its the first argument. If we replace "sum" with "list concatenation", and "zero" with "empty list", we get the (shallow)copy
function:If we add a
transform
function as another parameter tocopy
, and apply it before concatenating, we getmap
:If we add a
predicate
function that takese
as a parameter and returns a boolean, and use it to decide whether or not to concatenate, we getfilter
:map
andfilter
are sort of unfancy ways of writing list comprehensions — we could also have said[x*2 for x in range(10)]
or[x for x in range(10) if x%2==0]
. There's no corresponding list comprehension syntax forreduce
, becausereduce
isn't required to return a list at all (as we saw withsum
, earlier, which Python also happens to offer as a built-in function).It turns out that for computing a running sum, the list-building abilities of
reduce
are exactly what we want, and probably the most elegant way to solve this problem, despite its reputation (along withlambda
) as something of an un-pythonic shibboleth. The version ofreduce
that leaves behind copies of its old values as it runs is calledreductions
orscanl
[1], and it looks like this:So equipped, we can now define:
While conceptually elegant, this precise approach fares poorly in practice with Python. Because Python's
list.append()
mutates a list in place but doesn't return it, we can't use it effectively in a lambda, and have to use the+
operator instead. This constructs a whole new list, which takes time proportional to the length of the accumulated list so far (that is, an O(n) operation). Since we're already inside the O(n)for
loop ofreduce
when we do this, the overall time complexity compounds to O(n2).In a language like Ruby[2], where
array.push e
returns the mutatedarray
, the equivalent runs in O(n) time:same in JavaScript[2], whose
array.push(e)
returnse
(notarray
), but whose anonymous functions allow us to include multiple statements, which we can use to separately specify a return value:So, how can we solve this while retaining the conceptual simplicity of a
reductions
function that we just passlambda x, y: x + y
to in order to create the running sum function? Let's rewritereductions
procedurally. We can fix the accidentally quadratic problem, and while we're at it, pre-allocate the result list to avoid heap thrashing[3]:This is the sweet spot for me: O(n) performance, and the optimized procedural code is tucked away under a meaningful name where it can be re-used the next time you need to write a function that accumulates intermediate values into a list.
reduce
/reductions
come from the LISP tradition,foldl
/scanl
from the ML tradition, andinject
from the Smalltalk tradition.List
and Ruby'sArray
are both implementations of an automatically resizing data structure known as a "dynamic array" (orstd::vector
in C++). JavaScript'sArray
is a little more baroque, but behaves identically provided you don't assign to out of bounds indices or mutateArray.length
.This can be implemented in 2 lines in Python.
Using a default parameter eliminates the need to maintain an aux variable outside, and then we just do a
map
to the list.