Iterating vs List Concatenation

Published 2019-01-27 16:26


So there are two ways to take a list and add the members of a second list to the first. You can use list concatenation or you can iterate over it. You can:

for obj in list2:
    list1.append(obj)

or you can:

list1 = list1 + list2

or:

list1 += list2

My question is: which is faster, and why? I tested this using two extremely large lists (upwards of 10000 objects) and it seemed the iterating method was a lot faster than the list concatenation (as in l1 = l1 + l2). Why is this? Can someone explain?


append adds each item one at a time, which is the cause of its slowness, as well as the repeated function calls to append.

However, in this case the += operator is not syntactic sugar for +. The += operator does not create a new list and then assign it back; it modifies the left-hand operand in place. The difference is apparent when you use timeit to run each statement 10,000 times.

>>> import timeit
>>> timeit.timeit(stmt="l = l + j", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)
>>> timeit.timeit(stmt="l += j", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)

+= is much faster (about 500x)
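The in-place behaviour can also be checked without timing anything: keep a second reference to the list and see whether it still names the same object afterwards. This is only an illustrative sketch, and the function names are made up for the example.

```python
def concat_rebinds(a, b):
    """Return True if `a = a + b` leaves the name bound to the original list."""
    alias = a          # second reference to the original list object
    a = a + b          # builds a new list and rebinds the local name
    return a is alias


def iadd_in_place(a, b):
    """Return (same_object, alias_contents) after `a += b`."""
    alias = a          # second reference to the original list object
    a += b             # extends the existing list object in place
    return a is alias, alias


print(concat_rebinds([1, 2], [3]))   # the name now points at a different list
print(iadd_in_place([1, 2], [3]))    # same object, and the alias sees the new item
```

One practical consequence: if another part of the program holds a reference to the same list, += changes what that code sees, while + does not.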

Lists also have the extend method, which can append the items of any iterable (not just another list), as in l.extend(l2).

>>> timeit.timeit(stmt="l.extend(j)", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)
>>> timeit.timeit(stmt="for e in j: l.append(e)", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)

extend is logically equivalent to appending each item in a loop, but much, much faster, as you can see.
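To make the "any iterable" point concrete, here is a small sketch (the function names are hypothetical) showing extend consuming a tuple, a generator and a string, while the + operator refuses anything that is not a list.

```python
def extend_with_iterables():
    """Extend a list from several different kinds of iterable."""
    items = [1, 2]
    items.extend((3, 4))                    # tuple
    items.extend(n * 10 for n in range(2))  # generator yielding 0 and 10
    items.extend("ab")                      # string, added one character at a time
    return items


def plus_rejects_tuple():
    """Return True if list + tuple raises TypeError."""
    try:
        [1, 2] + (3, 4)
    except TypeError:
        return True
    return False


print(extend_with_iterables())
print(plus_rejects_tuple())
```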

So to explain this: iterating is faster than + because + has to construct an entirely new list, copying every element of both operands.

extend is faster than iteration because it's a builtin list method and has been optimized. Logically equivalent to appending repeatedly, but implemented differently.

+= performs about the same as extend, because in CPython both extend the list in place in a single operation, without repeated function calls; any small edge for += comes from skipping the method lookup and call. Like extend, += accepts any iterable on the right-hand side, not only a list or tuple.
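If you want to reproduce the comparison on your own machine, a small harness along these lines may help. It is a sketch, not a rigorous benchmark: compare_strategies and its default arguments are invented for this example, and the absolute numbers will vary with the interpreter version and hardware.

```python
import timeit


def compare_strategies(size=10_000, repeats=5, number=20):
    """Return the best observed seconds per run for each way of merging lists.

    As in the timeit calls above, x keeps growing across the `number` runs
    inside one repeat; the setup statement rebuilds it for every repeat.
    """
    setup = f"x = list(range({size})); y = list(range({size}))"
    statements = {
        "append loop": "for e in y: x.append(e)",
        "x = x + y": "x = x + y",
        "x += y": "x += y",
        "x.extend(y)": "x.extend(y)",
    }
    results = {}
    for label, stmt in statements.items():
        timings = timeit.repeat(stmt=stmt, setup=setup, repeat=repeats, number=number)
        # The minimum is the least noisy estimate; divide to get time per run.
        results[label] = min(timings) / number
    return results


if __name__ == "__main__":
    for label, seconds in compare_strategies().items():
        print(f"{label:12s} {seconds * 1e6:10.1f} usec per run")
```

Taking the minimum over several repeats follows the usual timeit advice, since slower runs mostly reflect interference from other processes rather than the code being measured.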


I ran the following code

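import time

l1 = list(range(0, 100000))
l2 = list(range(0, 100000))

def t1():
    # Time appending every item of l1 to l2, one method call per item.
    starttime = time.monotonic()
    for item in l1:
        l2.append(item)
    print(time.monotonic() - starttime)

t1()

# Rebuild both lists so t2 starts from the same state as t1 did.
l1 = list(range(0, 100000))
l2 = list(range(0, 100000))

def t2():
    # Time extending l1 in place with all of l2 in a single operation.
    starttime = time.monotonic()
    global l1
    l1 += l2
    print(time.monotonic() - starttime)

t2()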

and got this, which says that adding lists (+=) is faster.




You're measuring wrong; iterating and calling append multiple times is way slower than doing it in one call, since the overhead of the many function calls (at least in CPython) dwarfs anything that has to do with the actual list operation, as shown here with CPython 2.7.5 on Linux x64:

$ python -m timeit -s 'x = range(10000);y = range(10000)' 'for e in y:x.append(e)'
100 loops, best of 3: 2.56 msec per loop
$ python -m timeit -s 'x = range(10000);y = range(10000)' 'x = x + y'
100 loops, best of 3: 8.98 msec per loop
$ python -m timeit -s 'x = range(10000);y = range(10000)' 'x += y'
10000 loops, best of 3: 105 usec per loop
$ python -m timeit -s 'x = range(10000);y = range(10000)' 'x.extend(y)'
10000 loops, best of 3: 107 usec per loop

Note that x = x + y creates a second copy of the list (at least in CPython). x.extend(y) and its cousin x += y do the same thing as calling append multiple times, just without the overhead of actually calling a Python method for each item.
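As a sanity check that the four approaches differ only in cost and not in result, the following sketch merges the same two lists each way and returns all four results. merge_four_ways is a hypothetical helper written for this example.

```python
def merge_four_ways(first, second):
    """Merge two lists with each strategy, leaving the inputs untouched."""
    looped = list(first)            # copy, so `first` is not modified
    for item in second:
        looped.append(item)         # one method call per item

    concatenated = first + second   # builds a brand new list

    augmented = list(first)
    augmented += second             # in-place extension of the copy

    extended = list(first)
    extended.extend(second)         # in-place extension of the copy

    return looped, concatenated, augmented, extended


a, b = [1, 2], [3, 4]
results = merge_four_ways(a, b)
print(all(r == results[0] for r in results))  # every strategy yields the same list
```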