I got these examples from the official documentation: https://docs.python.org/2/library/timeit.html
What exactly makes the first example (generator expression) slower than the second (list comprehension)?
>>> import timeit
>>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
0.8187260627746582
>>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
0.7288308143615723
The str.join method converts its iterable parameter to a list if it's not a list or tuple already. This lets the joining logic iterate over the items multiple times (it makes one pass to calculate the size of the result string, then a second pass to actually copy the data). You can see this in the CPython source code, where the join implementation calls PySequence_Fast on its argument.
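As a rough pure-Python picture of that logic, here is a minimal sketch. It is not the actual C implementation; the name join_sketch is made up for illustration, and the character buffer is only a stand-in for the preallocated result string.

```python
def join_sketch(sep, iterable):
    """Illustrative two-pass join (hypothetical helper, not CPython's code)."""
    # Materialize anything that is not exactly a list or tuple,
    # so the items can be walked more than once.
    if type(iterable) in (list, tuple):
        items = iterable
    else:
        items = list(iterable)

    # Pass 1: compute the total length of the result.
    total = sum(len(s) for s in items)
    if items:
        total += len(sep) * (len(items) - 1)

    # Pass 2: copy each piece into a buffer of exactly that size.
    buf = [""] * total
    pos = 0
    for i, s in enumerate(items):
        if i:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(s)] = s
        pos += len(s)
    return "".join(buf)
```

The point of the sketch is the first step: a generator can only be consumed once, so it has to be turned into a list before the two passes can run.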
The PySequence_Fast function in the C API does just what I described. It converts an arbitrary iterable into a list (essentially by calling list on it), unless it already is a list or tuple.
The conversion of the generator expression to a list means that the usual benefits of generators (a smaller memory footprint and the potential for short-circuiting) don't apply to str.join, and so the (small) additional overhead that the generator has makes its performance worse.
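One way to check this yourself is to time a third variant that wraps the generator in an explicit list() call. This is only a sketch: the labels in the dictionary are illustrative, and absolute timings depend on the machine and Python version, so no numbers are shown here.

```python
import timeit

# Three spellings of the same join; the third makes the list conversion explicit.
statements = {
    "generator expression": '"-".join(str(n) for n in range(100))',
    "list comprehension": '"-".join([str(n) for n in range(100)])',
    "list(generator)": '"-".join(list(str(n) for n in range(100)))',
}

# Sanity check: all three spellings build the same string.
results = {label: eval(stmt) for label, stmt in statements.items()}
assert len(set(results.values())) == 1

for label, stmt in statements.items():
    # Take the best of several runs to reduce timing noise.
    best = min(timeit.repeat(stmt, number=10000, repeat=3))
    print("{:22s} {:.4f} s".format(label, best))
```

If the explanation above holds, the generator expression and the list(generator) timings should land close together, with the list comprehension somewhat ahead.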