From this post I learned that you can concatenate tuples with:
>>> tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
>>> sum(tuples, ())
('hello', 'these', 'are', 'my', 'tuples!')
This looks pretty nice, but why does it work? And is it optimal, or is there something in itertools
that would be preferable to this construct?
Just to complement the accepted answer with some more benchmarks:
EDIT: the code has been updated to actually use tuples. As per the comments, the last two options are now wrapped in tuple() constructors, and all the timings have been updated for consistency. The itertools.chain* options are still the fastest, but the margin is now smaller.
That's clever, and I had to laugh because help expressly forbids strings, but it works.
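The contrast that comment points at can be checked directly. This is a minimal sketch with illustrative variable names: sum refuses a str start value outright, while a tuple start value goes through the ordinary + path.

```python
words = ['a', 'b', 'c']

# sum() special-cases a str start value and raises TypeError,
# pointing the caller to ''.join(seq) instead.
string_error = None
try:
    sum(words, '')
except TypeError as exc:
    string_error = exc

# Tuples get no such special treatment, so this goes through.
tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
flat = sum(tuples, ())

print(type(string_error).__name__)   # TypeError
print(flat)                          # ('hello', 'these', 'are', 'my', 'tuples!')
```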
You can add tuples to get a new, bigger tuple. And since you gave a tuple as a start value, the addition works.
It works because addition is overloaded (on tuples) to return the concatenated tuple:
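For example (a small sketch; the names a, b and c are only for illustration):

```python
a = ('hello',)
b = ('these', 'are')
c = ('my', 'tuples!')

# + on two tuples returns a new tuple holding the items of both.
pair = a + b
print(pair)                     # ('hello', 'these', 'are')

# Starting from an empty tuple and adding each tuple in turn
# gives the same result as sum((a, b, c), ()).
chained = () + a + b + c
print(chained == sum((a, b, c), ()))   # True
```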
That's basically what sum is doing: you give an initial value of an empty tuple and then add the tuples to that.
However, this is generally a bad idea because adding tuples creates a new tuple each time, so you create several intermediate tuples just to copy them into the concatenated tuple:
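A hand-written loop makes the intermediate tuples visible. This is only a sketch of the idea, not how sum is implemented internally; the copied counter is an illustrative addition that tallies how many items end up being copied.

```python
tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))

result = ()
copied = 0                  # total number of items copied so far
for t in tuples:
    result = result + t     # builds a brand-new tuple on every iteration
    copied += len(result)   # every item of the new tuple was copied into it
    print(result)
# ('hello',)
# ('hello', 'these', 'are')
# ('hello', 'these', 'are', 'my', 'tuples!')

print(copied)               # 9 copies to build a 5-item result
```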
That implementation has quadratic runtime behavior, which can be avoided by not creating the intermediate tuples.
Using nested generator expressions:
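A minimal sketch of that approach, reusing the tuples from the question:

```python
tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))

# The outer loop walks the tuples, the inner loop walks the items of each one.
# tuple() consumes the generator once, so no intermediate tuples are built.
flat = tuple(item for tup in tuples for item in tup)
print(flat)   # ('hello', 'these', 'are', 'my', 'tuples!')
```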
Or using a generator function:
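A sketch of what such a generator function could look like; the name flatten is just an illustrative choice:

```python
def flatten(tuples):
    """Yield every item of every tuple, one at a time (illustrative helper)."""
    for tup in tuples:
        yield from tup


tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
flat = tuple(flatten(tuples))
print(flat)   # ('hello', 'these', 'are', 'my', 'tuples!')
```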
Or using itertools.chain.from_iterable:
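For instance (a minimal sketch; chain.from_iterable returns a lazy iterator, so it is wrapped in tuple() to get a tuple back):

```python
from itertools import chain

tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))

# chain.from_iterable yields the items of each inner tuple in order.
flat = tuple(chain.from_iterable(tuples))
print(flat)   # ('hello', 'these', 'are', 'my', 'tuples!')
```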
And if you're interested in how these perform (using my simple_benchmark package, on Python 3.7.2 64-bit, Windows 10 64-bit):
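The original benchmark code and plot are not reproduced here. As a rough stand-in, the sketch below uses the standard-library timeit module instead of simple_benchmark. The function names, input sizes and repeat counts are all assumptions chosen for illustration, and the timings will differ from machine to machine.

```python
import timeit
from itertools import chain


def with_sum(tuples):
    return sum(tuples, ())


def with_genexpr(tuples):
    return tuple(item for tup in tuples for item in tup)


def _flatten(tuples):
    for tup in tuples:
        yield from tup


def with_generator(tuples):
    return tuple(_flatten(tuples))


def with_chain(tuples):
    return tuple(chain.from_iterable(tuples))


funcs = [with_sum, with_genexpr, with_generator, with_chain]
timings = {}
for size in (10, 100, 1000):
    data = tuple((i, i + 1) for i in range(size))
    expected = with_chain(data)
    for func in funcs:
        # All approaches must agree before their timings are compared.
        assert func(data) == expected
        # Best of 3 runs, each run calling the function 10 times.
        best = min(timeit.repeat(lambda: func(data), number=10, repeat=3))
        timings[func.__name__, size] = best
        print(f"{func.__name__:15} n={size:5d}  {best:.6f} s")
```

As the number of tuples grows, the with_sum timings should grow much faster than the others, in line with the quadratic behavior described above.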
So while the sum approach is very fast if you concatenate only a few tuples, it will be really slow if you try to concatenate lots of tuples. The fastest of the tested approaches for many tuples is itertools.chain.from_iterable.
The addition operator concatenates tuples in Python:
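A small sketch with made-up values. Note that both operands have to be tuples; mixing in another sequence type raises a TypeError.

```python
joined = (1, 2) + (3, 4)
print(joined)            # (1, 2, 3, 4)

# Adding a list to a tuple is not allowed.
mixed_error = None
try:
    (1, 2) + [3, 4]
except TypeError as exc:
    mixed_error = exc
print(type(mixed_error).__name__)   # TypeError
```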
From the docstring of sum:
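The quoted docstring text is not reproduced here, and its exact wording varies between Python versions. It can be printed from the interpreter, as in this small sketch:

```python
# help(sum) shows the same text in an interactive session.
doc = sum.__doc__
print(doc)
```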
It means sum doesn't start with the first element of your iterable, but rather with an initial value that is passed through the start= argument.
By default, sum is used with numeric values, so the default start value is 0. Summing an iterable of tuples therefore requires starting with an empty tuple, and () is an empty tuple, hence the working concatenation.
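The role of the start value can be sketched with a rough pure-Python stand-in for sum. The helper my_sum is hypothetical and written only for illustration; the real built-in is implemented in C and has extra special cases.

```python
def my_sum(iterable, start=0):
    """Rough pure-Python stand-in for the built-in sum (hypothetical helper)."""
    total = start
    for item in iterable:
        total = total + item
    return total


tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))

print(type(()))                  # <class 'tuple'>
flat = my_sum(tuples, ())
print(flat)                      # ('hello', 'these', 'are', 'my', 'tuples!')

# Without a start value the default 0 is used, and 0 + ('hello',) fails.
error = None
try:
    sum(tuples)
except TypeError as exc:
    error = exc
print(error)   # unsupported operand type(s) for +: 'int' and 'tuple'
```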
As for performance, here is a comparison, first on a small input and then with t2 of size 10000:
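The original timing output is not available, so no numbers are quoted here. The following sketch shows how such a comparison could be reproduced with timeit. The small input t1, the tuple contents and the repeat counts are assumptions; only the size of t2 comes from the text.

```python
import timeit
from itertools import chain

t1 = tuple((i,) for i in range(10))        # small input (assumed size)
t2 = tuple((i,) for i in range(10000))     # t2 of size 10000


def time_both(data, number):
    """Return (seconds for sum, seconds for chain) over `number` calls each."""
    by_sum = timeit.timeit(lambda: sum(data, ()), number=number)
    by_chain = timeit.timeit(
        lambda: tuple(chain.from_iterable(data)), number=number
    )
    return by_sum, by_chain


small = time_both(t1, number=10000)
large = time_both(t2, number=5)
print("small input: sum=%.4f s, chain=%.4f s" % small)
print("size 10000:  sum=%.4f s, chain=%.4f s" % large)
```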
So if your list of tuples is small, you don't need to bother. If it's medium-sized or larger, you should use itertools.