Is there a simple way to flatten a list of iterables with a list comprehension, or failing that, what would you all consider to be the best way to flatten a shallow list like this, balancing performance and readability?
I tried to flatten such a list with a nested list comprehension, like this:
[image for image in menuitem for menuitem in list_of_menuitems]
But I get into trouble of the NameError variety there, because the name 'menuitem' is not defined. After googling and looking around on Stack Overflow, I got the desired results with a reduce statement:
reduce(list.__add__, map(lambda x: list(x), list_of_menuitems))
But this method is fairly unreadable, because I need that list(x) call there since x is a Django QuerySet object.
Conclusion:
Thanks to everyone who contributed to this question. Here is a summary of what I learned. I'm also making this a community wiki in case others want to add to or correct these observations.
My original reduce statement is redundant and is better written this way:
>>> reduce(list.__add__, (list(mi) for mi in list_of_menuitems))
This is the correct syntax for a nested list comprehension (Brilliant summary dF!):
>>> [image for mi in list_of_menuitems for image in mi]
But neither of these methods is as efficient as using itertools.chain:
>>> from itertools import chain
>>> list(chain(*list_of_menuitems))
And as @cdleary notes, it's probably better style to avoid * operator magic by using chain.from_iterable like so:
>>> flattened = chain.from_iterable([[1,2],[3],[5,89],[],[6]])
>>> print(list(flattened))
[1, 2, 3, 5, 89, 6]
Off the top of my head, you can eliminate the lambda:
Or even eliminate the map, since you've already got a list-comp:
You can also just express this as a sum of lists:
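The snippets for these three variants are missing from the text. The following is a minimal sketch of what each might look like, not the answerer's exact code. The sample list_of_menuitems is hypothetical (plain tuples standing in for the QuerySets), and reduce is imported from functools because it is no longer a builtin in Python 3.

```python
from functools import reduce  # builtin in Python 2, lives in functools in Python 3

# Hypothetical stand-in for the question's list of QuerySets.
list_of_menuitems = [("a.png", "b.png"), ("c.png",), ()]

# 1. Eliminate the lambda: pass list directly to map.
no_lambda = reduce(list.__add__, map(list, list_of_menuitems))

# 2. Eliminate the map: use a generator expression instead.
no_map = reduce(list.__add__, (list(mi) for mi in list_of_menuitems))

# 3. Express it as a sum of lists, starting from an empty list.
as_sum = sum((list(mi) for mi in list_of_menuitems), [])

print(no_lambda)  # ['a.png', 'b.png', 'c.png']
```

All three variants copy the accumulated list at every step, so they are probably best kept to short inputs.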
If each item in the list is a string (and any strings inside those strings use " " rather than ' '), you can use regular expressions (the re module).
This approach converts in_list into a string, uses a regex to find all the substrings within quotes (i.e. each item of the list) and spits them out as a list.
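The code this answer describes is not preserved here. A rough sketch of the idea, assuming in_list holds only simple strings with no quote or escape characters in them (the sample data and the exact pattern are assumptions):

```python
import re

# Hypothetical input: a shallow list of lists of strings.
in_list = [["apple", "pear"], ["plum"], []]

# str(in_list) gives "[['apple', 'pear'], ['plum'], []]";
# capture whatever sits between each pair of single quotes.
flat = re.findall(r"'([^']*)'", str(in_list))

print(flat)  # ['apple', 'pear', 'plum']
```

This depends on how Python prints strings, so it is fragile compared with the other approaches on this page.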
You almost have it! The way to do nested list comprehensions is to put the for statements in the same order as they would go in regular nested for statements.
Thus, this
corresponds to
So you want
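The code blocks for this answer are missing. The correspondence it describes can be sketched as follows, using the names from the question and hypothetical sample data in place of the QuerySets:

```python
# Hypothetical sample data standing in for the question's QuerySets.
list_of_menuitems = [["a.png", "b.png"], [], ["c.png"]]

# Regular nested for statements: outer loop first, inner loop second.
flat_loop = []
for menuitem in list_of_menuitems:
    for image in menuitem:
        flat_loop.append(image)

# The comprehension keeps the for clauses in that same order.
flat_comp = [image for menuitem in list_of_menuitems for image in menuitem]

print(flat_comp)  # ['a.png', 'b.png', 'c.png']
```

The question's version put the inner clause first, so menuitem was referenced before the clause that defines it.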
What about:
But Guido recommends against doing too much in a single line of code, since it reduces readability. There is minimal, if any, performance gain from doing what you want in a single line versus multiple lines.
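The snippet that followed "What about:" is not preserved. As an illustration of the multi-line style this answer argues for, one option is a plain loop with list.extend. This is an assumption about the intent, not necessarily what the answerer wrote, and the sample data is hypothetical.

```python
# Hypothetical sample data standing in for the question's QuerySets.
list_of_menuitems = [["a.png"], ["b.png", "c.png"], []]

images = []
for menuitem in list_of_menuitems:
    # extend accepts any iterable, so no list(...) call is needed here.
    images.extend(menuitem)

print(images)  # ['a.png', 'b.png', 'c.png']
```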
A simple alternative is to use numpy's concatenate, but it can convert the contents to float (for instance, when one of the sublists is empty):
If you're just looking to iterate over a flattened version of the data structure and don't need an indexable sequence, consider itertools.chain and company.
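The original example is missing here. A minimal sketch of iterating lazily over the flattened data, with hypothetical sample data mixing a list, a tuple and an iterator to show that the sublists need not be lists:

```python
from itertools import chain

# Hypothetical sample data: any mix of iterables works.
list_of_menuitems = [["a.png", "b.png"], ("c.png",), iter(["d.png"])]

seen = []
# No intermediate flattened list is built; items are produced one at a time.
for image in chain.from_iterable(list_of_menuitems):
    seen.append(image)

print(seen)  # ['a.png', 'b.png', 'c.png', 'd.png']
```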
It will work on anything that's iterable, which should include Django's iterable QuerySets, which it appears that you're using in the question.
Edit: This is probably as good as a reduce anyway, because reduce will have the same overhead copying the items into the list that's being extended. chain will only incur this (same) overhead if you run list(chain) at the end.
Meta-Edit: Actually, it's less overhead than the question's proposed solution, because you throw away the temporary lists you create when you extend the original with the temporary.
Edit: As J.F. Sebastian says, itertools.chain.from_iterable avoids the unpacking and you should use that to avoid * magic, but the timeit app shows negligible performance difference.
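The timing comparison itself is not shown. A sketch of how one might reproduce it with the timeit module follows; the data size and repeat count are arbitrary assumptions, and the numbers will vary by machine and Python version.

```python
from itertools import chain
from timeit import timeit

# Hypothetical sample data: 1000 sublists of 10 integers each.
data = [list(range(10)) for _ in range(1000)]

# Time both spellings over the same number of runs.
unpacked = timeit(lambda: list(chain(*data)), number=200)
from_iter = timeit(lambda: list(chain.from_iterable(data)), number=200)

print(f"chain(*data):        {unpacked:.4f}s")
print(f"chain.from_iterable: {from_iter:.4f}s")
```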