I want to create two lists, listOfA and listOfB, to store the indices of A and B from another list s.
s=['A','B','A','A','A','B','B']
Output should be two lists
listOfA=[0,2,3,4]
listOfB=[1,5,6]
I am able to do this with two statements.
listOfA=[idx for idx,x in enumerate(s) if x=='A']
listOfB=[idx for idx,x in enumerate(s) if x=='B']
However, I want to do it in only one iteration using list comprehensions only.
Is it possible to do it in a single statement?
something like listOfA,listOfB=[--code goes here--]
What you're trying to do isn't exactly impossible, it's just complicated, and probably wasteful.
If you want to partition an iterable into two iterables, and the source is a list or other re-usable iterable, you're probably better off just doing it in two passes, as in your question.
(Even if the source is an iterator, if the output you want is a pair of lists, not a pair of lazy iterators, either use Martijn's answer, or do two passes over list(iterator).)
But if you really need to lazily partition an arbitrary iterable into two iterables, there's no way to do that without some kind of intermediate storage.
Let's say you partition [1, 2, -1, 3, 4, -2] into positives and negatives. Now you try to next(negatives). That ought to give you -1, right? But it can't do that without consuming the 1 and the 2. Which means when you try to next(positives), you're going to get 3 instead of 1. So, the 1 and 2 need to get stored somewhere.
Most of the cleverness you need is wrapped up inside itertools.tee. If you just make positives and negatives into two teed copies of the same iterator, then filter them both, you're done.
In fact, this is one of the recipes in the itertools docs:
(If you can't understand that, it's probably worth writing it out explicitly, with either two generator functions sharing an iterator and a tee via a closure, or two methods of a class sharing them via self. It should be a couple dozen lines of code that doesn't require anything tricky.)
And you can even get partition as an import from a third-party library like more_itertools.
Now, you can use this in a one-liner:
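As a rough sketch of what that looks like: the function below is modeled on the partition recipe in the itertools documentation (the exact wording of the recipe varies between Python versions), followed by the one-line use. The contents of lst and the x < 0 predicate are illustrative assumptions. Note that the recipe returns the false entries first.

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    """Split iterable into (items where pred is false, items where pred is true)."""
    # tee gives two iterators over the same source; it buffers whatever
    # one of them has consumed and the other has not yet seen.
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

lst = [1, 2, -1, 3, 4, -2]
# The one-liner: false entries (non-negative) first, true entries (negative) second.
positives, negatives = partition(lambda x: x < 0, lst)
```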
… and you've got an iterator over all the positive values, and an iterator over all of the negative values. They look like they're completely independent, but together they only do a single pass over lst, so it works even if you assign lst to a generator expression or a file or something instead of a list.
So, why isn't there some kind of shortcut syntax for this? Because it would be pretty misleading.
A comprehension takes no extra storage. That's the reason generator expressions are so great—they can transform a lazy iterator into another lazy iterator without storing anything.
But this takes O(N) storage. Imagine all of the numbers are positive, but you try to iterate negative first. What happens? All of the numbers get pushed to trueq. In fact, that O(N) could even be infinite (e.g., try it on itertools.count()).
That's fine for something like itertools.tee, a function stuck in a module that most novices don't even know about, and which has nice docs that can explain what it does and make the costs clear. But doing it with syntactic sugar that made it look just like a normal comprehension would be a different story.
The very definition of a list comprehension is to produce one list object. Your 2 list objects are of different lengths even; you'd have to use side-effects to achieve what you want.
Don't use list comprehensions here. Just use an ordinary loop:
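A minimal sketch of such a loop, using the s, listOfA and listOfB names from the question. It assumes every element is either 'A' or 'B', so anything that is not 'A' lands in listOfB.

```python
s = ['A', 'B', 'A', 'A', 'A', 'B', 'B']

listOfA, listOfB = [], []
for idx, x in enumerate(s):
    # Pick the destination list for this element, then record its index.
    target = listOfA if x == 'A' else listOfB
    target.append(idx)
```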
This leaves you with just one loop to execute; this will beat any two list comprehensions, at least until the developers find a way to make list comprehensions build a list twice as fast as a loop with separate list.append() calls.
I'd pick this any day over a nested list comprehension just to be able to produce two lists on one line. As the Zen of Python states:
For those who live on the edge ;)
Sort of; the key is to generate a 2-element list that you can then unpack:
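One way this could look, as a sketch rather than a recommendation: a nested comprehension that builds one inner list per label and unpacks the pair. It still scans s once per label, so it is a single statement, not a single pass.

```python
s = ['A', 'B', 'A', 'A', 'A', 'B', 'B']

# The outer comprehension yields a 2-element list (one index list per label),
# which is then unpacked into the two names.
listOfA, listOfB = [[idx for idx, x in enumerate(s) if x == label]
                    for label in ('A', 'B')]
```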
That said, I think it's pretty daft to do it that way; an explicit loop is much more readable.
A nice approach to this problem is to use defaultdict. As @Martin already said, a list comprehension is not the right tool to produce two lists. Using defaultdict lets you segregate the indices in a single iteration. Moreover, your code would not be limited to a fixed set of values.
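A minimal sketch of the defaultdict approach, assuming the list s from the question; the name indices is a hypothetical choice for illustration.

```python
from collections import defaultdict

s = ['A', 'B', 'A', 'A', 'A', 'B', 'B']

# Map each distinct value to the list of indices where it occurs.
indices = defaultdict(list)
for idx, x in enumerate(s):
    indices[x].append(idx)

listOfA, listOfB = indices['A'], indices['B']
```

Because the keys are discovered while iterating, the same loop works unchanged if s later contains a third value such as 'C'.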