Generator expressions are an extremely useful tool, and they have a huge advantage over list comprehensions: they do not allocate memory for a new list.
The problem I am facing with generator expressions, which eventually makes me end up writing list comprehensions instead, is that I can only use such a generator once:
>>> names = ['John', 'George', 'Paul', 'Ringo']
>>> has_o = (name for name in names if 'o' in name)
>>> for name in has_o:
...     print(name.upper())
...
JOHN
GEORGE
RINGO
>>> for name in has_o:
...     print(name.lower())
...
>>>
The above code illustrates how the generator expression can only be used once. That is, of course, because the generator expression returns a single generator instance, rather than defining a generator function that could be instantiated again and again.
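To make the distinction concrete, here is a minimal sketch contrasting a generator function with a generator expression. The names `first_pass`, `second_pass`, `single_use`, `consumed_once` and `consumed_twice` are only illustrative, and the function takes the list as a parameter purely for convenience:

```python
names = ['John', 'George', 'Paul', 'Ringo']

def has_o(names):
    # Generator function: every call builds a brand-new generator object.
    for name in names:
        if 'o' in name:
            yield name

# Two independent passes, because has_o() is called twice.
first_pass = [name.upper() for name in has_o(names)]
second_pass = [name.lower() for name in has_o(names)]

# A generator expression, by contrast, is one single generator object.
single_use = (name for name in names if 'o' in name)
consumed_once = list(single_use)
consumed_twice = list(single_use)  # already exhausted, so this is empty
```

The generator function gives the reusable behavior, but at the cost of a full `def` block instead of the compact expression syntax.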
Is there a way to clone the generator each time it is used, in order to make it reusable, or to make the generator expression syntax return a generator function rather than a single instance?
Make it a lambda:
has_o = lambda names: (name for name in names if 'o' in name)
for name in has_o(["hello","rrrrr"]):
    print(name.upper())
for name in has_o(["hello","rrrrr"]):
    print(name.upper())
A lambda is a one-liner and returns a new generator each time it is called. Here I chose to pass the input list as an argument, but if it's fixed, you don't even need a parameter:
names = ["hello","rrrrr"]
has_o = lambda: (name for name in names if 'o' in name)
for name in has_o():
    print(name.upper())
for name in has_o():
    print(name.upper())
In that last case, be careful: names is looked up each time the lambda is called, so if names is modified or reassigned, the lambda uses the new names object. You can guard against reassignment by using the default value trick:
has_o = lambda lst=names: (name for name in lst if 'o' in name)
and you can guard against later modification of names by using the default value-and-copy trick (not super useful when you remember that your first goal was to avoid creating a list :)):
has_o = lambda lst=names[:]: (name for name in lst if 'o' in name)
(now make your pick :))
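As a rough illustration of how the three variants above differ, the sketch below mutates and then reassigns names and records what each lambda yields. The names `late_bound`, `default_bound`, `copy_bound`, `after_append` and `after_rebind` are made up for this example:

```python
names = ["hello", "rrrrr"]

# The three variants from above, under illustrative names.
late_bound = lambda: (name for name in names if 'o' in name)
default_bound = lambda lst=names: (name for name in lst if 'o' in name)
copy_bound = lambda lst=names[:]: (name for name in lst if 'o' in name)

# Mutate the original list in place: only the copy is unaffected.
names.append("foo")
after_append = (list(late_bound()), list(default_bound()), list(copy_bound()))
# (['hello', 'foo'], ['hello', 'foo'], ['hello'])

# Rebind the name to a new list: only the late-bound lambda follows it.
names = ["zorro"]
after_rebind = (list(late_bound()), list(default_bound()), list(copy_bound()))
# (['zorro'], ['hello', 'foo'], ['hello'])
```

Default values are evaluated once, when the lambda is defined, which is why the second and third variants keep pointing at the original list (or a copy of it).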
itertools.tee allows you to make several iterators out of one iterable:
from itertools import tee
names = ['John', 'George', 'Paul', 'Ringo']
has_o_1, has_o_2 = tee((name for name in names if 'o' in name), 2)
print('iterable 1')
for name in has_o_1:
    print(name.upper())
print('iterable 2')
for name in has_o_2:
    print(name.upper())
Output:
iterable 1
JOHN
GEORGE
RINGO
iterable 2
JOHN
GEORGE
RINGO
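Two caveats are worth keeping in mind with tee, sketched below under illustrative names (`source`, `first`, `second`, `leftover_first`, `leftover_source`). Each iterator returned by tee is itself single-use, and tee has to buffer every item that one iterator has consumed and another has not yet seen. So when one copy is fully consumed before the other starts, as in the example above, the whole sequence ends up stored in memory anyway, and the itertools documentation suggests that list() is usually the faster choice in that situation.

```python
from itertools import tee

names = ['John', 'George', 'Paul', 'Ringo']
source = (name for name in names if 'o' in name)
first, second = tee(source, 2)

# Consuming `first` completely forces tee to buffer every item for `second`.
upper = [name.upper() for name in first]
# `second` is then served from tee's internal buffer.
lower = [name.lower() for name in second]

# The tee iterators are single-use too, and the original generator is spent.
leftover_first = list(first)
leftover_source = list(source)
```

Here `leftover_first` and `leftover_source` both come out empty, so tee gives a fixed number of passes decided up front rather than a generator that can be restarted on demand.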