I'm reading about the yield
keyword in python, and trying to understand running this sample:
def countfrom(n):
while True:
print "before yield"
yield n
n += 1
print "after yield"
for i in countfrom(10):
print "enter for loop"
if i <= 20:
print i
else:
break
The output is:
before yield
enter for loop
10
after yield
before yield
enter for loop
11
after yield
before yield
enter for loop
12
after yield
before yield
enter for loop
13
after yield
before yield
enter for loop
14
after yield
before yield
enter for loop
15
after yield
before yield
enter for loop
16
after yield
before yield
enter for loop
17
after yield
before yield
enter for loop
18
after yield
before yield
enter for loop
19
after yield
before yield
enter for loop
20
after yield
before yield
enter for loop
It looks like the yield will return the specified value, and will continue runnning the function till the end (in a parallel thread, maybe). Is my understand correct?
If you could answer this without mentioning "generators", I would be thankful, because I'm trying to understand one at a time.
You can think of it as if the function which yield
s simply "pauses" when it comes across the yield
. The next time you call it, it will resume after the yield
keeping the state that it was in when it left.
No, there is only a single thread.
Each iteration of the for loop runs your countFrom
function until it yields something, or returns. After the yield, the body of the for loop runs again and then, when a new iteration starts, the countFrom
function picks up exactly where it left off and runs again until it yields (or returns).
This modified version of your example will helpfully make it clearer what path execution takes.
def countfrom(n):
while n <= 12:
print "before yield, n = ", n
yield n
n += 1
print "after yield, n = ", n
for i in countfrom(10):
print "enter for loop, i = ", i
print i
print "end of for loop iteration, i = ", i
Output
before yield, n = 10
enter for loop, i = 10
10
end of for loop iteration, i = 10
after yield, n = 11
before yield, n = 11
enter for loop, i = 11
11
end of for loop iteration, i = 11
after yield, n = 12
before yield, n = 12
enter for loop, i = 12
12
end of for loop iteration, i = 12
after yield, n = 13
..you cannot explain the meaning of the yield
statement without mentioning generators; it would be like trying to explain what a stone is without mentioning rock. That is: the yield statement is the one responsible to transform a normal function into a generator.
While you find it well documented here: http://docs.python.org/reference/simple_stmts.html#the-yield-statement
..the brief explaination of it is:
- When a function using the yield statement is called, it returns a "generator iterator", having a
.next()
method (the standard for iterable objects)
- Each time the
.next()
method of the generator is called (eg. by iterating the object with a for loop), the function is called until the first yield is encountered. Then the function execution is paused and a value is passed as return value of the .next()
method.
- The next time
.next()
is called, the function execution is resumed until the next yield
, etc. until the function returns something.
Some advantages in doing this are:
- less memory usage since memory is allocated just for the currently yielded value, not the whole list of returned values (as it would be by returning a list of values)
- "realtime" results return, as they are produced can be passed to the caller without waiting for the generation end (i used that to return output from a running process)
The function countfrom
is not run in a parallel thread. What happens here is that whenever the for
-construct asks for the next value, the function will execute until it hits a yield
statement. When the next value after that is required, the function resumes execution from where it left off.
And while you asked not to mention "generators", they are so intimately linked with yield
that it doesn't really make sense to talk about the separately. What your countfrom
function actually returns is a "generator object". It returns this object immediately after it is called, so the function body is not executed at all until something (e.g. a for
-loop) requests values from the generator using its method .next()
.
the yield statement stores the value that you yield, until that function is called again.
so if you call that function (with an iterator) it will run the function another time and give you the value.
the point being that it knows where it left off last time
Python runs until it hits a yield
and then stops and freezes execution. It's not continuing to run. It's hitting "after" on the next call to countfrom
It's easy to say that without making reference to generators but the fact is yield and generator are inextricably linked. To really understand it you've got to view them as the same topic.
It's easy to show yourself that what I (and others) have said is true by working with the generator from your example in a more manual way.
A function that yield
s instead of return
ing really returns a generator. You can then consume that generator by calling next
. You are confused because your loop is taking care of all that in the background for you.
Here it is with the internals opened up:
def countfrom(n):
while n <= 12:
print "before yield, n = ", n
yield n
n += 1
print "after yield, n = ", n
your_generator = countfrom(10)
next(your_generator)
print "see the after yield hasn't shown up yet, it's stopped at the first yield"
next(your_generator)
print "now it woke back up and printed the after... and continued through the loop until it got to back to yield"
next(your_generator)
print "rinse and repeate"