What can you use Python generator functions for?

Posted 2019-01-01 08:07

I'm starting to learn Python and I've come across generator functions, those that have a yield statement in them. I want to know what types of problems these functions are really good at solving.

16 answers
旧时光的记忆
Reply #2 · 2019-01-01 08:08

My favorite uses are "filter" and "reduce" operations.

Let's say we're reading a file, and only want the lines which begin with "##".

def filter2sharps( aSequence ):
    for l in aSequence:
        if l.startswith("##"):
            yield l

We can then use the generator function in a proper loop:

with open( ... ) as source:
    for line in filter2sharps( source.readlines() ):
        print(line)

The reduce example is similar. Let's say we have a file where we need to locate blocks of <Location>...</Location> lines. [Not HTML tags, but lines that happen to look tag-like.]

def reduceLocation( aSequence ):
    keep= False
    block= None
    for line in aSequence:
        if line.startswith("</Location"):
            block.append( line )
            yield block
            block= None
            keep= False
        elif line.startswith("<Location"):
            block= [ line ]
            keep= True
        elif keep:
            block.append( line )
        else:
            pass
    if block is not None:
        yield block # A partial block, icky

Again, we can use this generator in a proper for loop:

with open( ... ) as source:
    for b in reduceLocation( source.readlines() ):
        print(b)

The idea is that a generator function allows us to filter or reduce a sequence, producing another sequence one value at a time.
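
Because each stage consumes and produces one item at a time, stages like these can be chained. A minimal sketch, assuming an in-memory list stands in for the file; strip_prefix is a hypothetical second stage added only for illustration:

```python
def filter2sharps(lines):
    """Yield only the lines that begin with '##'."""
    for line in lines:
        if line.startswith("##"):
            yield line


def strip_prefix(lines, prefix="##"):
    """Hypothetical second stage: drop the prefix and surrounding whitespace."""
    for line in lines:
        yield line[len(prefix):].strip()


# Stand-in for the lines of a file.
sample_lines = ["## title\n", "body text\n", "## footer\n"]

# Building the pipeline does no work; items flow only when it is iterated.
pipeline = strip_prefix(filter2sharps(sample_lines))
result = list(pipeline)
```

Each line passes through both stages before the next line is read, so no intermediate list is built between them.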

大哥的爱人
Reply #3 · 2019-01-01 08:16

Generators give you lazy evaluation. You use them by iterating over them, either explicitly with 'for' or implicitly by passing them to any function or construct that iterates. You can think of generators as returning multiple items, as if they returned a list, but instead of returning them all at once they return them one by one, and the generator function is paused until the next item is requested.
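
The pausing can be observed directly with next(). A small sketch; countdown and the log list are made up for the illustration:

```python
log = []


def countdown(n):
    """Yield n, n-1, ..., 1, recording in `log` each time the body runs."""
    while n > 0:
        log.append("about to yield %d" % n)
        yield n
        n -= 1


gen = countdown(3)              # creates the generator; the body has not run yet
started_empty = (log == [])     # so nothing has been logged so far

first = next(gen)               # runs the body up to the first yield, then pauses
second = next(gen)              # resumes after the yield, runs to the next one
rest = list(gen)                # drains whatever is left
```

After the first next() call the log holds exactly one entry, which shows that the body only advances when a value is requested.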

Generators are good for calculating large sets of results (in particular calculations involving loops themselves) where you don't know if you are going to need all results, or where you don't want to allocate the memory for all results at the same time. They also suit situations where the generator uses another generator, or consumes some other resource, and it's more convenient if that happens as late as possible.

Another use for generators (that is really the same) is to replace callbacks with iteration. In some situations you want a function to do a lot of work and occasionally report back to the caller. Traditionally you'd use a callback function for this. You pass this callback to the work-function and it would periodically call this callback. The generator approach is that the work-function (now a generator) knows nothing about the callback, and merely yields whenever it wants to report something. The caller, instead of writing a separate callback and passing that to the work-function, does all the reporting work in a little 'for' loop around the generator.

For example, say you wrote a 'filesystem search' program. You could perform the search in its entirety, collect the results and then display them one at a time. All of the results would have to be collected before you showed the first, and all of the results would be in memory at the same time. Or you could display the results while you find them, which would be more memory efficient and much friendlier towards the user. The latter could be done by passing the result-printing function to the filesystem-search function, or it could be done by just making the search function a generator and iterating over the result.

If you want to see an example of the latter two approaches, see os.path.walk() (the old filesystem-walking function with a callback, removed in Python 3) and os.walk() (the newer filesystem-walking generator). Of course, if you really wanted to collect all results in a list, the generator approach is trivial to convert to the big-list approach:

big_list = list(the_generator)
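
The filesystem search described above could look roughly like this when written as a generator on top of os.walk(). This is a sketch only: find_files and the throwaway directory tree are assumptions made for the example, not part of any library.

```python
import os
import tempfile
from itertools import islice


def find_files(root, suffix):
    """Yield the path of each file under `root` whose name ends with `suffix`."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                yield os.path.join(dirpath, name)


# Build a small temporary tree so the sketch is self-contained.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.txt", "b.log", os.path.join("sub", "c.txt")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("x")

    # The caller decides what "reporting" means; here it collects file names.
    found = sorted(os.path.basename(p) for p in find_files(root, ".txt"))

    # The caller can also stop after the first result without any special flag.
    first_only = list(islice(find_files(root, ".txt"), 1))
```

The search function knows nothing about how results are displayed or how many are wanted; both decisions live in the loop that consumes it.
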
荒废的爱情
Reply #4 · 2019-01-01 08:20

Buffering. When it is efficient to fetch data in large chunks, but process it in small chunks, then a generator might help:

def bufferedFetch():
    while True:
        buffer = getBigChunkOfData()  # placeholder for the real fetch
        if not buffer:  # assumes an empty chunk signals end of data
            break
        for i in buffer:
            yield i

The above lets you easily separate buffering from processing. The consumer function can now just get the values one by one without worrying about buffering.
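
A runnable variant of the same idea, assuming the data comes from a file-like object whose read() returns an empty result at end of data. buffered_fetch and the chunk_size parameter are names chosen for this sketch:

```python
import io


def buffered_fetch(stream, chunk_size=4096):
    """Read `stream` in large chunks, but yield one byte value at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:           # an empty read means end of data
            break
        for byte in chunk:      # iterating over bytes yields ints in Python 3
            yield byte


# io.BytesIO stands in for a slow source such as a socket or a large file.
source = io.BytesIO(b"abcdefghij")
values = list(buffered_fetch(source, chunk_size=4))
```

The consumer sees a flat stream of values; the chunk size can be tuned without touching the consuming code.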

其实,你不懂
Reply #5 · 2019-01-01 08:23

One reason to use a generator is that it makes some kinds of solutions clearer.

Another is that it lets you handle results one at a time, which avoids building huge lists of results that you would process separately anyway.

If you have a fibonacci-up-to-n function like this:

# function version
def fibon(n):
    a = b = 1
    result = []
    for i in range(n):
        result.append(a)
        a, b = b, a + b
    return result

You can write the function more simply like this:

# generator version
def fibon(n):
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b

The function is clearer. And if you use the function like this:

for x in fibon(1000000):
    print(x, end=" ")

In this example, the generator version never creates the whole 1000000-item list; it produces just one value at a time. The list version, by contrast, builds the entire list before the loop starts.
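
One rough way to see the difference is to compare the size of the two return values. A sketch assuming CPython, where sys.getsizeof reports the size of the container itself (not of the integers it refers to); the names fibon_list and fibon_gen are used here only to keep both versions side by side:

```python
import sys


def fibon_list(n):
    """List version: builds all n values before returning."""
    a = b = 1
    result = []
    for i in range(n):
        result.append(a)
        a, b = b, a + b
    return result


def fibon_gen(n):
    """Generator version: produces one value per request."""
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b


as_list = fibon_list(1000)
as_gen = fibon_gen(1000)

list_size = sys.getsizeof(as_list)   # grows as n grows
gen_size = sys.getsizeof(as_gen)     # does not depend on n

# Both versions produce the same values.
same_values = list(fibon_gen(10)) == fibon_list(10)
```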

还给你的自由
Reply #6 · 2019-01-01 08:24

The simple explanation: consider a for statement:

for item in iterable:
   do_stuff()

A lot of the time, all the items in iterable don't need to be there from the start, but can be generated on the fly as they're required. This can be a lot more efficient in both

  • space (you never need to store all the items simultaneously) and
  • time (the iteration may finish before all the items are needed).
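
The time saving can be shown by counting how many items are actually produced before a search stops. A small sketch; squares and the produced list are invented for the illustration:

```python
produced = []


def squares(limit):
    """Yield i*i for i in range(limit), recording each i that is computed."""
    for i in range(limit):
        produced.append(i)
        yield i * i


# Ask only for the first square greater than 50; iteration stops right there.
first_big = next(sq for sq in squares(1000000) if sq > 50)
```

Although the generator could produce a million values, only the handful needed to reach the first match are ever computed.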

Other times, you don't even know all the items ahead of time. For example:

for command in user_input():
   do_stuff_with(command)

You have no way of knowing all the user's commands beforehand, but you can use a nice loop like this if you have a generator handing you commands:

def user_input():
    while True:
        wait_for_command()
        cmd = get_command()
        yield cmd

With generators you can also have iteration over infinite sequences, which is of course not possible when iterating over containers.
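
For instance, an endless counter can be written as a generator and then cut down to size by the caller with itertools. A minimal sketch; naturals is a name picked for the example (itertools.count offers the same behaviour ready-made):

```python
from itertools import islice, takewhile


def naturals(start=0):
    """Yield start, start+1, start+2, ... without ever finishing."""
    n = start
    while True:
        yield n
        n += 1


# Take a fixed number of items from the infinite sequence.
first_five = list(islice(naturals(), 5))

# Or keep taking items until a condition fails.
evens = (n for n in naturals() if n % 2 == 0)
small_evens = list(takewhile(lambda x: x < 10, evens))
```

Calling list(naturals()) directly would never return, so the caller always has to bound the iteration in some way.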

伤终究还是伤i
Reply #7 · 2019-01-01 08:25

Also good for printing the prime numbers up to n:

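def genprime(n=10):
    # Start at 2, the smallest prime; starting at 3 would skip it.
    for num in range(2, n+1):
        for factor in range(2, num):
            if num % factor == 0:
                break
        else:
            # No factor divided num, so it is prime.
            yield num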

for prime_num in genprime(100):
    print(prime_num)