Can generators be used with string.format in Python?

Posted 2019-04-27 12:34

Question:

"{}, {}, {}".format(*(1,2,3,4,5))

Prints:

'1, 2, 3'

This works as long as the number of {} fields in the format string does not exceed the length of the tuple. I want to make it work for a tuple of arbitrary length, padding it with '-' if it is too short. To avoid making assumptions about the number of {} fields, I wanted to use a generator. Here's what I had in mind:

import itertools

def tup(*args):
    for s in itertools.chain(args, itertools.repeat('-')):
        yield s

print("{}, {}, {}".format(*tup(1,2)))

Expected:

'1, 2, -'

But it never returns. Can you make it work with generators? Is there a better approach?

Answer 1:

If you think about it, besides the fact that variable argument unpacking unpacks all at once, there's also the fact that format doesn't necessarily take its arguments in order, as in '{2} {1} {0}'.
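The ordering point can be seen with a short sketch. The `args` tuple here is just illustrative data; the only API used is `str.format` itself. Because a field may refer to any index, and may refer to the same index more than once, format cannot simply pull the next value from an iterator each time it meets a field:

```python
args = ('a', 'b', 'c')

# Fields may refer to positional arguments in any order...
reversed_order = '{2} {1} {0}'.format(*args)
print(reversed_order)  # c b a

# ...and may reuse one index while ignoring the rest.
repeated = '{0} {0}'.format(*args)
print(repeated)  # a a
```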

You could work around this if format just took a sequence instead of requiring separate arguments, by building a sequence that does the right thing. Here's a trivial example:

class DefaultList(list):
    def __getitem__(self, idx):
        try:
            return super(DefaultList, self).__getitem__(idx)
        except IndexError:
            return '-'

Of course your real-life version would wrap an arbitrary iterable, not subclass list, and would probably have to use tee or an internal cache and pull in new values as requested, only defaulting when you've passed the end. (You may want to search for "lazy list" or "lazy sequence" recipes at ActiveState, because there are a few of them that do this.) But this is enough to show the example.

Now, how does this help us? It doesn't; *lst on a DefaultList will just try to make a tuple out of the thing, giving us exactly the same number of arguments we already had. But what if you had a version of format that could just take a sequence of args instead? Then you could just pass your DefaultList and it would work.

And you do have that: Formatter.vformat.

>>> import string
>>> string.Formatter().vformat('{0} {1} {2}', DefaultList([0, 1]), {})
'0 1 -'

However, there's an even easier way, once you're using Formatter explicitly instead of implicitly via the str method. You can just override its get_value method and/or its check_unused_args:

import string

class DefaultFormatter(string.Formatter):
    def __init__(self, default):
        self.default = default

    # Allow excess arguments
    def check_unused_args(self, used_args, args, kwargs):
        pass

    # Fill in missing arguments
    def get_value(self, key, args, kwargs):
        try:
            return super(DefaultFormatter, self).get_value(key, args, kwargs)
        except IndexError:
            return self.default

f = DefaultFormatter('-')

print(f.vformat('{0} {2}', [0], {}))
print(f.vformat('{0} {2}', [0, 1, 2, 3], {}))

Of course you're still going to need to wrap your iterator in something that provides the Sequence protocol.
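A minimal sketch of such a wrapper, assuming only that the formatter indexes its args with non-negative integers. The class name `LazyDefaultSequence` and the `numbers` generator are hypothetical, not part of any library. It caches values as they are pulled and falls back to a default once the iterator runs out, so it also works with an endless generator:

```python
import itertools
import string

class LazyDefaultSequence(object):
    """Hypothetical wrapper: index any iterable lazily, with a default past its end."""

    def __init__(self, iterable, default='-'):
        self._it = iter(iterable)
        self._cache = []  # values pulled from the iterator so far
        self._default = default

    def __getitem__(self, idx):
        # Pull only as many values as are needed to reach idx.
        # Assumes idx is a non-negative int, which is what Formatter passes.
        while len(self._cache) <= idx:
            try:
                self._cache.append(next(self._it))
            except StopIteration:
                return self._default
        return self._cache[idx]

def numbers():
    yield 1
    yield 2

fmt = string.Formatter()

# A short generator is padded with the default.
result = fmt.vformat('{0}, {1}, {2}', LazyDefaultSequence(numbers()), {})
print(result)  # 1, 2, -

# An endless generator is only consumed as far as the highest index used.
endless = fmt.vformat('{0} {3}', LazyDefaultSequence(itertools.count()), {})
print(endless)  # 0 3
```

The fields are numbered explicitly here because the sequence is passed to vformat as a whole rather than unpacked, so nothing ever needs to know its length.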


While we're at it, your problem could be solved more directly if the language had an "iterable unpacking" protocol. There is a python-ideas thread proposing such a thing, which also covers all of the problems the idea has. (Also note that the format function would make this trickier, because it would have to use the unpacking protocol directly instead of relying on the interpreter to do it magically. But, assuming it did so, you'd just need to write a very simple and general-purpose wrapper around any iterable that handles __unpack__ for it.)



Answer 2:

You cannot use an endless generator to fill a *args arbitrary-arguments call.

Python iterates over the generator to load all arguments to pass on to the callable, and if the generator is endless, that will never complete.
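This eager behaviour can be observed with a finite generator. The names `noisy`, `first_only` and `pulled` below are made up for the illustration. The generator records every value it produces, and the callee reports how many had been produced by the time its body started running:

```python
pulled = []  # records every value the generator is asked to produce

def noisy():
    for i in range(5):
        pulled.append(i)
        yield i

def first_only(*args):
    # By the time this body runs, args is already a complete tuple.
    return args[0], len(pulled)

value, pulled_before_body = first_only(*noisy())
print("%d %d" % (value, pulled_before_body))  # 0 5
```

Even though the function only looks at the first argument, all five values were generated before the call began, which is why an endless generator never gets that far.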

You can use non-endless generators without problems. You could use itertools.islice() to cap a generator:

from itertools import islice

print("{}, {}, {}".format(*islice(tup(1,2), 3)))

After all, you already know how many slots your template has.



Answer 3:

Martijn Pieters has the immediate answer, but if you wanted to create some sort of generic wrapper/helper for format autofilling, you could look at string.Formatter.parse. Using that, you can get a representation of how format sees the format string, and extract the argument count and named-argument names to work out dynamically how long your iterator needs to be.
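One way this could look, as a rough sketch rather than a complete solution. The helper names `positional_count` and `format_padded` are hypothetical. The sketch assumes only top-level fields matter: fields nested inside a format spec (as in '{:{}}') are not counted, and named fields are ignored. It counts auto-numbered fields, tracks the highest explicit index, and caps the padded iterable with islice:

```python
import re
import string
from itertools import chain, islice, repeat

def positional_count(template):
    """Hypothetical helper: how many positional args the template needs."""
    auto = 0       # number of auto-numbered fields such as {}
    highest = -1   # highest explicit index such as {2}
    for _literal, field_name, _spec, _conv in string.Formatter().parse(template):
        if field_name is None:
            continue  # trailing literal text with no replacement field
        # Drop any attribute or item access, e.g. '0.name' or '0[1]'.
        head = re.split(r'[.\[]', field_name, maxsplit=1)[0]
        if head == '':
            auto += 1
        elif head.isdigit():
            highest = max(highest, int(head))
    return max(auto, highest + 1)

def format_padded(template, values, fill='-'):
    """Format template with values, padding missing positions with fill."""
    n = positional_count(template)
    # islice stops after n items, so values may even be an endless iterator.
    return template.format(*islice(chain(values, repeat(fill)), n))

padded = format_padded("{}, {}, {}", [1, 2])
print(padded)  # 1, 2, -
```

Because the count is taken from the template itself, the caller no longer has to hard-code the 3 passed to islice.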



Answer 4:

The naive approach would be to provide L/2 arguments to the format function where L is the length of the format string. Since a replacement token is at least 2 chars long, you are certain to always have enough values to unpack:

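def tup(l, *args):
    # Yield the given args, then l filler values.
    for s in args + (('-',) * l):
        yield s

s = "{}, {}, {}"
# len(s) // 2 is a safe upper bound on the number of fields.
print(s.format(*tup(len(s) // 2, 1, 2)))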

As suggested by Silas Ray, a more refined upper bound can be found using string.Formatter.parse:

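import string

def tup(l, *args):
    for s in args + (('-',) * l):
        yield s

s = "{}, {}, {}"
# parse() yields one tuple per literal/field chunk, so its length bounds the number of top-level fields.
l = len(list(string.Formatter().parse(s)))
print(s.format(*tup(l, 1, 2)))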