Breadth-first version of itertools.chain()

2020-05-01 02:41发布

问题:

In itertools there's chain, which combines multiple generators in a single one, and in essence does a depth-first iteration over them, i.e., chain.from_iterable(['ABC', '123']) yields A, B, C, 1, 2, 3. But, there's no breadth-first version, or am I missing something? There's of course izip_longest, but for large numbers of generators this feels awkward, as the tuples will be very long and possibly very sparse.

I came up with the following:

def chain_bfs(*generators):
    generators = list(generators)
    while generators:
        g = generators.pop(0)
        try:
            yield g.next()
        except StopIteration:
            pass
        else:
            generators.append(g)

It feels a bit verbose to me, is there a more Pythonic approach I'm missing? And would this function be a good candidate for inclusion in itertools?

回答1:

You could use collections.deque() to rotate through your iterators; rotating a deque is much more efficient. I'd also call it a chained zip, not a 'breath first chain', as such:

from collections import deque

def chained_zip(*iterables):
    iterables = deque(map(iter, iterables))
    while iterables:
        try:
            yield next(iterables[0])
        except StopIteration:
            iterables.popleft()
        else:
            iterables.rotate(-1)

Demo:

>>> list(chained_zip('ABC', '123'))
['A', '1', 'B', '2', 'C', '3']
>>> list(chained_zip('AB', '1234'))
['A', '1', 'B', '2', '3', '4']

There is also a roundrobin() recipe in the documentation that does the same, using the itertools.cycle() function:

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))


回答2:

Not sure if you'd still consider this too "verbose"...

def chain_bfs2(*generators):
    generators = map(iter, generators)
    while generators:
        for i, generator in enumerate(generators):
            try:
                yield generator.next()
            except StopIteration:
                del generators[i]

print list(chain_bfs2('AB', '123'))  # ['A', '1', 'B', '2', '3']