Generator to yield gap tuples from zipped iterable

2019-05-18 08:19发布

问题:

Let's say that I have an arbitrary number of iterables, all of which can be assumed to be sorted, and contain elements all of the same type (integers, for illustration's sake).

a = (1, 2, 3, 4, 5)
b = (2, 4, 5)
c = (1, 2, 3, 5)

I would like to write a generator function yielding the following:

(1, None, 1)
(2, 2, 2)
(3, None, 3)
(4, 4, None)
(5, 5, 5)

In other words, progressively yield sorted tuples with gaps where elements are missing from the input iterables.

回答1:

My take on this, using only iterators, not heaps:

a = (1, 2, 4, 5)
b = (2, 5)
c = (1, 2, 6)
d = (1,)
inputs = [iter(x) for x in (a, b, c, d)]

def minwithreplacement(currents, inputs, minitem, done):
    for i in xrange(len(currents)):
        if currents[i] == minitem:
            try:
                currents[i] = inputs[i].next()
            except StopIteration:
                currents[i] = None
                done[0] += 1
            yield minitem
        else:
            yield None

def dothing(inputs):
    currents = [it.next() for it in inputs]
    done = [0]
    while done[0] != len(currents):
        yield minwithreplacement(currents, inputs, min(x for x in currents if x), done)

print [list(x) for x in dothing(inputs)] #Consuming iterators for display purposes
>>>[[1, None, 1, 1], [2, 2, 2, None], [4, None, None, None], [5, 5, None, None], [None, None, 6, None]]


回答2:

We first need a variation of heapq.merge which also yields the index. You can get that by copy-pasting heapq.merge, and replacing each yield v with yield itnum, v. (I omit that part from my answer for readability).

Now we can do:

from collections import deque, OrderedDict

def f(*iterables):
    pending = OrderedDict()
    for i, v in merge(iterables):
        if (not pending) or pending.keys()[-1] < v:
            # a new greatest value
            pending[v] = [None] * len(iterables)
        pending[v][i] = v
        # yield all values smaller than v
        while len(pending) > 1 and pending.keys()[0] < v:
            yield pending.pop(pending.keys()[0])
    # yield remaining
    while pending:
        yield pending.pop(pending.keys()[0])

print list(f((1,2,3,4,5), (2,4,5), (1,2,3,5)))
=> [[1, None, 1], [2, 2, 2], [3, None, 3], [4, 4, None], [5, 5, 5]]