Let's say that I have an arbitrary number of iterables, all of which can be assumed to be sorted, and contain elements all of the same type (integers, for illustration's sake).
a = (1, 2, 3, 4, 5)
b = (2, 4, 5)
c = (1, 2, 3, 5)
I would like to write a generator function yielding the following:
(1, None, 1)
(2, 2, 2)
(3, None, 3)
(4, 4, None)
(5, 5, 5)
In other words, progressively yield sorted tuples with gaps where elements are missing from the input iterables.
My take on this, using only iterators, not heaps:
a = (1, 2, 4, 5)
b = (2, 5)
c = (1, 2, 6)
d = (1,)
inputs = [iter(x) for x in (a, b, c, d)]
def minwithreplacement(currents, inputs, minitem, done):
for i in xrange(len(currents)):
if currents[i] == minitem:
try:
currents[i] = inputs[i].next()
except StopIteration:
currents[i] = None
done[0] += 1
yield minitem
else:
yield None
def dothing(inputs):
currents = [it.next() for it in inputs]
done = [0]
while done[0] != len(currents):
yield minwithreplacement(currents, inputs, min(x for x in currents if x), done)
print [list(x) for x in dothing(inputs)] #Consuming iterators for display purposes
>>>[[1, None, 1, 1], [2, 2, 2, None], [4, None, None, None], [5, 5, None, None], [None, None, 6, None]]
We first need a variation of heapq.merge
which also yields the index. You can get that by copy-pasting heapq.merge
, and replacing each yield v
with yield itnum, v
. (I omit that part from my answer for readability).
Now we can do:
from collections import deque, OrderedDict
def f(*iterables):
pending = OrderedDict()
for i, v in merge(iterables):
if (not pending) or pending.keys()[-1] < v:
# a new greatest value
pending[v] = [None] * len(iterables)
pending[v][i] = v
# yield all values smaller than v
while len(pending) > 1 and pending.keys()[0] < v:
yield pending.pop(pending.keys()[0])
# yield remaining
while pending:
yield pending.pop(pending.keys()[0])
print list(f((1,2,3,4,5), (2,4,5), (1,2,3,5)))
=> [[1, None, 1], [2, 2, 2], [3, None, 3], [4, 4, None], [5, 5, 5]]