What's the most Pythonic and efficient way to iterate over a list in sliding pairs? Here's a related example:
>>> l
['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> for x, y in itertools.izip(l, l[1::2]): print x, y
...
a b
b d
c f
This iterates in pairs, but how can we iterate over a sliding pair? That is, iterate over the pairs:
a b
b c
c d
d e
etc.
In other words, iteration over the pairs where the pair slides forward by 1 element each time rather than by 2 elements. Thanks.
How about:
for x, y in itertools.izip(l, l[1:]): print x, y
You can go even simpler. Just zip the list and the list offset by one.
In [4]: zip(l, l[1:])
Out[4]: [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'e'), ('e', 'f'), ('f', 'g')]
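For readers on Python 3, where `izip` no longer exists and the built-in `zip` is already lazy, the same idea might look like the sketch below. `itertools.pairwise` is only available from Python 3.10 on, so the sketch falls back to the zip-and-slice form when it is missing. The helper name `sliding_pairs` is made up for illustration and is not part of any answer here.

```python
import itertools

def sliding_pairs(seq):
    """Return an iterator over adjacent pairs of a sequence (hypothetical helper)."""
    pairwise = getattr(itertools, "pairwise", None)  # present in Python 3.10+
    if pairwise is not None:
        return pairwise(seq)
    # Fallback: zip stops at the shorter argument, so the last element
    # is never paired with anything past the end of the list.
    return zip(seq, seq[1:])

l = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
for x, y in sliding_pairs(l):
    print(x, y)
```

Note that the fallback copies the list once via `seq[1:]`, which is usually fine for lists of moderate size.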
Here is a little generator that I wrote a while back for a similar scenario:
def pairs(items):
    items_iter = iter(items)
    prev = next(items_iter)
    for item in items_iter:
        yield prev, item
        prev = item
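One caveat if this generator is carried over to Python 3.7 or later: under PEP 479, a `StopIteration` that escapes from the bare `next()` call inside a generator is turned into a `RuntimeError`, so an empty input would raise instead of yielding nothing. A minimal sketch of a variant that guards against that, assuming that "no pairs" is the desired result for inputs with fewer than two items:

```python
def pairs(items):
    """Yield adjacent pairs from any iterable; yields nothing for fewer than two items."""
    items_iter = iter(items)
    try:
        prev = next(items_iter)
    except StopIteration:
        return  # empty input: end the generator cleanly
    for item in items_iter:
        yield prev, item
        prev = item

print(list(pairs(iter("abcd"))))
# [('a', 'b'), ('b', 'c'), ('c', 'd')]
```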
Here's a function for arbitrarily sized sliding windows that works for iterators/generators as well as lists:
from itertools import izip, starmap, islice, tee, count, repeat

def sliding(seq, n):
    return izip(*starmap(islice, izip(tee(seq, n), count(0), repeat(None))))
Nathan's solution is probably more efficient though.
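The one-liner above is dense, so here is a rough Python 3 restatement of the same tee-and-islice idea, spelled out step by step. It is a sketch rather than a drop-in replacement: it uses the built-in `zip` in place of `izip` and a list comprehension in place of `starmap`.

```python
from itertools import islice, tee

def sliding(iterable, n):
    """Return an iterator of overlapping n-tuples from any iterable (Python 3 sketch)."""
    # Make n independent copies of the input iterator.
    iterators = tee(iterable, n)
    # Skip the first i items of the i-th copy so the copies are staggered.
    staggered = [islice(it, i, None) for i, it in enumerate(iterators)]
    # Zipping the staggered copies yields the windows; zip stops as soon
    # as the most advanced copy runs out.
    return zip(*staggered)

print(list(sliding(iter("abcde"), 3)))
# [('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', 'e')]
```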
The timings are shown below, ordered from fastest to slowest. Each timed statement loops over the list and adds every pair of adjacent entries.
Gil
In [69]: timeit.repeat("for x,y in itertools.izip(l, l[1::1]): x + y", setup=setup, number=1000)
Out[69]: [1.029047966003418, 0.996290922164917, 0.998831033706665]
Geoff Reedy
In [70]: timeit.repeat("for x,y in sliding(l,2): x+y", setup=setup, number=1000)
Out[70]: [1.2408790588378906, 1.2099130153656006, 1.207326889038086]
Alestanis
In [66]: timeit.repeat("for i in range(0, len(l)-1): l[i] + l[i+1]", setup=setup, number=1000)
Out[66]: [1.3387370109558105, 1.3243639469146729, 1.3245630264282227]
chmullig
In [68]: timeit.repeat("for x,y in zip(l, l[1:]): x+y", setup=setup, number=1000)
Out[68]: [1.4756009578704834, 1.4369518756866455, 1.5067830085754395]
Nathan Villaescusa
In [63]: timeit.repeat("for x,y in pairs(l): x+y", setup=setup, number=1000)
Out[63]: [2.254757881164551, 2.3750967979431152, 2.302199125289917]
sr2222
Note the reduced repetition count (number=100 instead of number=1000), so this result is not directly comparable to the ones above.
In [60]: timeit.repeat("for x,y in SubsequenceIter(l,2): x+y", setup=setup, number=100)
Out[60]: [1.599524974822998, 1.5634570121765137, 1.608154058456421]
The setup code:
setup="""
from itertools import izip, starmap, islice, tee, count, repeat
l = range(10000)

def sliding(seq, n):
    return izip(*starmap(islice, izip(tee(seq, n), count(0), repeat(None))))

class SubsequenceIter(object):
    def __init__(self, iterable, subsequence_length):
        self.iterator = iter(iterable)
        self.subsequence_length = subsequence_length
        self.subsequence = [0]

    def __iter__(self):
        return self

    def next(self):
        self.subsequence.pop(0)
        while len(self.subsequence) < self.subsequence_length:
            self.subsequence.append(self.iterator.next())
        return self.subsequence

def pairs(items):
    items_iter = iter(items)
    prev = items_iter.next()
    for item in items_iter:
        yield (prev, item)
        prev = item
"""
Not exactly the most efficient, but quite flexible:
class SubsequenceIter(object):
    def __init__(self, iterable, subsequence_length):
        self.iterator = iter(iterable)
        self.subsequence_length = subsequence_length
        self.subsequence = [0]

    def __iter__(self):
        return self

    def next(self):
        self.subsequence.pop(0)
        while len(self.subsequence) < self.subsequence_length:
            self.subsequence.append(self.iterator.next())
        return self.subsequence
Usage:
for x, y in SubsequenceIter(l, 2):
    print x, y
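One thing to watch with `SubsequenceIter` is that every call to `next()` returns the same list object, mutated in place, so collecting the results (for example with `list(...)`) gives many references to one list. A possible alternative, sketched here for Python 3 under the assumption that the window length is at least 1, keeps the window in a `collections.deque` and yields a fresh tuple each time. The name `windows` is hypothetical and not from the answer above.

```python
from collections import deque
from itertools import islice

def windows(iterable, n):
    """Yield overlapping n-tuples from any iterable (hypothetical helper)."""
    if n < 1:
        raise ValueError("window length must be at least 1")
    it = iter(iterable)
    # Fill the first window; maxlen makes later appends drop the oldest item.
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for item in it:
        window.append(item)
        yield tuple(window)  # a new tuple, safe to store

print(list(windows("abcd", 2)))
# [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

Because each window is copied into a tuple, this trades a little speed for results that can be stored safely.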