Possible Duplicate:
How do you split a list into evenly sized chunks in Python?
I am surprised I could not find a "batch" function that would take as input an iterable and return an iterable of iterables.
For example:
for i in batch(range(0,10), 1): print i
[0]
[1]
...
[9]
or:
for i in batch(range(0,10), 3): print i
[0,1,2]
[3,4,5]
[6,7,8]
[9]
Now, I wrote what I thought was a pretty simple generator:
def batch(iterable, n = 1):
    current_batch = []
    for item in iterable:
        current_batch.append(item)
        if len(current_batch) == n:
            yield current_batch
            current_batch = []
    if current_batch:
        yield current_batch
But the above does not give me what I would have expected:
for x in batch(range(0,10),3): print x
[0]
[0, 1]
[0, 1, 2]
[3]
[3, 4]
[3, 4, 5]
[6]
[6, 7]
[6, 7, 8]
[9]
So I have missed something, and this probably shows my complete lack of understanding of Python generators. Would anyone care to point me in the right direction?
[Edit: I eventually realized that the above behavior happens only when I run this within IPython rather than in plain Python.]
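This is probably faster, provided the input is a sequence that supports len() and slicing (it will not work on an arbitrary iterable such as a generator):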
def batch(iterable, n=1):
    l = len(iterable)
    for ndx in range(0, l, n):
        yield iterable[ndx:min(ndx + n, l)]

for x in batch(range(0, 10), 3):
    print x
It avoids appending items one at a time, although each slice of a list is still a new list.
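To illustrate what the slicing approach can and cannot accept, here is a minimal sketch. It assumes Python 3 (the thread itself uses Python 2), and the name batch_slices is hypothetical. Slicing already clamps at the end of the sequence, so the min() call is left out here.

```python
def batch_slices(seq, n=1):
    """Yield consecutive slices of length n from a sequence (hypothetical name)."""
    for start in range(0, len(seq), n):
        # Slicing clamps at the end of the sequence, so no min() is needed.
        yield seq[start:start + n]

# Works on anything that supports len() and slicing.
print(list(batch_slices(list(range(10)), 3)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

try:
    list(batch_slices((i for i in range(10)), 3))
except TypeError as exc:
    # Generators have no len(), so this approach cannot batch them.
    print("generator input rejected:", exc)
```

For inputs without a length, an iterator-based version such as the generator in the question is the safer choice.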
FWIW, the recipes in the itertools module documentation provide this example:
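from itertools import izip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    # n references to the same iterator, so each tuple pulls n consecutive items
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)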
It works like this:
>>> list(grouper(3, range(10)))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, None, None)]
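The padded last group differs from the [9] the question asks for. One possible way to close that gap, sketched here for Python 3 (where izip_longest is named zip_longest), is to pad with a private sentinel and strip it afterwards. The names grouper_no_fill and _MISSING are hypothetical and not part of the itertools recipes.

```python
from itertools import zip_longest  # called izip_longest in Python 2

_MISSING = object()  # hypothetical sentinel, distinct from any real item


def grouper_no_fill(n, iterable):
    """Like grouper, but the last group is shortened instead of padded."""
    args = [iter(iterable)] * n  # n references to the same iterator
    for group in zip_longest(*args, fillvalue=_MISSING):
        # Drop the padding that zip_longest added to the final group.
        yield [item for item in group if item is not _MISSING]


print(list(grouper_no_fill(3, range(10))))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

Using a fresh object() as the fill value means that None can still appear as a legitimate item in the input.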
As others have noted, the code you have given does exactly what you want. For another approach, using itertools.islice, see the following recipe:
from itertools import islice, chain

def batch(iterable, size):
    sourceiter = iter(iterable)
    while True:
        batchiter = islice(sourceiter, size)
        yield chain([batchiter.next()], batchiter)
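The recipe above relies on Python 2 behavior: batchiter.next() raises StopIteration when the source runs out, which silently ends the generator. Under Python 3.7 and later that exception would surface as a RuntimeError instead, so a Python 3 port has to catch it. The sketch below is one such port under that assumption (batch_iters is a hypothetical name). It also shows that every batch is a lazy iterator over the shared source, so each one has to be consumed before the next is requested.

```python
from itertools import chain, islice


def batch_iters(iterable, size):
    """Yield lazy iterators of up to `size` items each (hypothetical name)."""
    source = iter(iterable)
    while True:
        chunk = islice(source, size)
        try:
            first = next(chunk)
        except StopIteration:
            return  # source exhausted: end the generator explicitly
        yield chain([first], chunk)


# Consuming each batch before asking for the next gives the expected groups.
print([list(b) for b in batch_iters(range(10), 3)])
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

# Collecting the batches first and reading them later does not:
# each batch only keeps the one item it had already pulled.
print([list(b) for b in list(batch_iters(range(10), 3))])
# [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]
```

If batches need to outlive the loop iteration, materializing each one with list() or tuple() as it is produced avoids the second behavior.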
Weird, seems to work fine for me in Python 2.x
>>> def batch(iterable, n = 1):
...     current_batch = []
...     for item in iterable:
...         current_batch.append(item)
...         if len(current_batch) == n:
...             yield current_batch
...             current_batch = []
...     if current_batch:
...         yield current_batch
...
>>> for x in batch(range(0, 10), 3):
...     print x
...
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9]