This question already has an answer here:
-
How do you split a list into evenly sized chunks?
57 answers
I have a python list which runs into 1000's. Something like:
data=["I","am","a","python","programmer".....]
where, len(data)= say 1003
I would now like to create a subset of this list (data) by splitting the orginal list into chunks of 100. So, at the end, Id like to have something like:
data_chunk1=[.....] #first 100 items of list data
data_chunk2=[.....] #second 100 items of list data
.
.
.
data_chunk11=[.....] # remainder of the entries,& its len <=100, len(data_chunk_11)=3
Is there a pythonic way to achieve this task? Obviously I can use data[0:100] and so on, but I am assuming that is terribly non-pythonic and very inefficient.
Many thanks.
I'd say
chunks = [data[x:x+100] for x in xrange(0, len(data), 100)]
If you are using python 3.x range()
replaces python 2.x's xrange()
, changing the above code to:
chunks = [data[x:x+100] for x in range(0, len(data), 100)]
Actually I think using plain slices is the best solution in this case:
for i in range(0, len(data), 100):
chunk = data[i:i + 100]
...
If you want to avoid copying the slices, you could use itertools.islice()
, but it doesn't seem to be necessary here.
The itertools()
documentation also contains the famous "grouper" pattern:
def grouper(n, iterable, fillvalue=None):
"grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
args = [iter(iterable)] * n
return izip_longest(fillvalue=fillvalue, *args)
You would need to modify it to treat the last chunk correctly, so I think the straight-forward solution using plain slices is preferable.
chunks = [data[100*i:100*(i+1)] for i in range(len(data)/100 + 1)]
This is equivalent to the accepted answer. For example, shortening to batches of 10 for readability:
data = range(35)
print [data[x:x+10] for x in xrange(0, len(data), 10)]
print [data[10*i:10*(i+1)] for i in range(len(data)/10 + 1)]
Outputs:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 33, 34]]
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 33, 34]]