s = [1,2,3,4,5,6,7,8,9]
n = 3
zip(*[iter(s)]*n) # returns [(1,2,3),(4,5,6),(7,8,9)]
How does zip(*[iter(s)]*n)
work? What would it look like if it was written with more verbose code?
s = [1,2,3,4,5,6,7,8,9]
n = 3
zip(*[iter(s)]*n) # returns [(1,2,3),(4,5,6),(7,8,9)]
How does zip(*[iter(s)]*n)
work? What would it look like if it was written with more verbose code?
iter()
is an iterator over a sequence. [x] * n
produces a list containing n
quantity of x
, i.e. a list of length n
, where each element is x
. *arg
unpacks a sequence into arguments for a function call. Therefore you\'re passing the same iterator 3 times to zip()
, and it pulls an item from the iterator each time.
x = iter([1,2,3,4,5,6,7,8,9])
print zip(x, x, x)
The other great answers and comments explain well the roles of argument unpacking and zip().
As Ignacio and ujukatzel say, you pass to zip()
three references to the same iterator and zip()
makes 3-tuples of the integers—in order—from each reference to the iterator:
1,2,3,4,5,6,7,8,9 1,2,3,4,5,6,7,8,9 1,2,3,4,5,6,7,8,9
^ ^ ^
^ ^ ^
^ ^ ^
And since you ask for a more verbose code sample:
chunk_size = 3
L = [1,2,3,4,5,6,7,8,9]
# iterate over L in steps of 3
for start in range(0,len(L),chunk_size): # xrange() in 2.x; range() in 3.x
end = start + chunk_size
print L[start:end] # three-item chunks
Following the values of start
and end
:
[0:3) #[1,2,3]
[3:6) #[4,5,6]
[6:9) #[7,8,9]
FWIW, you can get the same result with map()
with an initial argument of None
:
>>> map(None,*[iter(s)]*3)
[(1, 2, 3), (4, 5, 6), (7, 8, 9)]
For more on zip()
and map()
: http://muffinresearch.co.uk/archives/2007/10/16/python-transposing-lists-with-map-and-zip/
I think one thing that\'s missed in all the answers (probably obvious to those familiar with iterators) but not so obvious to others is -
Since we have the same iterator, it gets consumed and the remaining elements are used by the zip. So if we simply used the list and not the iter eg.
l = range(9)
zip(*([l]*3)) # note: not an iter here, the lists are not emptied as we iterate
# output
[(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4), (5, 5, 5), (6, 6, 6), (7, 7, 7), (8, 8, 8)]
Using iterator, pops the values and only keeps remaining available, so for zip once 0 is consumed 1 is available and then 2 and so on. A very subtle thing, but quite clever!!!
iter(s)
returns an iterator for s.
[iter(s)]*n
makes a list of n times the same iterator for s.
So, when doing zip(*[iter(s)]*n)
, it extracts an item from all the three iterators from the list in order. Since all the iterators are the same object, it just groups the list in chunks of n
.
One word of advice for using zip this way. It will truncate your list if it\'s length is not evenly divisible. To work around this you could either use itertools.izip_longest if you can accept fill values. Or you could use something like this:
def n_split(iterable, n):
num_extra = len(iterable) % n
zipped = zip(*[iter(iterable)] * n)
return zipped if not num_extra else zipped + [iterable[-num_extra:], ]
Usage:
for ints in n_split(range(1,12), 3):
print \', \'.join([str(i) for i in ints])
Prints:
1, 2, 3
4, 5, 6
7, 8, 9
10, 11
It is probably easier to see what is happening in python interpreter or ipython
with n = 2
:
In [35]: [iter(\"ABCDEFGH\")]*2
Out[35]: [<iterator at 0x6be4128>, <iterator at 0x6be4128>]
So, we have a list of two iterators which are pointing to the same iterator object. Remember that iter
on a object returns an iterator object and in this scenario, it is the same iterator twice due to the *2
python syntactic sugar. Iterators also run only once.
Further, zip
takes any number of iterables (sequences are iterables) and creates tuple from i\'th element of each of the input sequences. Since both iterators are identical in our case, zip moves the same iterator twice for each 2-element tuple of output.
In [41]: help(zip)
Help on built-in function zip in module __builtin__:
zip(...)
zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]
Return a list of tuples, where each tuple contains the i-th element
from each of the argument sequences. The returned list is truncated
in length to the length of the shortest argument sequence.
The unpacking (*
) operator ensures that the iterators run to exhaustion which in this case is until there is not enough input to create a 2-element tuple.
This can be extended to any value of n
and zip(*[iter(s)]*n)
works as described.