This is the observed behavior:
In [4]: x = itertools.groupby(range(10), lambda x: True)
In [5]: y = next(x)
In [6]: next(x)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-6-5e4e57af3a97> in <module>()
----> 1 next(x)
StopIteration:
In [7]: y
Out[7]: (True, <itertools._grouper at 0x10a672e80>)
In [8]: list(y[1])
Out[8]: [9]
The expected output of list(y[1]) is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].
What's going on here?
I observed this on CPython 3.4.2, but others have seen this with CPython 3.5 and IronPython 2.9.9a0 (2.9.0.0) on Mono 4.0.30319.17020 (64-bit).
The observed behavior on Jython 2.7.0 and PyPy:
Python 2.7.10 (5f8302b8bf9f, Nov 18 2015, 10:46:46)
[PyPy 4.0.1 with GCC 4.8.4]
>>>> x = itertools.groupby(range(10), lambda x: True)
>>>> y = next(x)
>>>> next(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>> y
(True, <itertools._groupby object at 0x00007fb1096039a0>)
>>>> list(y[1])
[]
The problem is that you put all of the elements into one group, so after the first next call everything is already grouped. But the group's elements come from a lazy iterator, so you need to pass it immediately into some structure that takes an iterable, e.g. a list, to "print" or "save" it. After that, your range(10) is exhausted and the groupby iterator is finished.
The itertools.groupby documentation says that each returned group is itself an iterator that shares the underlying iterable with groupby, so when the groupby object is advanced, the previous group is no longer visible. So the assumption from that passage is that the generated list would be the empty list [], since the iterator has already advanced and met StopIteration; but instead, in CPython, the result is the surprising [9].
This is because the _grouper iterator lags one item behind the original iterator, which in turn is because groupby needs to peek one item ahead to see whether it belongs to the current or the next group, yet it must be able to yield this item later as the first item of the new group.
However, the currkey and currvalue attributes of the groupby object are not reset when the original iterator is exhausted, so currvalue still points to the last item from the iterator.
The CPython documentation actually contains equivalent Python code, which has exactly the same behaviour as the C version.
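The mechanism described above can be reproduced in pure Python. The following is only a sketch modelled on the equivalent code in the documentation, not a verbatim copy of it; the class name GroupBySketch is a hypothetical name chosen here so that it does not shadow itertools.groupby.

```python
class GroupBySketch:
    """Illustrative pure-Python grouping iterator (hypothetical name)."""

    def __init__(self, iterable, key=None):
        self.keyfunc = key if key is not None else (lambda v: v)
        self.it = iter(iterable)
        # A unique sentinel, so the very first comparison in __next__ succeeds.
        self.tgtkey = self.currkey = self.currvalue = object()

    def __iter__(self):
        return self

    def __next__(self):
        # Skip whatever is left of the current group.
        while self.currkey == self.tgtkey:
            # If this raises StopIteration, currvalue is NOT reset:
            # it keeps the last item that was read.
            self.currvalue = next(self.it)
            self.currkey = self.keyfunc(self.currvalue)
        self.tgtkey = self.currkey
        return (self.currkey, self._grouper(self.tgtkey))

    def _grouper(self, tgtkey):
        while self.currkey == tgtkey:
            # The stored item is yielded first, before reading ahead.
            yield self.currvalue
            try:
                self.currvalue = next(self.it)
            except StopIteration:
                return
            self.currkey = self.keyfunc(self.currvalue)


x = GroupBySketch(range(10), lambda v: True)
y = next(x)              # (True, <generator>); the generator has not started yet
try:
    next(x)              # consumes 1..9 looking for a new key, then runs dry
except StopIteration:
    pass
leftover = list(y[1])    # yields the stale currvalue, then stops
print(leftover)          # [9]
```

With this sketch the stale group yields exactly one element, the last item read from the underlying iterator, which matches the [9] seen in the question.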
Notably, __next__ finds the first item of the next group and stores its key in self.currkey and its value in self.currvalue. But the key is the line that assigns the result of next on the underlying iterator to self.currvalue. When next throws StopIteration, self.currvalue still contains the last item of the previous group. Now, when y[1] is made into a list, it first yields the value of self.currvalue, and only then runs next() on the underlying iterator (and meets StopIteration again).
Even though there is a Python equivalent in the documentation that behaves exactly like the authoritative C implementation in CPython, IronPython, Jython and PyPy give different results.
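Because reading a stale group gives different results on different implementations, the portable approach is the one described earlier: copy each group into a list before advancing groupby. A minimal sketch, where materialize_groups is a hypothetical helper name introduced here purely for illustration:

```python
import itertools


def materialize_groups(iterable, key=None):
    """Return (key, items) pairs, copying each group before groupby advances."""
    # list(g) runs while g is still the current group, so nothing is lost.
    return [(k, list(g)) for k, g in itertools.groupby(iterable, key)]


groups = materialize_groups(range(10), lambda v: True)
print(groups)
```

Here every item of range(10) ends up in the single group keyed by True, which is the result the question expected.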