I would like to split the output of an object that implements Iterator<(A,B)> into two objects that implement Iterator<A> and Iterator<B>. Since one of the outputs could be iterated more than the other, I'll need to buffer up the output of the Iterator<(A,B)> (because I can't rely on the Iterator<(A,B)> being cloneable). The problem is that the iterator could be infinite, so I can't simply collect the output of the iterator into two buffers and return iterators over the two buffers.
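For reference, the eager approach ruled out here can be sketched with Iterator::unzip. This is only an illustration of why it fails for infinite input; split_eagerly is a hypothetical name, not something from the question.

```rust
use std::vec::IntoIter;

// Hypothetical helper: drains `pairs` completely into two buffers,
// then hands back iterators over those buffers.
fn split_eagerly<A, B>(pairs: impl Iterator<Item = (A, B)>) -> (IntoIter<A>, IntoIter<B>) {
    let (left, right): (Vec<A>, Vec<B>) = pairs.unzip();
    (left.into_iter(), right.into_iter())
}

fn main() {
    let (mut nums, mut chars) = split_eagerly(vec![(1, 'a'), (2, 'b')].into_iter());
    // The two sides can be advanced independently of each other.
    assert_eq!(nums.next(), Some(1));
    assert_eq!(nums.next(), Some(2));
    assert_eq!(chars.next(), Some('a'));
    // split_eagerly((0..).map(|x| (x, x))) would not return:
    // unzip has to exhaust its input before producing anything.
}
```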
So it seems that I'll need to hold buffers of the A and B objects, and whenever one of the buffers is empty I'll fill it with samples from the Iterator<(A,B)> object. This means that I'll need two iterable structs that have mutable references to the input iterator (since both of them will need to call next() on the input to fill up the buffers), which is impossible.
So, is there any way to accomplish this in a safe way?
This is possible. As you identified, you need mutable references to the base iterator from both handles, which is possible using a type with "interior mutability", that is, one that uses unsafe code internally to expose a safe API for acquiring a &mut to aliasable data (i.e. contained in a &) by dynamically enforcing the invariants that the compiler normally enforces at compile time outside unsafe.
I'm assuming you're happy to keep the two iterators on a single thread [1], so, in this case, we want a RefCell. We also need to be able to have access to the RefCell from the two handles, entailing storing either a &RefCell<...> or an Rc<RefCell<...>>. The former would be too restrictive, as it would only allow us to use the pair of iterators in and below the stack frame in which the RefCell is created, while we want to be able to freely pass the iterators around, so Rc it is.
In summary, we're basically going to be storing an Rc<RefCell<Iterator<(A,B)>>>; there's just the question of buffering. The right tool for the job here is a RingBuf (renamed VecDeque in Rust 1.0), since we want efficient push/pop at the front and back. Thus, the thing we're sharing (i.e. inside the RefCell) could look like:
We can abbreviate the type actually being shared as type Shared<A, B, It> = Rc<RefCell<SharedInner<A, B, It>>>;, which allows us to define the iterators:
To implement next, the first thing to do is get a &mut to the SharedInner, via self.data.borrow_mut();. Then get an element out of it: check the right buffer, or otherwise get a new element from iter (remembering to buffer the left-over B):
Docs: RingBuf.pop_front, Option.or_else.
The iterator for the other side is similar. In total:
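A minimal sketch of the whole design, assuming current Rust, where RingBuf is named VecDeque. The split constructor, the field names first and second, and the example in main are assumptions for illustration, and an early return stands in for the Option.or_else chain mentioned above:

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

/// State shared by both handles: the source iterator and one buffer per side.
struct SharedInner<A, B, It> {
    iter: It,
    first: VecDeque<A>,
    second: VecDeque<B>,
}

type Shared<A, B, It> = Rc<RefCell<SharedInner<A, B, It>>>;

/// Handle yielding the `A` half of each pair.
struct First<A, B, It> {
    data: Shared<A, B, It>,
}

/// Handle yielding the `B` half of each pair.
struct Second<A, B, It> {
    data: Shared<A, B, It>,
}

impl<A, B, It: Iterator<Item = (A, B)>> Iterator for First<A, B, It> {
    type Item = A;

    fn next(&mut self) -> Option<A> {
        // Panics if the cell is already borrowed; that cannot happen here
        // because the borrow never outlives a single `next` call.
        let mut inner = self.data.borrow_mut();
        // Serve an element the other handle left behind, if there is one.
        if let Some(a) = inner.first.pop_front() {
            return Some(a);
        }
        // Otherwise pull a fresh pair and buffer the left-over `B`.
        let (a, b) = inner.iter.next()?;
        inner.second.push_back(b);
        Some(a)
    }
}

impl<A, B, It: Iterator<Item = (A, B)>> Iterator for Second<A, B, It> {
    type Item = B;

    fn next(&mut self) -> Option<B> {
        let mut inner = self.data.borrow_mut();
        if let Some(b) = inner.second.pop_front() {
            return Some(b);
        }
        // Mirror image of `First`: buffer the left-over `A`.
        let (a, b) = inner.iter.next()?;
        inner.first.push_back(a);
        Some(b)
    }
}

/// Hypothetical constructor: wraps `iter` and returns the two handles.
fn split<A, B, It: Iterator<Item = (A, B)>>(iter: It) -> (First<A, B, It>, Second<A, B, It>) {
    let data = Rc::new(RefCell::new(SharedInner {
        iter,
        first: VecDeque::new(),
        second: VecDeque::new(),
    }));
    (First { data: data.clone() }, Second { data })
}

fn main() {
    // An infinite source: (0, 0), (1, 10), (2, 20), ...
    let (mut first, mut second) = split((0..).map(|x| (x, x * 10)));
    let a: Vec<i32> = first.by_ref().take(3).collect();
    let b: Vec<i32> = second.by_ref().take(2).collect();
    println!("{:?} {:?}", a, b); // [0, 1, 2] [0, 10]
    println!("{:?}", first.next()); // Some(3)
}
```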
which prints
playpen.
There are various ways this could be optimised, e.g. when fetching in First, only buffer the left-over B if a Second handle exists.
[1] If you were looking to run them in separate threads, just replace the RefCell with a Mutex and the Rc with an Arc, and add the necessary bounds.
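The swap described in the footnote can be sketched as follows, again assuming current Rust with VecDeque. The split constructor and field names are hypothetical. In this sketch no explicit bounds are written on the types: the Send requirement on A, B and the source iterator is only checked where a handle is moved into thread::spawn, whereas generic code that spawns threads would need to spell those bounds out.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

struct SharedInner<A, B, It> {
    iter: It,
    first: VecDeque<A>,
    second: VecDeque<B>,
}

// Arc replaces Rc and Mutex replaces RefCell.
type Shared<A, B, It> = Arc<Mutex<SharedInner<A, B, It>>>;

struct First<A, B, It> {
    data: Shared<A, B, It>,
}

struct Second<A, B, It> {
    data: Shared<A, B, It>,
}

impl<A, B, It: Iterator<Item = (A, B)>> Iterator for First<A, B, It> {
    type Item = A;

    fn next(&mut self) -> Option<A> {
        // lock() stands in for borrow_mut(); unwrap() panics if another
        // thread panicked while holding the lock.
        let mut inner = self.data.lock().unwrap();
        if let Some(a) = inner.first.pop_front() {
            return Some(a);
        }
        let (a, b) = inner.iter.next()?;
        inner.second.push_back(b);
        Some(a)
    }
}

impl<A, B, It: Iterator<Item = (A, B)>> Iterator for Second<A, B, It> {
    type Item = B;

    fn next(&mut self) -> Option<B> {
        let mut inner = self.data.lock().unwrap();
        if let Some(b) = inner.second.pop_front() {
            return Some(b);
        }
        let (a, b) = inner.iter.next()?;
        inner.first.push_back(a);
        Some(b)
    }
}

/// Hypothetical constructor, as in the single-threaded design.
fn split<A, B, It: Iterator<Item = (A, B)>>(iter: It) -> (First<A, B, It>, Second<A, B, It>) {
    let data = Arc::new(Mutex::new(SharedInner {
        iter,
        first: VecDeque::new(),
        second: VecDeque::new(),
    }));
    (First { data: data.clone() }, Second { data })
}

fn main() {
    let (first, second) = split((0..100).map(|x| (x, x * 10)));
    // Each handle is consumed on its own thread; every element still
    // reaches each side exactly once, whichever thread pulls it.
    let t1 = thread::spawn(move || first.sum::<i32>());
    let t2 = thread::spawn(move || second.sum::<i32>());
    println!("{} {}", t1.join().unwrap(), t2.join().unwrap()); // 4950 49500
}
```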