I'm very new to Rust, coming from C# / Java / similar.
In C# we have `IEnumerable<T>` that can be used to iterate almost any kind of array or list. C# also has a `yield` keyword that you can use to return a lazy list. Here's an example:
    // Lazily returns the even numbers out of an enumerable
    IEnumerable<int> Evens(IEnumerable<int> input)
    {
        foreach (var x in input)
        {
            if (x % 2 == 0)
            {
                yield return x;
            }
        }
    }
This is a silly example of course. I know I could do this with Rust's `map` function, but I would like to know how to create my own methods that accept and return generic iterators.
From what I can gather, Rust has generic iterators that can be used similarly, but they are above my understanding. I see the `Iter`, `IntoIterator`, and `Iterator` types, and probably more, in the documentation, but no good way to understand them.
Can anyone provide clear examples of how to create something like above? Thank you!
P.S. The lazy aspect is optional. I am more concerned with abstraction away from specific list and array types.
Here is the full version of `Map`, and here is the function that builds it.
A minimal implementation would look something like
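As a minimal sketch of such an adapter (the names `MyMap` and `my_map` are hypothetical, chosen so they do not clash with the standard library's `Map`; the real definitions differ in detail):

```rust
/// Hypothetical stand-in for std's `Map`: wraps an iterator and a closure.
struct MyMap<I, F> {
    iter: I,
    f: F,
}

impl<I, F, B> Iterator for MyMap<I, F>
where
    I: Iterator,
    F: FnMut(I::Item) -> B,
{
    type Item = B;

    fn next(&mut self) -> Option<B> {
        // `Option::map` applies the closure only when an element is present.
        self.iter.next().map(&mut self.f)
    }
}

/// Hypothetical builder function, playing the role of `Iterator::map`.
fn my_map<I, F, B>(iter: I, f: F) -> MyMap<I, F>
where
    I: Iterator,
    F: FnMut(I::Item) -> B,
{
    MyMap { iter, f }
}
```

Calling `my_map(vec![1, 2, 3].into_iter(), |x| x * 10)` and collecting the result should give `[10, 20, 30]`.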
Playpen link. Note that the `map` used inside the iterator is the method on `Option`; this isn't recursively defined!
It's not too convenient to write, but boy is it fast!
Now, to write this for an arbitrary "enumerable" type one would change `map` to accept an `IntoIterator` and call `into_iter` on it.
`IntoIterator` is basically `IEnumerable`, only instead of `GetEnumerator` there's `into_iter`.
Implement the `Iterator` trait for the struct that should serve as iterator. You only need to implement the `next` method. The other methods have default implementations.
It is not possible to create an iterator that works with any container. The type system machinery needed for this doesn't exist yet.
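To illustrate the point about implementing only `next`, here is a small sketch. The `Countdown` struct and the `even_countdown` helper are hypothetical names made up for this example.

```rust
/// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    // `next` is the only required method.
    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

/// Uses `filter` and `collect`, which come for free as default methods.
fn even_countdown(from: u32) -> Vec<u32> {
    Countdown { remaining: from }
        .filter(|n| n % 2 == 0)
        .collect()
}
```

With this in place, `even_countdown(5)` should yield `[4, 2]`, even though `filter` and `collect` were never written for `Countdown`.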
First, forget about `IntoIterator` and other traits or types. The core iteration trait in Rust is `Iterator`. Its trimmed-down definition consists of an associated type `Item` and a single required method, `next()`.
As you probably know, you can think of an iterator as a cursor inside of some structure. The `next()` method advances this cursor forward, returning the element it previously pointed at. Naturally, if the collection is exhausted, there is nothing to return, and so `next()` returns `Option<Self::Item>`, not just `Self::Item`.
`Iterator` is a trait, and so it can be implemented by specific types. Note that `Iterator` itself is not a proper type which you can use as a return value or a function argument - you have to use concrete types which implement this trait.
The above statement may sound too restrictive - how to use arbitrary iterator types then? - but because of generics this is not so. If you want a function to accept arbitrary iterators, just make it generic in the corresponding argument, adding an `Iterator` bound on the corresponding type parameter.
Returning iterators from functions may be difficult, but see below.
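The trimmed-down trait and the generic-function pattern described above could be sketched as follows. The trait is named `SimpleIterator` here only to avoid clashing with the real `std::iter::Iterator`, and `sum_bytes` is a hypothetical function invented for illustration.

```rust
/// Trimmed-down shape of the iteration trait (hypothetical name, to avoid
/// clashing with `std::iter::Iterator`).
trait SimpleIterator {
    /// Type of the elements the iterator yields.
    type Item;
    /// Advances the cursor; `None` means the sequence is exhausted.
    fn next(&mut self) -> Option<Self::Item>;
}

/// Generic over any iterator that yields `u8` values.
fn sum_bytes<I>(iter: I) -> u32
where
    I: Iterator<Item = u8>,
{
    let mut total = 0u32;
    for b in iter {
        total += u32::from(b);
    }
    total
}
```

Because `sum_bytes` is bounded only by `Iterator<Item = u8>`, it accepts `vec![1u8, 2, 3].into_iter()` just as readily as `b"ab".iter().copied()`.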
For example, there is a method on `&[T]`, called `iter()`, which returns an iterator which yields references into the slice. This iterator is an instance of the `Iter` structure (`std::slice::Iter`). Its documentation page shows how `Iterator` is implemented for `Iter`.
This structure holds a reference to the original slice and some iteration state inside it. Its `next()` method updates this state and returns the next value, if there is any.
Any value whose type implements `Iterator` can be used in a `for` loop (the `for` loop in fact works with `IntoIterator`, but see below).
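As a rough illustration of a slice iterator that holds a reference plus some state, and of driving it with a `for` loop, consider the sketch below. `SliceCursor` is a hypothetical, simplified analogue; the real `std::slice::Iter` is implemented differently.

```rust
/// Hypothetical, simplified analogue of `std::slice::Iter`: a reference to
/// the slice plus the position of the next element.
struct SliceCursor<'a, T> {
    slice: &'a [T],
    pos: usize,
}

impl<'a, T> Iterator for SliceCursor<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        let item = self.slice.get(self.pos)?; // `None` once past the end
        self.pos += 1;
        Some(item)
    }
}

/// Collects what a `for` loop sees when driven by the cursor.
fn bytes_seen(s: &[u8]) -> Vec<u8> {
    let cursor = SliceCursor { slice: s, pos: 0 };
    let mut seen = Vec::new();
    for b in cursor {
        seen.push(*b);
    }
    seen
}
```

For the input `b"hello"`, `bytes_seen` should return the numeric byte values `[104, 101, 108, 108, 111]`, the same values that `s.iter()` would yield.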
The `Iterator` trait is actually more complex than the one above. It also defines a lot of transformation methods which consume the iterator they are called on and return a new iterator which somehow transforms or filters values from the original iterator. For example, the `enumerate()` method returns an iterator which yields values from the original iterator together with the positional number of each element.
`enumerate()` is defined in the trait itself: it simply wraps `self` in an `Enumerate` value with the counter set to zero. `Enumerate` is just a struct which contains an iterator and a counter inside it and which implements `Iterator<Item=(usize, I::Item)>`.
And this is how most iterator transformations are implemented: each transformation is a wrapping struct which wraps the original iterator and implements the `Iterator` trait by delegating to the original iterator and transforming the resulting value somehow. For example, for a byte-string slice `s: &'static [u8]`, `s.iter().enumerate()` returns a value of type `Enumerate<Iter<'static, u8>>`.
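The wrapping-struct pattern can be sketched like this. `MyEnumerate` and `indexed_bytes` are hypothetical names; the code mirrors the idea behind `Enumerate` and is not the standard library's source.

```rust
/// Hypothetical re-implementation of the wrapper behind `enumerate()`.
struct MyEnumerate<I> {
    iter: I,
    count: usize,
}

impl<I: Iterator> Iterator for MyEnumerate<I> {
    type Item = (usize, I::Item);

    fn next(&mut self) -> Option<(usize, I::Item)> {
        // Delegate to the wrapped iterator, then attach the current index.
        let item = self.iter.next()?;
        let index = self.count;
        self.count += 1;
        Some((index, item))
    }
}

/// Pairs each byte with its position, using the wrapper above.
fn indexed_bytes(s: &[u8]) -> Vec<(usize, u8)> {
    let wrapped = MyEnumerate { iter: s.iter(), count: 0 };
    wrapped.map(|(i, b)| (i, *b)).collect()
}
```

The result of `indexed_bytes(b"hi")` should match what `b"hi".iter().copied().enumerate()` produces, namely `[(0, 104), (1, 105)]`.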
`enumerate()` is defined in the `Iterator` trait directly, but it could be a standalone function as well, taking any `I: Iterator` and returning `Enumerate<I>`. The method works very similarly - it just uses the implicit `Self` type parameter instead of an explicitly named one.
You may wonder what the `IntoIterator` trait is. Well, it is just a convenience conversion trait which can be implemented by any type which can be converted to an iterator.
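A sketch of what implementing `IntoIterator` might look like for a custom type is given below. The `Shelf` container and `total_title_len` are hypothetical; the trait, its `Item` and `IntoIter` associated types, and `into_iter` are the real standard-library names.

```rust
/// Hypothetical container used only for illustration.
struct Shelf {
    books: Vec<String>,
}

// Consuming conversion: `Shelf` -> iterator over owned `String`s.
impl IntoIterator for Shelf {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        self.books.into_iter()
    }
}

// Borrowing conversion: `&Shelf` -> iterator over `&String`.
impl<'a> IntoIterator for &'a Shelf {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    fn into_iter(self) -> Self::IntoIter {
        self.books.iter()
    }
}

/// Total length of all titles; the loop calls `into_iter` on `&Shelf`.
fn total_title_len(shelf: &Shelf) -> usize {
    let mut total = 0;
    for title in shelf {
        total += title.len();
    }
    total
}
```

Providing both implementations is a design choice: callers can then borrow the shelf or consume it, depending on whether they write `&shelf` or `shelf`.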
`&'a [T]` can be converted into `Iter<'a, T>`, and so it has an `IntoIterator` implementation whose `into_iter()` simply returns `self.iter()`.
This trait is implemented for most container types and references to these types. It is in fact used by `for` loops - a value of any type which implements `IntoIterator` can be used in the `in` clause, so you can write `for b in s` for a slice `s` directly.
This is very nice from a learning and reading perspective because it has less noise (in the form of `iter()`-like methods). It even allows things like these:
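A small sketch of looping over a vector by shared reference, by mutable reference, and by value (the function name `three_loops` is hypothetical and exists only so the three loops can be run together):

```rust
/// Demonstrates the three `IntoIterator` flavours of `Vec<u8>`.
fn three_loops() -> (u32, Vec<u8>) {
    let mut v: Vec<u8> = vec![1, 2, 3];

    let mut sum = 0u32;
    for i in &v {
        // `i` is `&u8`; `v` is borrowed immutably.
        sum += u32::from(*i);
    }

    for i in &mut v {
        // `i` is `&mut u8`; `v` is borrowed mutably.
        *i *= 2;
    }

    let mut moved = Vec::new();
    for i in v {
        // `i` is `u8`; `v` is consumed by the loop.
        moved.push(i);
    }

    (sum, moved)
}
```

After the last loop `v` can no longer be used, because iterating by value moves the vector into the loop.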
`IntoIterator` is implemented differently for `&Vec<T>`, `&mut Vec<T>` and just `Vec<T>`.
Every `Iterator` implements `IntoIterator`, which performs an identity conversion (`into_iter()` just returns the iterator it is called on), so you can use `Iterator` instances in `for` loops as well.
Consequently, it makes sense to use `IntoIterator` in generic functions because it will make the API more convenient for the user. For example, the `enumerate()` function from above could be rewritten to accept any `I: IntoIterator` and return `Enumerate<I::IntoIter>`.
Now you can see how generics can be used to implement transformations with static typing easily. Rust does not have anything like C#/Python `yield` (but it is one of the most desired features, so one day it may appear in the language!), thus you need to wrap source iterators explicitly. For example, you can write something analogous to the above `Enumerate` structure which does the task you want.
However, the most idiomatic way would be to use existing combinators to do the work for you. For example, your code may be written as follows:
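A minimal sketch using the standard `filter` combinator (the wrapper function `evens_of` is hypothetical and collects into a `Vec` only so the result is easy to inspect):

```rust
/// Keeps the even numbers of any `i32` source, then collects them.
fn evens_of<I>(source: I) -> Vec<i32>
where
    I: IntoIterator<Item = i32>,
{
    // `filter` returns a lazy adapter; nothing runs until it is consumed.
    let evens = source.into_iter().filter(|&x| x % 2 == 0);
    evens.collect()
}
```

The same function accepts a `Vec<i32>`, a range such as `1..=4`, or any other value that converts into an iterator of `i32`.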
However, using combinators may turn ugly when you want to write custom combinator functions, because a lot of existing combinator functions accept closures (e.g. the `filter()` one above), and closures in Rust are implemented as values of anonymous types, so there is no way to write out the concrete type of the returned iterator in the function's signature.
There are several ways around this; one of them is using trait objects:
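A sketch of the trait-object approach, written with the `dyn` keyword that current Rust expects for trait objects (the function name `filter_even` is an assumption made for this example):

```rust
/// Returns a boxed, lazily evaluated iterator over the even numbers.
/// The `'a` bound keeps any borrows inside `I::IntoIter` visible in the
/// trait object type.
fn filter_even<'a, I>(source: I) -> Box<dyn Iterator<Item = i32> + 'a>
where
    I: IntoIterator<Item = i32>,
    I::IntoIter: 'a,
{
    Box::new(source.into_iter().filter(|&x| x % 2 == 0))
}
```

On Rust 1.26 and later, a return type of `impl Iterator<Item = i32>` is another option that avoids the heap allocation; the boxed form is shown here because it is the approach the text describes.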
Here we hide the actual iterator type returned by `filter()` behind a trait object. Note that in order to make the function fully generic I had to add a lifetime parameter and a corresponding bound to the `Box` trait object and the `I::IntoIter` associated type. This is necessary because `I::IntoIter` may contain arbitrary lifetimes inside it (just like the `Iter<'a, T>` type above), and we have to specify them in the trait object type (otherwise the lifetime information would be lost).
Trait objects created from the `Iterator` trait implement `Iterator` themselves, so you can continue using these iterators as usual:
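A short sketch of using a boxed iterator like any other iterator (`boxed_numbers` and `squares_up_to` are hypothetical helpers invented for this example):

```rust
/// Hypothetical helper: hides a range behind a boxed trait object.
fn boxed_numbers(limit: i32) -> Box<dyn Iterator<Item = i32>> {
    Box::new(1..=limit)
}

/// The boxed value works in `for` loops and with the usual combinators.
fn squares_up_to(limit: i32) -> Vec<i32> {
    let mut out = Vec::new();
    for n in boxed_numbers(limit).map(|n| n * n) {
        out.push(n);
    }
    out
}
```

Here `squares_up_to(4)` should produce `[1, 4, 9, 16]`: the caller never needs to know the concrete type behind the `Box`.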