What's the most efficient way to reuse an iter

2019-06-23 23:29发布

问题:

I'd like to reuse an iterator I made, so as to avoid paying to recreate it from scratch. But iterators don't seem to be cloneable and collect moves the iterator so I can't reuse it.

Here's more or less the equivalent of what I'm trying to do.

let my_iter = my_string.unwrap_or("A").chars().flat_map(|c|c.to_uppercase()).map(|c| Tag::from(c).unwrap() );
let my_struct = {
  one: my_iter.collect(),
  two: my_iter.map(|c|{(c,Vec::new())}).collect(),
  three: my_iter.filter_map(|c|if c.predicate(){Some(c)}else{None}).collect(),
  four: my_iter.map(|c|{(c,1.0/my_float)}).collect(),
  five: my_iter.map(|c|(c,arg_time.unwrap_or(time::now()))).collect(),
  //etc...
}

回答1:

Iterators in general are Clone-able if all their "pieces" are Clone-able. You have a couple of them in my_iter that are not: the anonymous closures (like the one in flat_map) and the ToUppercase struct returned by to_uppercase.

What you can do is:

  1. rebuild the whole thing (as @ArtemGr suggests). You could use a macro to avoid repetition. A bit ugly but should work.
  2. collect my_iter into a Vec before populating my_struct (since you seem to collect it anyway in there): let my_iter: Vec<char> = my_string.unwrap_or("A").chars().flat_map(|c|c.to_uppercase()).map(|c| Tag::from(c).unwrap() ).collect();
  3. create your own custom iterator. Without your definitions of my_string (since you call unwrap_or on it I assume it's not a String) and Tag it's hard to help you more concretely with this.


回答2:

You should profile before you optimize something, otherwise you might end making things both slower and more complex than they need to.

The iterators in your example

let my_iter = my_string.unwrap_or("A").chars().flat_map(|c|c.to_uppercase()).map(|c| Tag::from(c).unwrap() );

are thin structures allocated on the stack. Cloning them isn't going to be much cheaper than building them from scratch.

Constructing an iterator with .chars().flat_map(|c| c.to_uppercase()) takes only a single nanosecond when I benchmark it.

According to the same benchmark, wrapping iterator creation in a closure takes more time than simply building the iterator in-place.

Cloning a Vec iterator is not much faster than building it in-place, both are practically instant.

test construction_only    ... bench:           1 ns/iter (+/- 0)
test inplace_construction ... bench:         249 ns/iter (+/- 20)
test closure              ... bench:         282 ns/iter (+/- 18)
test vec_inplace_iter     ... bench:           0 ns/iter (+/- 0)
test vec_clone_iter       ... bench:           0 ns/iter (+/- 0)


回答3:

You may use closure to get identical iterators:

#[derive(Debug)]
struct MyStruct{
    one:Vec<char>,
    two:Vec<char>,
    three:String
}

fn main() {
    let my_string:String = "ABCD1234absd".into();
    let my_iter = || my_string.chars();
    let my_struct = MyStruct{
        one: my_iter().collect(),
        two: my_iter().filter(|x| x.is_numeric()).collect(),
        three: my_iter().filter(|x| x.is_lowercase()).collect()
    };
    println!("{:?}", my_struct);
}

See also this Correct way to return an Iterator? question.

Also you may clone iterator (see @Paolo Falabella answer about iterators cloneability):

fn main() {
    let v = vec![1,2,3,4,5,6,7,8,9]; 
    let mut i = v.iter().skip(2);
    let mut j = i.clone();
    println!("{:?}", i.take(3).collect::<Vec<_>>());
    println!("{:?}", j.filter(|&x| x%2==0).collect::<Vec<_>>());
}

Unfortunately I can't tell which way is more effective