How can I move a captured variable into a closure

2019-01-18 19:46发布

This code is an inefficient way of producing a unique set of items from an iterator. To accomplish this, I am attempting to use a Vec to keep track of values I've seen. I believe that this Vec needs to be owned by the innermost closure:

fn main() {
    let mut seen = vec![];
    let items = vec![vec![1i32, 2], vec![3], vec![1]];

    let a: Vec<_> = items
        .iter()
        .flat_map(move |inner_numbers| {
            inner_numbers.iter().filter_map(move |&number| {
                if !seen.contains(&number) {
                    seen.push(number);
                    Some(number)
                } else {
                    None
                }
            })
        })
        .collect();

    println!("{:?}", a);
}

However, compilation fails with:

error[E0507]: cannot move out of captured outer variable in an `FnMut` closure
 --> src/main.rs:8:45
  |
2 |     let mut seen = vec![];
  |         -------- captured outer variable
...
8 |             inner_numbers.iter().filter_map(move |&number| {
  |                                             ^^^^^^^^^^^^^^ cannot move out of captured outer variable in an `FnMut` closure

标签: closures rust
2条回答
Emotional °昔
2楼-- · 2019-01-18 20:13

It seems that the borrow checker is getting confused at the nested closures + mutable borrow. It might be worth filing an issue. Edit: See huon's answer for why this isn't a bug.

As a workaround, it's possible to resort to RefCell here:

use std::cell::RefCell;

fn main() {
    let seen = vec![];
    let items = vec![vec![1i32, 2], vec![3], vec![1]];

    let seen_cell = RefCell::new(seen);

    let a: Vec<_> = items
        .iter()
        .flat_map(|inner_numbers| {
            inner_numbers.iter().filter_map(|&number| {
                let mut borrowed = seen_cell.borrow_mut();

                if !borrowed.contains(&number) {
                    borrowed.push(number);
                    Some(number)
                } else {
                    None
                }
            })
        })
        .collect();

    println!("{:?}", a);
}
查看更多
爷、活的狠高调
3楼-- · 2019-01-18 20:24

This is a little surprising, but isn't a bug.

flat_map takes a FnMut as it needs to call the closure multiple times. The code with move on the inner closure fails because that closure is created multiple times, once for each inner_numbers. If I write the closures in explicit form (i.e. a struct that stores the captures and an implementation of one of the closure traits) your code looks (a bit) like

struct OuterClosure {
    seen: Vec<i32>
}
struct InnerClosure {
    seen: Vec<i32>
}
impl FnMut(&Vec<i32>) -> iter::FilterMap<..., InnerClosure> for OuterClosure {
    fn call_mut(&mut self, (inner_numbers,): &Vec<i32>) -> iter::FilterMap<..., InnerClosure> {
        let inner = InnerClosure {
            seen: self.seen // uh oh! a move out of a &mut pointer
        };
        inner_numbers.iter().filter_map(inner)
    }
}
impl FnMut(&i32) -> Option<i32> for InnerClosure { ... }

Which makes the illegality clearer: attempting to move out of the &mut OuterClosure variable.


Theoretically, just capturing a mutable reference is sufficient, since the seen is only being modified (not moved) inside the closure. However things are too lazy for this to work...

error: lifetime of `seen` is too short to guarantee its contents can be safely reborrowed
 --> src/main.rs:9:45
  |
9 |             inner_numbers.iter().filter_map(|&number| {
  |                                             ^^^^^^^^^
  |
note: `seen` would have to be valid for the method call at 7:20...
 --> src/main.rs:7:21
  |
7 |       let a: Vec<_> = items.iter()
  |  _____________________^
8 | |         .flat_map(|inner_numbers| {
9 | |             inner_numbers.iter().filter_map(|&number| {
10| |                 if !seen.contains(&number) {
... |
17| |         })
18| |         .collect();
  | |__________________^
note: ...but `seen` is only valid for the lifetime  as defined on the body at 8:34
 --> src/main.rs:8:35
  |
8 |           .flat_map(|inner_numbers| {
  |  ___________________________________^
9 | |             inner_numbers.iter().filter_map(|&number| {
10| |                 if !seen.contains(&number) {
11| |                     seen.push(number);
... |
16| |             })
17| |         })
  | |_________^

Removing the moves makes the closure captures work like

struct OuterClosure<'a> {
    seen: &'a mut Vec<i32>
}
struct InnerClosure<'a> {
    seen: &'a mut Vec<i32>
}
impl<'a> FnMut(&Vec<i32>) -> iter::FilterMap<..., InnerClosure<??>> for OuterClosure<'a> {
    fn call_mut<'b>(&'b mut self, inner_numbers: &Vec<i32>) -> iter::FilterMap<..., InnerClosure<??>> {
        let inner = InnerClosure {
            seen: &mut *self.seen // can't move out, so must be a reborrow
        };
        inner_numbers.iter().filter_map(inner)
    }
}
impl<'a> FnMut(&i32) -> Option<i32> for InnerClosure<'a> { ... }

(I've named the &mut self lifetime in this one, for pedagogical purposes.)

This case is definitely more subtle. The FilterMap iterator stores the closure internally, meaning any references in the closure value (that is, any references it captures) have to be valid as long as the FilterMap values are being thrown around, and, for &mut references, any references have to be careful to be non-aliased.

The compiler can't be sure flat_map won't, e.g. store all the returned iterators in a Vec<FilterMap<...>> which would result in a pile of aliased &muts... very bad! I think this specific use of flat_map happens to be safe, but I'm not sure it is in general, and there's certainly functions with the same style of signature as flat_map (e.g. map) would definitely be unsafe. (In fact, replacing flat_map with map in the code gives the Vec situation I just described.)

For the error message: self is effectively (ignoring the struct wrapper) &'b mut (&'a mut Vec<i32>) where 'b is the lifetime of &mut self reference and 'a is the lifetime of the reference in the struct. Moving the inner &mut out is illegal: can't move an affine type like &mut out of a reference (it would work with &Vec<i32>, though), so the only choice is to reborrow. A reborrow is going through the outer reference and so cannot outlive it, that is, the &mut *self.seen reborrow is a &'b mut Vec<i32>, not a &'a mut Vec<i32>.

This makes the inner closure have type InnerClosure<'b>, and hence the call_mut method is trying to return a FilterMap<..., InnerClosure<'b>>. Unfortunately, the FnMut trait defines call_mut as just

pub trait FnMut<Args>: FnOnce<Args> {
    extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output;
}

In particular, there's no connection between the lifetime of the self reference itself and the returned value, and so it is illegal to try to return InnerClosure<'b> which has that link. This is why the compiler is complaining that the lifetime is too short to be able to reborrow.

This is extremely similar to the Iterator::next method, and the code here is failing for basically the same reason that one cannot have an iterator over references into memory that the iterator itself owns. (I imagine a "streaming iterator" (iterators with a link between &mut self and the return value in next) library would be able to provide a flat_map that works with the code nearly written: would need "closure" traits with a similar link.)

Work-arounds include:

  • the RefCell suggested by Renato Zannon, which allows seen to be borrowed as a shared &. The desugared closure code is basically the same other than changing the &mut Vec<i32> to &Vec<i32>. This change means "reborrow" of the &'b mut &'a RefCell<Vec<i32>> can just be a copy of the &'a ... out of the &mut. It's a literal copy, so the lifetime is retained.
  • avoiding the laziness of iterators, to avoid returning the inner closure, specifically.collect::<Vec<_>>()ing inside the loop to run through the whole filter_map before returning.
fn main() {
    let mut seen = vec![];
    let items = vec![vec![1i32, 2], vec![3], vec![1]];

    let a: Vec<_> = items
        .iter()
        .flat_map(|inner_numbers| {
            inner_numbers
                .iter()
                .filter_map(|&number| if !seen.contains(&number) {
                    seen.push(number);
                    Some(number)
                } else {
                    None
                })
                .collect::<Vec<_>>()
                .into_iter()
        })
        .collect();

    println!("{:?}", a);
}

I imagine the RefCell version is more efficient.

查看更多
登录 后发表回答