"Does not live long enough" error with CSV and dataflow

Posted 2019-09-22 05:41

fn main() {
    timely::execute_from_args(std::env::args().skip(0), move |worker| {

        let (mut input, probe) = worker.dataflow::<_, _, _>(|scope| {
            let (input, data) = scope.new_collection();
            let probe = data.inspect(|x| println!("observed data: {:?}", x)).probe();

            (input, probe)
        });

        let mut rdr = csv::ReaderBuilder::new()
            .has_headers(false)
            .flexible(true)
            .delimiter(b'\t')
            .from_reader(io::stdin());

        for result in rdr.deserialize() {
            let record = result.expect("a CSV record");

            let mut vec = Vec::new();
            for i in 0..13 {
                vec.push(&record[i]);

            }

            input.insert(vec);
        }
    });
}

The error is that record does not live long enough. I am trying to read each CSV record into a vector and then insert the records into the dataflow. Each part works on its own: I can read the CSV into a vector, and I can use the dataflow elsewhere.

Tags: rust
1 answer
做自己的国王
#2 · 2019-09-22 05:56

The problem is that you are pushing a borrowed value, &record[i], into the Vec. The & means borrow, so the original value, record, must outlive the borrower, vec.

That might seem fine: both are declared in the for body, so they have the same lifetime and neither outlives the other. It fails because the line input.insert(vec) moves vec. After the move, vec is owned by input and, as far as I understand, lives as long as input does. Because input is declared outside the for body, the moved vec outlives record, and with it every &record[i].
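The same shape of problem can be reproduced with the standard library alone. The following is a minimal sketch, not the question's actual setup: collect_owned and sink are hypothetical names, with sink standing in for input and a Vec<String> standing in for the CSV record.

```rust
// Hypothetical stand-in for the question's setup: `sink` plays the role of
// `input`, and each `record` is created fresh inside the loop body.
fn collect_owned(lines: &[&str]) -> Vec<Vec<String>> {
    // Declared outside the loop, like `input` in the question.
    let mut sink: Vec<Vec<String>> = Vec::new();

    for line in lines {
        // A new owned record per iteration; it is dropped at the end of the body.
        let record: Vec<String> = line.split('\t').map(String::from).collect();

        // If `vec` were a `Vec<&String>` filled with `vec.push(&record[i])`,
        // moving it into `sink` would be rejected by the compiler, because
        // `record` does not live long enough.
        let mut vec: Vec<String> = Vec::new();
        for field in &record {
            // Cloning gives `vec` its own copy, so nothing borrows `record`.
            vec.push(field.clone());
        }

        // `vec` is moved into `sink`; it holds no references into `record`.
        sink.push(vec);
    }

    sink
}
```

Calling collect_owned(&["a\tb", "c\td"]) returns two rows of owned strings, and each row remains usable after the record it was built from has been dropped.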

There are a few solutions, and all of them remove the dependency between record and input:

  1. If record holds primitive values, or anything else that implements the Copy trait, you can simply omit the borrow and the value will be copied into the vector: vec.push(record[i]).
  2. Clone the record value into the vector: vec.push(record[i].clone()). This creates a clone that vec owns, which avoids the borrow.
  3. If the elements in the record array implement neither Copy nor Clone, you have to move them. Because the values are in an array, you cannot move individual elements out by index and leave holes behind; the array has to be moved as a whole. One solution is to turn it into an iterator that moves the values out one by one, and then push them into the vector:

    for element in record.into_iter().take(13) {
        vec.push(element);
    }
    
  4. Replace the record value with a different value. To move only some of the elements out of the array, swap each one for a placeholder: the element leaves the array, the placeholder takes its slot, and the array stays valid.

    for i in 0..13 {
        // mem::replace needs a mutable reference, so record must be declared `mut`.
        vec.push(std::mem::replace(&mut record[i], Default::default()));
    }
    

    You can replace Default::default() with another value if you want to.
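The two move-based options (3 and 4) can be tried out without the csv crate. The following is a rough sketch under assumptions: move_first and take_first are hypothetical helper names, and a Vec<String> stands in for the record, so it shows the technique and not the exact types from the question.

```rust
// Option 3: consume the record and move its first `n` fields out.
fn move_first(record: Vec<String>, n: usize) -> Vec<String> {
    let mut vec = Vec::new();
    // `into_iter` takes ownership of `record`, so each element is moved, not borrowed.
    for element in record.into_iter().take(n) {
        vec.push(element);
    }
    vec
}

// Option 4: move fields out one by one, leaving default values behind.
// Panics if `n` is larger than the number of fields.
fn take_first(record: &mut [String], n: usize) -> Vec<String> {
    let mut vec = Vec::new();
    for i in 0..n {
        // `std::mem::take(x)` is shorthand for `std::mem::replace(x, Default::default())`.
        vec.push(std::mem::take(&mut record[i]));
    }
    vec
}
```

With move_first the record is gone after the call. With take_first the caller keeps the record, but every slot that was taken now holds an empty String.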

I hope this helps. I'm still a noob in Rust, so improvements and critiques of the answer are welcome :)
