Cannot split a string into string slices with expl

2019-01-29 13:29发布

问题:

I'm writing a library that should read from something implementing the BufRead trait; a network data stream, standard input, etc. The first function is supposed to read a data unit from that reader and return a populated struct filled mostly with &'a str values parsed from a frame from the wire.

Here is a minimal version:

mod mymod {
    use std::io::prelude::*;
    use std::io;

    pub fn parse_frame<'a, T>(mut reader: T)
    where
        T: BufRead,
    {
        for line in reader.by_ref().lines() {
            let line = line.expect("reading header line");
            if line.len() == 0 {
                // got empty line; done with header
                break;
            }
            // split line
            let splitted = line.splitn(2, ':');
            let line_parts: Vec<&'a str> = splitted.collect();

            println!("{} has value {}", line_parts[0], line_parts[1]);
        }
        // more reads down here, therefore the reader.by_ref() above
        // (otherwise: use of moved value).
    }
}

use std::io;

fn main() {
    let stdin = io::stdin();
    let locked = stdin.lock();
    mymod::parse_frame(locked);
}

An error shows up which I cannot fix after trying different solutions:

error: `line` does not live long enough
  --> src/main.rs:16:28
   |
16 |             let splitted = line.splitn(2, ':');
   |                            ^^^^ does not live long enough
...
20 |         }
   |         - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the body at 8:4...
  --> src/main.rs:8:5
   |
8  | /     {
9  | |         for line in reader.by_ref().lines() {
10 | |             let line = line.expect("reading header line");
11 | |             if line.len() == 0 {
...  |
22 | |         // (otherwise: use of moved value).
23 | |     }
   | |_____^

The lifetime 'a is defined on a struct and implementation of a data keeper structure because the &str requires an explicit lifetime. These code parts were removed as part of the minimal example.

BufReader has a lines() method which returns Result<String, Err>. I handle errors using expect or match and thus unpack the Result so that the program now has the bare String. This will then be done multiple times to populate a data structure.

Many answers say that the unwrap result needs to be bound to a variable otherwise it gets lost because it is a temporary value. But I already saved the unpacked Result value in the variable line and I still get the error.

  1. How to fix this error - could not get it working after hours trying.

  2. Does it make sense to do all these lifetime declarations just for &str in a data keeper struct? This will be mostly a readonly data structure, at most replacing whole field values. String could also be used, but have found articles saying that String has lower performance than &str - and this frame parser function will be called many times and is performance-critical.

Similar questions exist on Stack Overflow, but none quite answers the situation here.

For completeness and better understanding, following is an excerpt from complete source code as to why lifetime question came up:

Data structure declaration:

// tuple
pub struct Header<'a>(pub &'a str, pub &'a str);

pub struct Frame<'a> {
    pub frameType: String,
    pub bodyType: &'a str,
    pub port: &'a str,
    pub headers: Vec<Header<'a>>,
    pub body: Vec<u8>,
}

impl<'a> Frame<'a> {
    pub fn marshal(&'a self) {
        //TODO
        println!("marshal!");
    }
}

Complete function definition:

pub fn parse_frame<'a, T>(mut reader: T) -> Result<Frame<'a>, io::Error> where T: BufRead {

回答1:

Your problem can be reduced to this:

fn foo<'a>() {
    let thing = String::from("a b");
    let parts: Vec<&'a str> = thing.split(" ").collect();
}

You create a String inside your function, then declare that references to that string are guaranteed to live for the lifetime 'a. Unfortunately, the lifetime 'a isn't under your control — the caller of the function gets to pick what the lifetime is. That's how generic parameters work!

What would happen if the caller of the function specified the 'static lifetime? How would it be possible for your code, which allocates a value at runtime, to guarantee that the value lives longer than even the main function? It's not possible, which is why the compiler has reported an error.

Once you've gained a bit more experience, the function signature fn foo<'a>() will jump out at you like a red alert — there's a generic parameter that isn't used. That's most likely going to mean bad news.


return a populated struct filled mostly with &'a str

You cannot possibly do this with the current organization of your code. References have to point to something. You are not providing anywhere for the pointed-at values to live. You cannot return an allocated String as a string slice.

Before you jump to it, no you cannot store a value and a reference to that value in the same struct.

Instead, you need to split the code that creates the String and that which parses a &str and returns more &str references. That's how all the existing zero-copy parsers work. You could look at those for inspiration.

String has lower performance than &str

No, it really doesn't. Creating lots of extraneous Strings is a bad idea, sure, just like allocating too much is a bad idea in any language.



回答2:

Maybe the following program gives clues for others who also also having their first problems with lifetimes:

fn main() {
    // using String und &str Slice
    let my_str: String = "fire".to_owned();
    let returned_str: MyStruct = my_func_str(&my_str);
    println!("Received return value: {ret}", ret = returned_str.version);

    // using Vec<u8> und &[u8] Slice
    let my_vec: Vec<u8> = "fire".to_owned().into_bytes();
    let returned_u8: MyStruct2 = my_func_vec(&my_vec);
    println!("Received return value: {ret:?}", ret = returned_u8.version);
}


// using String -> str
fn my_func_str<'a>(some_str: &'a str) -> MyStruct<'a> {
    MyStruct {
        version: &some_str[0..2],
    }
}

struct MyStruct<'a> {
    version: &'a str,
}


// using Vec<u8> -> & [u8]
fn my_func_vec<'a>(some_vec: &'a Vec<u8>) -> MyStruct2<'a> {
    MyStruct2 {
        version: &some_vec[0..2],
    }
}

struct MyStruct2<'a> {
    version: &'a [u8],
}


标签: rust