How to update-or-insert on a Vec?

2019-01-08 02:13发布

问题:

I'm writing a data structure in Rust. It contains a Vec of key-value pairs. When inserting into the structure, I need to find a matching key and update both the key and the value (which is actually a child pointer). The code looks a bit like this, where pivots is a ref mut to Vec<Pivot> and Pivot is just a struct with two fields:

match pivots.iter_mut().find(|ref p| key <= p.min_key) { // first mutable borrow
    Some(ref mut pivot) => {
        // If there is one, insert into it and update the pivot key
        pivot.min_key = key;
        pivot.child.insert(key, value) // recursive call
    },
    // o/w, insert a new leaf at the end
    None => pivots.push(Pivot /* ... */) // second mutable borrow
}

But there's a problem. Even though I don't use the mutable iterator in the second arm of the match, the borrow checker complains that I "cannot borrow *pivots as mutable more than once at a time".

This makes perfect sense to me, because the first borrow is still in scope, even though it's not used in that case of the match. It's a little inconvenient: a cleverer checker could certainly tell that the borrows are non-overlapping. I've seen someone online advising to use early-return to avoid the problem, like this:

match pivots.iter_mut().find(|ref p| key <= p.min_key) {
    Some(ref mut pivot) => {
        pivot.min_key = key;
        pivot.child.insert(key, value);
        return
    },
    None => ()
};
pivots.push(Pivot /* ... */)

but this seems hard-to-understand, especially when it means breaking out this code into its own function to allow the return. Is there a more idiomatic way to perform the update-or-insert operation?

回答1:

There is a merged RFC "non-lexical lifetimes" which solves this in the long run. Using the non-lexical lifetimes in Rust 2018, available in Rust 1.31, your code works as-is:

Playground

use std::collections::HashMap;

pub struct Pivot {
    pub min_key: u64,
    pub child: HashMap<u64, ()>,
}

fn update_or_append(pivots: &mut Vec<Pivot>, key: u64, value: ()) {
    match pivots.iter_mut().find(|ref p| key <= p.min_key) {
        Some(pivot) => {
            // If there is one, insert into it and update the pivot key
            pivot.min_key = key;
            pivot.child.insert(key, value);
            return;
        }
        // o/w insert a new leaf at the end
        None => {
            let mut m = HashMap::new();
            m.insert(key, value);
            pivots.push(Pivot {
                min_key: key,
                child: m,
            });
        }
    }
}

fn main() {
    let mut pivots = Vec::new();
    update_or_append(&mut pivots, 100, ());
}

Before Rust 2018, you can workaround it with some additional control flow handling.

You could have your match produce a bool value whether the update happened or not, and have a conditional block below using that value to append. I consider putting the "update-or-append" logic into a separate function (using return after the update) the more idiomatic approach:

Playground

use std::collections::HashMap;

pub struct Pivot {
    pub min_key: u64,
    pub child: HashMap<u64, ()>,
}

fn update_or_append(pivots: &mut Vec<Pivot>, key: u64, value: ()) {
    if let Some(pivot) = pivots.iter_mut().find(|ref p| key <= p.min_key) {
        // If there is one, insert into it and update the pivot key
        pivot.min_key = key;
        pivot.child.insert(key, value);
        return;
    }
    // otherwise insert a new leaf at the end
    let mut m = HashMap::new();
    m.insert(key, value);
    pivots.push(Pivot {
        min_key: key,
        child: m,
    });
}

fn main() {
    let mut pivots = Vec::new();
    update_or_append(&mut pivots, 100, ());
}

Using a bool to track whether the update happened:

Playground

use std::collections::HashMap;

pub struct Pivot {
    pub min_key: u64,
    pub child: HashMap<u64, ()>,
}

fn update_or_append(pivots: &mut Vec<Pivot>, key: u64, value: ()) {
    let updated = match pivots.iter_mut().find(|ref p| key <= p.min_key) {
        Some(pivot) => {
            // If there is one, insert into it and update the pivot key
            pivot.min_key = key;
            pivot.child.insert(key, value);
            true
        }
        // o/w insert a new leaf at the end below
        None => false,
    };
    if !updated {
        let mut m = HashMap::new();
        m.insert(key, value);
        pivots.push(Pivot {
            min_key: key,
            child: m,
        });
    }
}

fn main() {
    let mut pivots = Vec::new();
    update_or_append(&mut pivots, 100, ());
}


回答2:

It seems like the best way to do this is to use the index instead of an iterator.

match pivots.iter().position(|ref p| key <= p.min_key) {
    Some(i) => {
        // If there is one, insert into it and update the pivot key
        let pivot = &mut pivots[i];
        pivot.min_key = key;
        pivot.child.insert(key, value)
    },
    // o/w, insert a new leaf at the end
    None => pivots.push(Pivot /* ... */)
}

This way, there's no need for iter_mut. I'm still not entirely happy with this alternative, because it means using an explicit index instead of an iterator. This is fine for a Vec but wouldn't work for a container with a structure that doesn't have O(1) random-access indexing.

I'd accept a different answer that lets me avoid using an index.