How to write a safe wrap for HashMap with default

2019-02-11 09:01发布

问题:

I implemented a wrap for HashMap with default values and I would like to know if it's safe.

When get is called, the internal map may be resized and previous references to values (obtained with get) would be pointing to invalid address. I tried to solve this problem using the idea that "all problems in computer science can be solved by another level of indirection" (Butler Lampson). I would like to know if this trick makes this code safe.

use std::cell::UnsafeCell;
use std::collections::HashMap;
use std::hash::Hash;

pub struct DefaultHashMap<I: Hash + Eq, T: Clone> {
    default: T,
    map: UnsafeCell<HashMap<I, Box<T>>>,
}

impl<I: Hash + Eq, T: Clone> DefaultHashMap<I, T> {
    pub fn new(default: T) -> Self {
        DefaultHashMap {
            default: default,
            map: UnsafeCell::new(HashMap::new()),
        }
    }

    pub fn get_mut(&mut self, v: I) -> &mut T {
        let m = unsafe { &mut *self.map.get() };
        m.entry(v).or_insert_with(|| Box::new(self.default.clone()))
    }

    pub fn get(&self, v: I) -> &T {
        let m = unsafe { &mut *self.map.get() };
        m.entry(v).or_insert_with(|| Box::new(self.default.clone()))
    }
}

#[test]
fn test() {
    let mut m = DefaultHashMap::new(10usize);
    *m.get_mut(4) = 40;
    let a = m.get(4);
    for i in 1..1024 {
        m.get(i);
    }
    assert_eq!(a, m.get(4));
    assert_eq!(40, *m.get(4));
}

(Playground)

回答1:

Since you cannot1 mutate the value returned from get, I'd just return a reference to the default value when the value is missing. When you call get_mut however, you can then add the value to the map and return the reference to the newly-added value.

This has the nice benefit of not needing any unsafe code.

use std::{borrow::Borrow, collections::HashMap, hash::Hash};

pub struct DefaultHashMap<K, V> {
    default: V,
    map: HashMap<K, V>,
}

impl<K, V> DefaultHashMap<K, V>
where
    K: Hash + Eq,
    V: Clone,
{
    pub fn new(default: V) -> Self {
        DefaultHashMap {
            default,
            map: HashMap::new(),
        }
    }

    pub fn get_mut(&mut self, v: K) -> &mut V {
        let def = &self.default;
        self.map.entry(v).or_insert_with(|| def.clone())
    }

    pub fn get<B>(&self, v: B) -> &V
    where
        B: Borrow<K>,
    {
        self.map.get(v.borrow()).unwrap_or(&self.default)
    }
}

#[test]
fn test() {
    let mut m = DefaultHashMap::new(10usize);
    *m.get_mut(4) = 40;
    let a = m.get(4);
    for i in 1..1024 {
        m.get(i);
    }
    assert_eq!(a, m.get(4));
    assert_eq!(40, *m.get(4));
}

[1]: Technically this will have different behavior if your default value contains internal mutability. In that case, modifications to the default value would apply across the collection. If that's a concern, you'd need to use a solution closer to your original.



回答2:

I think that you are covered by the borrowing rules here.

Applying the Mutability XOR Aliasing principle here, unsafety would crop up if you could maintain multiple paths to the same value and mutate something at the same time.

In your case, however:

  • while the internal HashMap can be mutated even through an aliasable reference to DefaultHashMap, nobody has a reference into the HashMap itself
  • while there are references into the Box, there is no possibility here to erase a Box, so no dangling pointer from here
  • since you take care to preserve the borrowing relationship (ie, &mut T is only obtained through a &mut DefaultHashMap), it is not possible to have a &mut T and an alias into it

So, your short example looks safe, however be especially wary of not accidentally introducing a method on &DefaultHashMap which would allow to modify an existing value as this would be a short road to dangling pointers.

Personally, I would execute all tests with an Option<String>.



标签: rust