I implemented a wrap for HashMap
with default values and I would like to know if it's safe.
When get
is called, the internal map may be resized and previous references to values (obtained with get
) would be pointing to invalid address. I tried to solve this problem using the idea that "all problems in computer science can be solved by another level of indirection" (Butler Lampson). I would like to know if this trick makes this code safe.
use std::cell::UnsafeCell;
use std::collections::HashMap;
use std::hash::Hash;
pub struct DefaultHashMap<I: Hash + Eq, T: Clone> {
default: T,
map: UnsafeCell<HashMap<I, Box<T>>>,
}
impl<I: Hash + Eq, T: Clone> DefaultHashMap<I, T> {
pub fn new(default: T) -> Self {
DefaultHashMap {
default: default,
map: UnsafeCell::new(HashMap::new()),
}
}
pub fn get_mut(&mut self, v: I) -> &mut T {
let m = unsafe { &mut *self.map.get() };
m.entry(v).or_insert_with(|| Box::new(self.default.clone()))
}
pub fn get(&self, v: I) -> &T {
let m = unsafe { &mut *self.map.get() };
m.entry(v).or_insert_with(|| Box::new(self.default.clone()))
}
}
#[test]
fn test() {
let mut m = DefaultHashMap::new(10usize);
*m.get_mut(4) = 40;
let a = m.get(4);
for i in 1..1024 {
m.get(i);
}
assert_eq!(a, m.get(4));
assert_eq!(40, *m.get(4));
}
(Playground)
Since you cannot1 mutate the value returned from get
, I'd just return a reference to the default value when the value is missing. When you call get_mut
however, you can then add the value to the map and return the reference to the newly-added value.
This has the nice benefit of not needing any unsafe
code.
use std::{borrow::Borrow, collections::HashMap, hash::Hash};
pub struct DefaultHashMap<K, V> {
default: V,
map: HashMap<K, V>,
}
impl<K, V> DefaultHashMap<K, V>
where
K: Hash + Eq,
V: Clone,
{
pub fn new(default: V) -> Self {
DefaultHashMap {
default,
map: HashMap::new(),
}
}
pub fn get_mut(&mut self, v: K) -> &mut V {
let def = &self.default;
self.map.entry(v).or_insert_with(|| def.clone())
}
pub fn get<B>(&self, v: B) -> &V
where
B: Borrow<K>,
{
self.map.get(v.borrow()).unwrap_or(&self.default)
}
}
#[test]
fn test() {
let mut m = DefaultHashMap::new(10usize);
*m.get_mut(4) = 40;
let a = m.get(4);
for i in 1..1024 {
m.get(i);
}
assert_eq!(a, m.get(4));
assert_eq!(40, *m.get(4));
}
[1]: Technically this will have different behavior if your default value contains internal mutability. In that case, modifications to the default value would apply across the collection. If that's a concern, you'd need to use a solution closer to your original.
I think that you are covered by the borrowing rules here.
Applying the Mutability XOR Aliasing principle here, unsafety would crop up if you could maintain multiple paths to the same value and mutate something at the same time.
In your case, however:
- while the internal
HashMap
can be mutated even through an aliasable reference to DefaultHashMap
, nobody has a reference into the HashMap
itself
- while there are references into the
Box
, there is no possibility here to erase a Box
, so no dangling pointer from here
- since you take care to preserve the borrowing relationship (ie,
&mut T
is only obtained through a &mut DefaultHashMap
), it is not possible to have a &mut T
and an alias into it
So, your short example looks safe, however be especially wary of not accidentally introducing a method on &DefaultHashMap
which would allow to modify an existing value as this would be a short road to dangling pointers.
Personally, I would execute all tests with an Option<String>
.