When is it necessary to circumvent Rust's borr

2020-04-16 17:22发布

I'm implementing Conway's game of life to teach myself Rust. The idea is to implement a single-threaded version first, optimize it as much as possible, then do the same for a multi-threaded version.

I wanted to implement an alternative data layout which I thought might be more cache-friendly. The idea is to store the status of two cells for each point on a board next to each other in memory in a vector, one cell for reading the current generation's status from and one for writing the next generation's status to, alternating the access pattern for each generation's computation (which can be determined at compile time).

The basic data structures are as follows:

#[repr(u8)]
pub enum CellStatus {
    DEAD,
    ALIVE,
}

/** 2 bytes */
pub struct CellRW(CellStatus, CellStatus);

pub struct TupleBoard {
    width: usize,
    height: usize,
    cells: Vec<CellRW>,
}

/** used to keep track of current pos with iterator e.g. */
pub struct BoardPos {
    x_pos: usize,
    y_pos: usize,
    offset: usize,
}

pub struct BoardEvo {
    board: TupleBoard,
}

The function that is causing me troubles:

impl BoardEvo {
    fn evolve_step<T: RWSelector>(&mut self) {
        for (pos, cell) in self.board.iter_mut() {
            //pos: BoardPos, cell: &mut CellRW
            let read: &CellStatus = T::read(cell); //chooses the right tuple half for the current evolution step
            let write: &mut CellStatus = T::write(cell);

            let alive_count = pos.neighbours::<T>(&self.board).iter() //<- can't borrow self.board again!
                    .filter(|&&status| status == CellStatus::ALIVE)
                    .count();

            *write = CellStatus::evolve(*read, alive_count);
        }
    }
}

impl BoardPos {
    /* ... */
    pub fn neighbours<T: RWSelector>(&self, board: &BoardTuple) -> [CellStatus; 8] {
        /* ... */
    }
}

The trait RWSelector has static functions for reading from and writing to a cell tuple (CellRW). It is implemented for two zero-sized types L and R and is mainly a way to avoid having to write different methods for the different access patterns.

The iter_mut() method returns a BoardIter struct which is a wrapper around a mutable slice iterator for the cells vector and thus has &mut CellRW as Item type. It is also aware of the current BoardPos (x and y coordinates, offset).

I thought I'd iterate over all cell tuples, keep track of the coordinates, count the number of alive neighbours (I need to know coordinates/offsets for this) for each (read) cell, compute the cell status for the next generation and write to the respective another half of the tuple.

Of course, in the end, the compiler showed me the fatal flaw in my design, as I borrow self.board mutably in the iter_mut() method and then try to borrow it again immutably to get all the neighbours of the read cell.

I have not been able to come up with a good solution for this problem so far. I did manage to get it working by making all references immutable and then using an UnsafeCell to turn the immutable reference to the write cell into a mutable one. I then write to the nominally immutable reference to the writing part of the tuple through the UnsafeCell. However, that doesn't strike me as a sound design and I suspect I might run into issues with this when attempting to parallelize things.

Is there a way to implement the data layout I proposed in safe/idiomatic Rust or is this actually a case where you actually have to use tricks to circumvent Rust's aliasing/borrow restrictions?

Also, as a broader question, is there a recognizable pattern for problems which require you to circumvent Rust's borrow restrictions?

标签: rust
1条回答
放我归山
2楼-- · 2020-04-16 17:48

When is it necessary to circumvent Rust's borrow checker?

It is needed when:

  • the borrow checker is not advanced enough to see that your usage is safe
  • you do not wish to (or cannot) write the code in a different pattern

As a concrete case, the compiler cannot tell that this is safe:

let mut array = [1, 2];
let a = &mut array[0];
let b = &mut array[1];

The compiler doesn't know what the implementation of IndexMut for a slice does at this point of compilation (this is a deliberate design choice). For all it knows, arrays always return the exact same reference, regardless of the index argument. We can tell that this code is safe, but the compiler disallows it.

You can rewrite this in a way that is obviously safe to the compiler:

let mut array = [1, 2];
let (a, b) = array.split_at_mut(1);
let a = &mut a[0];
let b = &mut b[0];

How is this done? split_at_mut performs a runtime check to ensure that it actually is safe:

fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
    let len = self.len();
    let ptr = self.as_mut_ptr();

    unsafe {
        assert!(mid <= len);

        (from_raw_parts_mut(ptr, mid),
         from_raw_parts_mut(ptr.offset(mid as isize), len - mid))
    }
}

For an example where the borrow checker is not yet as advanced as it can be, see What are non-lexical lifetimes?.

I borrow self.board mutably in the iter_mut() method and then try to borrow it again immutably to get all the neighbours of the read cell.

If you know that the references don't overlap, then you can choose to use unsafe code to express it. However, this means you are also choosing to take on the responsibility of upholding all of Rust's invariants and avoiding undefined behavior.

The good news is that this heavy burden is what every C and C++ programmer has to (or at least should) have on their shoulders for every single line of code they write. At least in Rust, you can let the compiler deal with 99% of the cases.

In many cases, there's tools like Cell and RefCell to allow for interior mutation. In other cases, you can rewrite your algorithm to take advantage of a value being a Copy type. In other cases you can use an index into a slice for a shorter period. In other cases you can have a multi-phase algorithm.

If you do need to resort to unsafe code, then try your best to hide it in a small area and expose safe interfaces.

Above all, many common problems have been asked about (many times) before:

查看更多
登录 后发表回答