I have a set of objects that need to know about each other to cooperate. These objects are stored in a container. I'm trying to get a very simple idea of how to architect my code in Rust.
Let's use an analogy. A Computer contains:
- 1 Mmu
- 1 Ram
- 1 Processor
In Rust:
struct Computer {
    mmu: Mmu,
    ram: Ram,
    cpu: Cpu,
}
For anything to work, the Cpu needs to know about the Mmu it is linked to, and the Mmu needs to know the Ram it is linked to.
I do not want the Cpu to aggregate the Mmu by value. Their lifetimes differ: the Mmu can live its own life by itself. It just happens that I can plug it into the Cpu. However, there is no sense in creating a Cpu without an Mmu attached to it, since it would not be able to do its job. The same relation exists between Mmu and Ram.
Therefore:
- A Ram can live by itself.
- An Mmu needs a Ram.
- A Cpu needs an Mmu.
How can I model that kind of design in Rust, one with a struct whose fields know about each other?
In C++, it would be along the lines of:
struct Ram
{
};
struct Mmu
{
    Ram& ram;
    Mmu(Ram& r) : ram(r) {}
};
struct Cpu
{
    Mmu& mmu;
    Cpu(Mmu& m) : mmu(m) {}
};
struct Computer
{
    Ram ram;
    Mmu mmu;
    Cpu cpu;
    Computer() : ram(), mmu(ram), cpu(mmu) {}
};
Here is how I started translating that in Rust:
struct Ram;
struct Mmu<'a> {
    ram: &'a Ram,
}
struct Cpu<'a> {
    mmu: &'a Mmu<'a>,
}
impl Ram {
    fn new() -> Ram {
        Ram
    }
}
impl<'a> Mmu<'a> {
    fn new(ram: &'a Ram) -> Mmu<'a> {
        Mmu {
            ram: ram
        }
    }
}
impl<'a> Cpu<'a> {
    fn new(mmu: &'a Mmu) -> Cpu<'a> {
        Cpu {
            mmu: mmu,
        }
    }
}
fn main() {
    let ram = Ram::new();
    let mmu = Mmu::new(&ram);
    let cpu = Cpu::new(&mmu);
}
That is fine and all, but now I just can't find a way to create the Computer struct.
I started with:
struct Computer<'a> {
    ram: Ram,
    mmu: Mmu<'a>,
    cpu: Cpu<'a>,
}
impl<'a> Computer<'a> {
    fn new() -> Computer<'a> {
        // Cannot do that, since struct fields are not accessible from the initializer
        Computer {
            ram: Ram::new(),
            mmu: Mmu::new(&ram),
            cpu: Cpu::new(&mmu),
        }
        // Of course cannot do that, since local variables won't live long enough
        let ram = Ram::new();
        let mmu = Mmu::new(&ram);
        let cpu = Cpu::new(&mmu);
        Computer {
            ram: ram,
            mmu: mmu,
            cpu: cpu,
        }
    }
}
Okay, whatever, I won't be able to find a way to make structure fields reference each other. I thought I could come up with something by creating the Ram, Mmu and Cpu on the heap and putting them inside the struct:
struct Computer<'a> {
    ram: Box<Ram>,
    mmu: Box<Mmu<'a>>,
    cpu: Box<Cpu<'a>>,
}
impl<'a> Computer<'a> {
    fn new() -> Computer<'a> {
        let ram = Box::new(Ram::new());
        //                           V-- ERROR: reference must be valid for the lifetime 'a
        let mmu = Box::new(Mmu::new(&*ram));
        let cpu = Box::new(Cpu::new(&*mmu));
        Computer {
            ram: ram,
            mmu: mmu,
            cpu: cpu,
        }
    }
}
Yeah, that's right: at this point Rust has no way to know that I'm going to transfer ownership of the value created by let ram = Box::new(Ram::new()) to the Computer, so that it would get a lifetime of 'a.
I've been trying various more or less hackish ways to get that right, but I just can't come up with a clean solution. The closest I've come is to drop the reference and use an Option, but then all my methods have to check whether the Option is Some or None, which is rather ugly.
I think I'm just on the wrong track here, trying to map what I would do in C++ onto Rust, which doesn't work. That's why I need help finding the idiomatic Rust way of creating this architecture.
In this answer I will discuss two approaches to solving this problem, one in safe Rust with zero dynamic allocation and very little runtime cost, but which can be constricting, and one with dynamic allocation that uses unsafe invariants.
The Safe Way (Cell<Option<&'a T>>)
Contrary to popular belief, self-references are in fact possible in safe Rust, and even better, when you use them Rust will continue to enforce memory safety for you.
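As a minimal sketch of what this can look like (the bytes field, the read and load helper methods, and the sample values are assumptions made for illustration, not the answer's original code), the struct is first created with empty Cells and then wired up by a separate freeze method:

```rust
use std::cell::Cell;

// Hypothetical payload so the example has something to read.
struct Ram {
    bytes: [u8; 4],
}

struct Mmu<'a> {
    ram: Cell<Option<&'a Ram>>,
}

struct Cpu<'a> {
    mmu: Cell<Option<&'a Mmu<'a>>>,
}

impl<'a> Mmu<'a> {
    // Panics if the Computer was used before freeze() was called.
    fn read(&self, addr: usize) -> u8 {
        self.ram.get().expect("freeze() was not called").bytes[addr]
    }
}

impl<'a> Cpu<'a> {
    fn load(&self, addr: usize) -> u8 {
        self.mmu.get().expect("freeze() was not called").read(addr)
    }
}

struct Computer<'a> {
    ram: Ram,
    mmu: Mmu<'a>,
    cpu: Cpu<'a>,
}

impl<'a> Computer<'a> {
    // Step 1: create everything, leaving the links unset.
    fn new() -> Computer<'a> {
        Computer {
            ram: Ram { bytes: [10, 20, 30, 40] },
            mmu: Mmu { ram: Cell::new(None) },
            cpu: Cpu { mmu: Cell::new(None) },
        }
    }

    // Step 2: borrow self for the whole lifetime 'a and fill in the links.
    fn freeze(&'a self) {
        self.mmu.ram.set(Some(&self.ram));
        self.cpu.mmu.set(Some(&self.mmu));
    }
}

fn main() {
    let computer = Computer::new();
    computer.freeze();
    println!("{}", computer.cpu.load(2));
}
```

Once computer.freeze() has run, the shared borrow lasts as long as the Computer itself, so the compiler rejects any later attempt to move it or borrow it mutably.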
The main "hack" needed to get self, recursive, or cyclical references using &'a T is the use of a Cell<Option<&'a T>> to contain the reference. You won't be able to do this without the Cell<Option<T>> wrapper.
The clever bit of this solution is splitting initial creation of the struct from proper initialization. This has the unfortunate downside that it's possible to use this struct incorrectly by creating it and using it before calling freeze, but it can't result in memory unsafety without further usage of unsafe.
The initial creation of the struct only sets the stage for our later hackery: it creates the Ram, which has no dependencies, and sets the Cpu and Mmu to their unusable state, containing Cell::new(None) instead of the references they need.
Then, we call the freeze method, which deliberately holds a borrow of self with lifetime 'a, or the full lifetime of the struct. After we call this method, the compiler will prevent us from getting mutable references to the Computer or moving the Computer, as either could invalidate the reference that we are holding. The freeze method then sets up the Cpu and Mmu appropriately by setting the Cells to contain Some(&self.mmu) or Some(&self.ram) respectively.
After freeze is called our struct is ready to be used, but only immutably.
The Unsafe Way (Box<T> never moves T)
NOTE: This solution is not completely safe given an unrestricted interface to Computer; care must be taken not to allow aliasing or removal of the Mmu or Ram in the public interface of Computer.
This solution instead uses the invariant that data stored inside of a Box will never move (its address will never change) as long as the Box remains alive. Rust doesn't allow you to depend on this in safe code, since moving a Box can cause the memory behind it to be deallocated, thereby leaving a dangling pointer, but we can rely on it in unsafe code.
The main trick in this solution is to use raw pointers into the contents of the Box<Mmu> and Box<Ram> to store references into them in the Cpu and Mmu respectively. This gets you a mostly safe interface, and doesn't prevent you from moving the Computer around or even mutating it in restricted cases.
An Ending Note
All of this said, I don't think either of these should really be the way you approach this problem. Ownership is a central concept in Rust, and it permeates the design choices of almost all code. If the Mmu owns the Ram and the Cpu owns the Mmu, that's the relationship you should have in your code. If you use Rc, you can even maintain the ability to share the underlying pieces, albeit immutably.
I'd suggest adding an exploder (a term I just made up). It's a function that consumes the value and returns all the constituent parts:
Here, we go ahead and let the CPU have an MMU by value. Then, when we are done with the CPU, we break it down into its component parts, which we can then reuse.