I have read somewhere that in a language that features pointers, it is not possible for the compiler to decide fully at compile time whether all pointers are used correctly and/or are valid (refer to an alive object) for various reasons, since that would essentially constitute solving the halting problem. That is not surprising, intuitively, because in this case, we would be able to infer the runtime behavior of a program during compile-time, similarly to what's stated in this related question.
However, from what I can tell, the Rust language requires that pointer checking be done entirely at compile time (there's no undefined behavior related to pointers, "safe" pointers at least, and there's no "invalid pointer" or "null pointer" runtime exception either).
Assuming that the Rust compiler doesn't solve the halting problem, where does the fallacy lie?
- Is it the case that pointer checking isn't done entirely at compile-time, and Rust's smart pointers still introduce some runtime overhead compared to, say, raw pointers in C?
- Or is it possible that the Rust compiler can't make fully correct decisions, and it sometimes needs to Just Trust The Programmer™, probably using one of the lifetime annotations (the ones with the
<'lifetime_ident>
syntax)? In this case, does this mean that the pointer/memory safety guarantee is not 100%, and still relies on the programmer writing correct code? - Another possibility is that Rust pointers are non-"universal" or restricted in some sense, so that the compiler can infer their properties entirely during compile-time, but they are not as useful as e. g. raw pointers in C or smart pointers in C++.
- Or maybe it is something completely different and I'm misinterpreting one or more of
{ "pointer", "safety", "guaranteed", "compile-time" }
.
Most of the safety of Rust references is guaranteed by strict rules:
&
), you can clone this reference and pass it around, but not create a mutable&mut
reference out of it.&mut
) reference to an object exists, no other reference to this object can exist.'a
).So in terms of expressiveness, we are effectively more limited than when using plain raw pointers (for example, building a graph structure is not possible using only safe references), but these rules can effectively be completely checked at compile-time.
Yet, it is still possible to use raw pointers, but you have to enclose the code dealing with them in a
unsafe { /* ... */ }
block, telling to the compiler "Trust me, I know what I am doing here". That is what some special smart pointers do internally, such asRefCell
, which allows you to have these rules checked at runtime rather than compile-time, to gain expressiveness.There are special runtime-checks for things that can't be checked at compile time. These are usually found in the
cell
crate. But in general, Rust checks everything at compile time and should produce the same code as you would in C (if your C-code isn't doing undefined stuff).If the compiler cannot make the correct decision you get a compile time error telling you that the compiler cannot verify what you are doing. This might also restrict you from stuff you know is correct, but the compiler doesn't. You can always go to
unsafe
code in that case. But as you correctly assumed, then the compiler relies partly on the programmer.The compiler checks the function's implementation, to see if it does exactly what the lifetimes say it does. Then, at the call-site of the function, it checks if the programmer uses the function correctly. This is similar to type-checking. A C++ compiler checks if you are returning an object of the correct type. Then it checks at the call-site if the returned object is stored in a variable of the correct type. At no time can the programmer of a function break the promise (except if
unsafe
is used, but you can always let the compiler enforce that nounsafe
is used in your project)Rust is continuously improved. More things may get legal in Rust once the compiler becomes smarter.
There's a few things that can go wrong in C:
These don't happen in safe Rust.
Box
to allocate your objects (similar but not equal to aunique_ptr
in C++)Box
es automatically free memory.There's a few things that can go wrong in C++:
free
. You can still create dangling references:auto x = make_unique<int>(42);
auto& y = *x;
x.reset();
y = 99;
Rust fixes those:
y
exists, you may not modifyx
. This is checked at compile time and cannot be circumvented by more levels of indirection or structs.Rust doesn't prove you all pointers are used correctly. You may still write bogus programs. Rust proves that you are not using invalid pointers. Rust proves that you never have null-pointers. Rust proves that you never have two pointers to the same object, execept if all these pointers are non-mutable (const). Rust does not allow you to write any program (since that would include programs that violate memory safety). Right now Rust still prevents you from writing some useful programs, but there are plans to allow more (legal) programs to be written in safe Rust.
Revisiting the example in your referenced question about the halting problem:
The above C++ code would look one of two ways in Rust:
For an arbitrary
bar
you cannot prove this, because it might access global state. Rust soon getsconst
functions, which can be used to compute stuff at compile-time (similar toconstexpr
). Ifbar
isconst
, it becomes trivial to prove ifself.a
is set to1
at compile-time. Other than that, withoutpure
functions or other restrictions of the function content, you can never prove whetherself.a
is set to1
or not.Rust currently doesn't care whether your code is called or not. It cares whether the memory of
self.a
still exists during the assignment.self.bar()
can never destroyself
(except inunsafe
code). Thereforself.a
will always be available inside theif
branch.Disclaimer: I'm in a bit of a hurry, so this is a bit meandering. Feel free to clean it up.
The One Sneaky Trick That Language Designers Hate™ is basically this: Rust can only reason about the
'static
lifetime (used for global variables and other whole-program lifetime things) and the lifetime of stack (i.e. local) variables: it cannot express or reason about the lifetime of heap allocations.This means a few things. First of all, all of the library types that deal with heap allocations (i.e.
Box<T>
,Rc<T>
,Arc<T>
) all own the thing they point to. As a result, they don't actually need lifetimes in order to exist.Where you do need lifetimes is when you're accessing the contents of a smart pointer. For example:
What is happening behind the scenes on that second line is this:
In other words, because
Box
isn't magic, we have to tell the compiler how to turn it into a regular, run of the mill borrowed pointer. This is what theDeref
andDerefMut
traits are for. This raises the question: what, exactly, is the lifetime ofheap_ref
?The answer to this is in the definition of
DerefMut
(from memory because I'm in a hurry):Like I said before, Rust absolutely cannot talk about "heap lifetimes". Instead, it has to tie the lifetime of the heap-allocated
i32
to the only other lifetime it has on hand: the lifetime of theBox
.What this means is that "complicated" things don't have an expressible lifetime, and thus have to own the thing they manage. When you convert a complicated smart pointer/handle into a simple borrowed pointer, that is the moment that you have to introduce a lifetime, and you usually just use the lifetime of the handle itself.
Actually, I should clarify: by "lifetime of the handle", I really mean "the lifetime of the variable in which the handle is currently being stored": lifetimes are really for storage, not for values. This is typically why newcomers to Rust get tripped up when they can't work out why they can't do something like:
"But... I know that the box will continue to live on, why does the compiler say it doesn't?!" Because Rust can't reason about heap lifetimes and must resort to tying the lifetime of
&x
to the variablex
, not the heap allocation it happens to point to.