The documentation for mem::uninitialized points out why it is dangerous/unsafe to use that function: calling drop on uninitialized memory is undefined behavior.
So I believe this code should exhibit undefined behavior:
let a: TypeWithDrop = unsafe { mem::uninitialized() };
panic!("=== Testing ==="); // Destructor of `a` will be run (U.B)
However, I wrote this piece of code which works in safe Rust and does not seem to suffer from undefined behavior:
#![feature(conservative_impl_trait)]

trait T {
    fn disp(&mut self);
}

struct A;
impl T for A {
    fn disp(&mut self) { println!("=== A ==="); }
}
impl Drop for A {
    fn drop(&mut self) { println!("Dropping A"); }
}

struct B;
impl T for B {
    fn disp(&mut self) { println!("=== B ==="); }
}
impl Drop for B {
    fn drop(&mut self) { println!("Dropping B"); }
}

fn foo() -> impl T { return A; }
fn bar() -> impl T { return B; }

fn main() {
    let mut a;
    let mut b;
    let i = 10;
    let t: &mut T = if i % 2 == 0 {
        a = foo();
        &mut a
    } else {
        b = bar();
        &mut b
    };
    t.disp();
    panic!("=== Test ===");
}
It always seems to execute the right destructor, while ignoring the other one. If I try using a or b directly (like a.disp() instead of t.disp()), the compiler correctly errors out, saying I might be using possibly uninitialized memory. What surprised me is that while panicking, it always runs the right destructor (printing the expected string) no matter what the value of i is.
How does this happen? If the runtime can determine which destructor to run, should the requirement that memory must be initialized for types implementing Drop be removed from the documentation of mem::uninitialized() as linked above?
Using drop flags.
Rust (up to and including version 1.12) stores a boolean flag in every value whose type implements Drop (and thus increases that type's size by one byte). That flag decides whether to run the destructor. So when you do b = bar(), it sets the flag for the b variable, and thus only b's destructor runs. Vice versa with a.
Note that starting from Rust version 1.13 (the beta compiler at the time of this writing), that flag is no longer stored in the type but on the stack, one per variable or temporary. This was made possible by the introduction of MIR in the Rust compiler. MIR significantly simplifies the translation of Rust code to machine code, which is what allowed drop flags to move to the stack. Optimizations will usually eliminate the flag when the compiler can determine at compile time where each object will be dropped.
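To make the per-variable flag a bit more tangible, here is a minimal sketch of conditional initialization on a current compiler. The Noisy type and the dropped_names function are hypothetical names invented for this illustration; the shared log only exists so the result can be inspected instead of printed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Initializes exactly one of two variables and returns the names of
// the values whose destructors actually ran at the end of the scope.
fn dropped_names(pick_a: bool) -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let a;
        let b;
        let chosen: &Noisy = if pick_a {
            a = Noisy { name: "a", log: Rc::clone(&log) };
            &a
        } else {
            b = Noisy { name: "b", log: Rc::clone(&log) };
            &b
        };
        let _in_use = chosen.name;
        // End of scope: only the variable that was assigned gets dropped.
    }
    let names = log.borrow().clone();
    names
}

fn main() {
    println!("{:?}", dropped_names(true));
    println!("{:?}", dropped_names(false));
}
```

Whichever branch is taken, the log ends up with exactly one entry, the variable that was actually assigned. That is the same behavior the question observed with a and b.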
You can "observe" this flag in a Rust compiler up to version 1.12 by looking at the size of the type:
struct A;
struct B;

impl Drop for B {
    fn drop(&mut self) {}
}

fn main() {
    println!("{}", std::mem::size_of::<A>());
    println!("{}", std::mem::size_of::<B>());
}
prints 0 and 1 respectively before stack flags, and 0 and 0 with stack flags.
Using mem::uninitialized is still unsafe, however, because the compiler still sees the assignment to the a variable and sets the drop flag. Thus the destructor will be called on uninitialized memory. Note that in your example the Drop impl does not access any memory of your type (except for the drop flag, but that is invisible to you). Therefore you are not accessing the uninitialized memory (which is zero bytes in size anyway, since your type is a zero-sized struct). To the best of my knowledge, that means your unsafe { std::mem::uninitialized() } code is actually safe, because no memory unsafety can occur afterwards.
There are two questions hidden here:
- How does the compiler track which variable is initialized or not?
- Why may initializing with mem::uninitialized() lead to Undefined Behavior?
Let's tackle them in order.
How does the compiler track which variable is initialized or not?
The compiler injects so-called "drop flags": for each variable for which Drop must run at the end of the scope, a boolean flag is injected on the stack, stating whether this variable needs to be disposed of.
The flag starts off "no", moves to "yes" if the variable is initialized, and back to "no" if the variable is moved from.
Finally, when the time comes to drop this variable, the flag is checked and the variable is dropped if necessary.
This is unrelated to whether the compiler's flow analysis complains about potentially uninitialized variables: code is only generated once the flow analysis is satisfied.
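The "no" / "yes" / "no" life cycle can be modeled by hand. The sketch below is not what the compiler emits; it is a hypothetical Slot type (with a hypothetical Counted helper for counting destructor runs) that pairs raw storage with an explicit boolean, assuming a toolchain recent enough to have MaybeUninit::assume_init_drop and assume_init_read.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::rc::Rc;

// Hand-written model of "storage + drop flag" (illustration only).
struct Slot<T> {
    value: MaybeUninit<T>,
    live: bool, // the "drop flag": true only while `value` is initialized
}

impl<T> Slot<T> {
    fn new() -> Self {
        // The flag starts off "no".
        Slot { value: MaybeUninit::uninit(), live: false }
    }

    fn init(&mut self, v: T) {
        if self.live {
            self.live = false;
            // SAFETY: the flag was true, so `value` held an initialized T.
            unsafe { self.value.assume_init_drop() };
        }
        self.value.write(v);
        self.live = true; // "yes" once initialized
    }

    fn take(&mut self) -> Option<T> {
        if !self.live {
            return None;
        }
        self.live = false; // back to "no" once moved from
        // SAFETY: the flag was true, and it is cleared so we never drop twice.
        Some(unsafe { self.value.assume_init_read() })
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        // The flag is checked; the destructor runs only if needed.
        if self.live {
            // SAFETY: `live` implies `value` is initialized.
            unsafe { self.value.assume_init_drop() };
        }
    }
}

// Hypothetical helper: counts how many times its destructor ran.
struct Counted(Rc<Cell<u32>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns the drop count after each of three scenarios.
fn demo() -> (u32, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    {
        let _never_set: Slot<Counted> = Slot::new();
    }
    let after_empty = drops.get();
    {
        let mut s = Slot::new();
        s.init(Counted(Rc::clone(&drops)));
    }
    let after_init = drops.get();
    {
        let mut s = Slot::new();
        s.init(Counted(Rc::clone(&drops)));
        let moved = s.take();
        drop(moved);
    }
    let after_move = drops.get();
    (after_empty, after_init, after_move)
}

fn main() {
    println!("{:?}", demo());
}
```

The three numbers correspond to a never-initialized slot (no destructor call), an initialized slot (one call at scope end), and a moved-from slot (one call by the new owner, none by the slot).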
Why may initializing with mem::uninitialized() lead to Undefined Behavior?
When using mem::uninitialized() you make a promise to the compiler: don't worry, I'm definitely initializing this.
As far as the compiler is concerned, the variable is therefore fully initialized, and the drop flag is set to "yes" (until you move out of it).
This, in turn, means that Drop will be called.
Using an uninitialized object is Undefined Behavior, and the compiler calling Drop on an uninitialized object on your behalf counts as "using it".
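For readers on a current toolchain: mem::uninitialized has since been deprecated in favor of std::mem::MaybeUninit, which postdates this answer. A MaybeUninit<T> never runs T's destructor on its own, so the "promise" is only made at the point where you call assume_init. The following is a minimal sketch; Tracked and drops_observed are hypothetical names used only for this illustration.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::rc::Rc;

// Hypothetical helper: counts how many times its destructor ran.
struct Tracked(Rc<Cell<u32>>);

impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns the drop count after an uninitialized slot goes out of scope,
// and again after a properly initialized value goes out of scope.
fn drops_observed() -> (u32, u32) {
    let drops = Rc::new(Cell::new(0));
    {
        // No `Tracked` value exists here. MaybeUninit does not drop its
        // contents, so nothing touches the uninitialized bytes.
        let _slot: MaybeUninit<Tracked> = MaybeUninit::uninit();
    }
    let after_uninit = drops.get();
    {
        let mut slot = MaybeUninit::uninit();
        slot.write(Tracked(Rc::clone(&drops)));
        // SAFETY: `slot` was fully initialized by `write` just above.
        let _value: Tracked = unsafe { slot.assume_init() };
        // `_value` is an ordinary variable now; its destructor runs here.
    }
    let after_init = drops.get();
    (after_uninit, after_init)
}

fn main() {
    println!("{:?}", drops_observed());
}
```

Calling assume_init before the slot has actually been written would reintroduce exactly the problem described above, which is why that call is unsafe.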
Bonus:
In my tests, nothing weird happened!
Note that Undefined Behavior means that anything can happen; anything, unfortunately, also includes "seems to work" (or even "works as intended despite the odds").
In particular, if you do NOT access the object's memory in Drop::drop (just printing), then it's very likely that everything will just work. If you do access it, however, you might see weird integers, pointers pointing into the wild, etc.
And if the optimizer is clever, it might do weird things even without accessing it! Since we are using LLVM, I invite you to read "What Every C Programmer Should Know About Undefined Behavior" by Chris Lattner (LLVM's father).
First, there are drop flags: runtime information for tracking which variables have been initialized. If a variable was not assigned to, drop() will not be executed for it.
In stable, the drop flag is currently stored within the type itself. Writing uninitialized memory to it can cause undefined behavior as to whether drop() will or will not be called. This information will soon be out of date, because in nightly the drop flag has been moved out of the type itself.
In nightly Rust, if you assign uninitialized memory to a variable, it would be safe to assume that drop() will be executed. However, any useful implementation of drop() will operate on the value. There is no way to detect within the Drop trait implementation whether the value was properly initialized: it could end up trying to free an invalid pointer or doing any other random thing, depending on the Drop implementation of the type. Assigning uninitialized memory to a type with Drop is ill-advised anyway.