An example from Programming in Rust (PDF):
#[derive(Debug)]
enum IntOrString {
    I(isize),
    S(String),
}
fn corrupt_enum() {
    let mut s = IntOrString::S(String::new());
    match s {
        IntOrString::I(_) => (),
        IntOrString::S(ref p) => {
            s = IntOrString::I(0xdeadbeef);
            // Now p is a &String, pointing at memory
            // that is an int of our choosing!
        }
    }
}
corrupt_enum();
The compiler does not allow this:
error[E0506]: cannot assign to `s` because it is borrowed
  --> src/main.rs:13:17
   |
12 |             IntOrString::S(ref p) => {
   |                            ----- borrow of `s` occurs here
13 |                 s = IntOrString::I(0xdeadbeef);
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ assignment to borrowed `s` occurs here
But suppose it did; how is it that "Now `p` is a `&String`, pointing at memory that is an int of our choosing!" is a bad thing?
Let's make up a memory layout for the types involved. `IntOrString` will have one byte to determine which variant it is (`0` = number, `1` = string), followed by 4 bytes that will either be a number or the address of the beginning of a set of UTF-8 characters.

Let's allocate
`s` in memory at 0x100. The variant is at 0x100 and the value is at 0x101, 0x102, 0x103, 0x104. Additionally, let's say that the value holds the pointer `0xABCD`; this is where the bytes of the string live.

When the match arm
`IntOrString::S(ref p)` is used, `p` will be set to the value `0x101`: it's a reference to the value, and the value starts at 0x101. When you try to use `p`, the processor will go to the address `0x101`, read the value (an address), and then read the data from that address.

If the compiler allowed you to change
`s` at this point, then the bytes of the new data would replace the value stored at `0x101`. In the example, the "address" stored at the value would now point to somewhere arbitrary (`0xDEADBEEF`). If we tried to use the "string", we'd start reading bytes of memory that are highly unlikely to correspond to UTF-8 data.

None of this is academic; this exact kind of problem can occur in a well-formed C program. In the good cases, the program will crash. In bad cases, it's possible to read data in the program you aren't supposed to. It's even possible to inject shellcode that then gives an attacker the ability to run code they wrote inside your program.
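To make the made-up layout concrete, here is a minimal sketch that models memory as a plain byte array in safe Rust. It is only a simulation: the helper names (`write_enum`, `read_value`, `simulate`), the tag constants, and the little-endian encoding are assumptions chosen for illustration, not how the compiler actually lays out `IntOrString`.

```rust
// A toy model of the made-up layout: a flat byte array stands in for memory.
// All names and offsets here are illustrative assumptions, not real Rust layout.

const TAG_INT: u8 = 0; // variant tag for a number
const TAG_STR: u8 = 1; // variant tag for a string

/// Write a fake `IntOrString` at `base`: one tag byte, then a 4-byte value.
fn write_enum(mem: &mut [u8], base: usize, tag: u8, value: u32) {
    mem[base] = tag;
    mem[base + 1..base + 5].copy_from_slice(&value.to_le_bytes());
}

/// Read the 4-byte value stored at `addr`, as the processor would through `p`.
fn read_value(mem: &[u8], addr: usize) -> u32 {
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&mem[addr..addr + 4]);
    u32::from_le_bytes(bytes)
}

/// Returns what `p` sees as the string's address before and after `s` is overwritten.
fn simulate() -> (u32, u32) {
    let mut mem = vec![0u8; 0x200];
    let s = 0x100; // `s` lives at 0x100
    write_enum(&mut mem, s, TAG_STR, 0xABCD); // S(string whose bytes live at 0xABCD)

    let p = s + 1; // `ref p` points at the value, i.e. 0x101
    let before = read_value(&mem, p);

    write_enum(&mut mem, s, TAG_INT, 0xDEADBEEF); // s = IntOrString::I(0xdeadbeef)
    let after = read_value(&mem, p); // `p` still points at 0x101

    (before, after)
}

fn main() {
    let (before, after) = simulate();
    println!("p sees string address {:#X} before the assignment", before);
    println!("p sees string address {:#X} after the assignment", after);
}
```

In this model nothing stops `p` from being read after the overwrite, which is the situation the borrow checker rules out in real Rust.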
Note that the memory layout above is very simplified, and an actual `String` is larger and more complicated.
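As a rough check of how much larger, the sketch below prints the size of a `String` on the current target. In current standard library implementations a `String` holds a pointer, a capacity, and a length, so it typically occupies three pointer-sized words. That is an implementation detail, not a guarantee of the language, and the helper name `string_size_in_words` is made up for this example.

```rust
use std::mem::size_of;

/// Size of `String` measured in pointer-sized words on the current target.
/// The result reflects the current implementation, not a language guarantee.
fn string_size_in_words() -> usize {
    size_of::<String>() / size_of::<usize>()
}

fn main() {
    println!("size_of::<String>() = {} bytes", size_of::<String>());
    println!("that is {} pointer-sized words", string_size_in_words());
    // A reference to a String is a single thin pointer.
    println!("size_of::<&String>() = {} bytes", size_of::<&String>());
}
```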