Destruction order involving temporaries in Rust

Posted 2019-06-15 08:30

Question:

In C++ (please correct me if wrong), a temporary bound via constant reference is supposed to outlive the expression it is bound to. I assumed the same was true in Rust, but I get two different behaviors in two different cases.

Consider:

struct A;
impl Drop for A { fn drop(&mut self) { println!("Drop A.") } }

struct B(*const A);
impl Drop for B { fn drop(&mut self) { println!("Drop B.") } }

fn main() {
    let _ = B(&A as *const A); // B is dropped at the end of this statement.
}

The output is:

Drop B.
Drop A.

This is what you would expect. But now if you do:

fn main() {
    let _b = B(&A as *const A); // _b will be dropped when scope exits main()
}

The output is:

Drop A.
Drop B.

This is not what I expected.

Is this intended and if so then what is the rationale for the difference in behavior in the two cases?

I am using Rust 1.12.1.

Answer 1:

Temporaries are dropped at the end of the statement, just like in C++. However, IIRC, the order of destruction in Rust was unspecified at the time of writing (we'll see the consequences of this below), though the current implementation seems to simply drop values in reverse order of construction.

There's a big difference between let _ = x; and let _b = x;. _ isn't an identifier in Rust: it's a wildcard pattern. Since this pattern doesn't bind any variables, the final value is effectively dropped at the end of the statement.

On the other hand, _b is an identifier, so the value is bound to a variable with that name, which extends its lifetime until the end of the function. However, the A instance is still a temporary, so it will be dropped at the end of the statement (and I believe C++ would do the same). Since the end of the statement comes before the end of the function, the A instance is dropped first, and the B instance is dropped second.

To make this clearer, let's add another statement in main:

fn main() {
    let _ = B(&A as *const A);
    println!("End of main.");
}

This produces the following output:

Drop B.
Drop A.
End of main.

So far so good. Now let's try the same program with let _b instead of let _; the output is:

Drop A.
End of main.
Drop B.

As we can see, Drop B is printed after End of main.. This demonstrates that the B instance is alive until the end of the function, explaining the different destruction order.
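The two orderings can also be checked programmatically instead of being read off the terminal. The following is only a sketch under a few assumptions: the thread-local LOG, the record and take_log helpers, and the make_b function are hypothetical names that do not appear in the question. make_b stands in for the tuple-struct constructor so that the argument is evaluated as part of an ordinary function call, whose argument temporaries are dropped at the end of the enclosing statement.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread event log used instead of println!.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn record(event: &'static str) {
    LOG.with(|log| log.borrow_mut().push(event));
}

// Returns the recorded events and leaves the log empty.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

struct A;
impl Drop for A {
    fn drop(&mut self) { record("Drop A."); }
}

#[allow(dead_code)]
struct B(*const A);
impl Drop for B {
    fn drop(&mut self) { record("Drop B."); }
}

// Hypothetical helper: a plain function call wrapping the constructor.
fn make_b(a: *const A) -> B {
    B(a)
}

fn with_wildcard() -> Vec<&'static str> {
    {
        // `_` binds nothing, so the B value dies with this statement.
        let _ = make_b(&A as *const A);
        record("End of scope.");
    }
    take_log()
}

fn with_binding() -> Vec<&'static str> {
    {
        // `_b` is a real variable, so the B value lives until the block ends.
        let _b = make_b(&A as *const A);
        record("End of scope.");
    }
    take_log()
}

fn main() {
    for event in with_wildcard() { println!("{}", event); }
    println!("---");
    for event in with_binding() { println!("{}", event); }
}
```

In both functions the A temporary is gone before "End of scope." is recorded; only the position of "Drop B." changes.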


Now, let's see what happens if we modify B to take a borrowed pointer with a lifetime instead of a raw pointer. Actually, let's go a step further and remove the Drop implementations for a moment:

struct A;
struct B<'a>(&'a A);

fn main() {
    let _ = B(&A);
}

This compiles fine. Behind the scenes, Rust assigns the same lifetime to both the A instance and the B instance (i.e. if we took a reference to the B instance, its type would be &'a B<'a> where both 'a are the exact same lifetime). When two values have the same lifetime, then necessarily we need to drop one of them before the other, and as mentioned above, the order is unspecified. What happens if we add back the Drop implementations?

struct A;
impl Drop for A { fn drop(&mut self) { println!("Drop A.") } }

struct B<'a>(&'a A);
impl<'a> Drop for B<'a> { fn drop(&mut self) { println!("Drop B.") } }

fn main() {
    let _ = B(&A);
}

Now we're getting a compiler error:

error: borrowed value does not live long enough
 --> <anon>:8:16
  |
8 |     let _ = B(&A);
  |                ^ does not live long enough
  |
note: reference must be valid for the destruction scope surrounding statement at 8:4...
 --> <anon>:8:5
  |
8 |     let _ = B(&A);
  |     ^^^^^^^^^^^^^^
note: ...but borrowed value is only valid for the statement at 8:4
 --> <anon>:8:5
  |
8 |     let _ = B(&A);
  |     ^^^^^^^^^^^^^^
help: consider using a `let` binding to increase its lifetime
 --> <anon>:8:5
  |
8 |     let _ = B(&A);
  |     ^^^^^^^^^^^^^^

Since both the A instance and the B instance have been assigned the same lifetime, Rust cannot reason about the destruction order of these objects. The error comes from the fact that Rust refuses to instantiate B<'a> with the lifetime of the object itself when B<'a> implements Drop (this rule was added as the result of RFC 769 before Rust 1.0). If it were allowed, drop would be able to access values that have already been dropped! However, if B<'a> doesn't implement Drop, then it's allowed, because we know that no code will try to access B's fields when the struct is dropped.
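The compiler's hint ("consider using a `let` binding") points at the usual way out: give the A value its own variable, declared before the B value, so that it strictly outlives it. A minimal sketch of that, with some assumptions: the name field on A, the thread-local LOG and the record, take_log and drop_order helpers are illustrative additions that are not part of the original code.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread event log used instead of println!.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn record(event: String) {
    LOG.with(|log| log.borrow_mut().push(event));
}

// Returns the recorded events and leaves the log empty.
fn take_log() -> Vec<String> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

// The `name` field is an illustrative addition so that B's destructor
// has something to read through its reference.
struct A {
    name: &'static str,
}
impl Drop for A {
    fn drop(&mut self) {
        record(format!("Drop A ({}).", self.name));
    }
}

struct B<'a>(&'a A);
impl<'a> Drop for B<'a> {
    fn drop(&mut self) {
        // Reading through the reference is fine here: `a` below is
        // declared before `_b`, so it is dropped after `_b`.
        record(format!("Drop B (still sees {}).", self.0.name));
    }
}

fn drop_order() -> Vec<String> {
    {
        let a = A { name: "a" };
        let _b = B(&a);
    } // locals are dropped in reverse declaration order: `_b`, then `a`
    take_log()
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```

Because a is a named local declared first, its lifetime strictly contains that of _b, which is what the destructor check asks for when B<'a> implements Drop.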



Answer 2:

Raw pointers themselves do not carry any sort of lifetime so the compiler might do something like this:

  1. Example:

    • B is created (so that it can hold an *const A in it)
    • A is created
    • B is not bound to a binding and thus gets dropped
    • A is not needed and thus gets dropped

Let's check out the MIR:

fn main() -> () {
    let mut _0: ();                      // return pointer
    let mut _1: B;
    let mut _2: *const A;
    let mut _3: *const A;
    let mut _4: &A;
    let mut _5: &A;
    let mut _6: A;
    let mut _7: ();

    bb0: {
        StorageLive(_1);                 // scope 0 at <anon>:8:13: 8:30
        StorageLive(_2);                 // scope 0 at <anon>:8:15: 8:29
        StorageLive(_3);                 // scope 0 at <anon>:8:15: 8:17
        StorageLive(_4);                 // scope 0 at <anon>:8:15: 8:17
        StorageLive(_5);                 // scope 0 at <anon>:8:15: 8:17
        StorageLive(_6);                 // scope 0 at <anon>:8:16: 8:17
        _6 = A::A;                       // scope 0 at <anon>:8:16: 8:17
        _5 = &_6;                        // scope 0 at <anon>:8:15: 8:17
        _4 = &(*_5);                     // scope 0 at <anon>:8:15: 8:17
        _3 = _4 as *const A (Misc);      // scope 0 at <anon>:8:15: 8:17
        _2 = _3;                         // scope 0 at <anon>:8:15: 8:29
        _1 = B::B(_2,);                  // scope 0 at <anon>:8:13: 8:30
        drop(_1) -> bb1;                 // scope 0 at <anon>:8:31: 8:31
    }

    bb1: {
        StorageDead(_1);                 // scope 0 at <anon>:8:31: 8:31
        StorageDead(_2);                 // scope 0 at <anon>:8:31: 8:31
        StorageDead(_3);                 // scope 0 at <anon>:8:31: 8:31
        StorageDead(_4);                 // scope 0 at <anon>:8:31: 8:31
        StorageDead(_5);                 // scope 0 at <anon>:8:31: 8:31
        drop(_6) -> bb2;                 // scope 0 at <anon>:8:31: 8:31
    }

    bb2: {
        StorageDead(_6);                 // scope 0 at <anon>:8:31: 8:31
        _0 = ();                         // scope 0 at <anon>:7:11: 9:2
        return;                          // scope 0 at <anon>:9:2: 9:2
    }
}

As we can see, drop(_1) (the B value) is indeed called before drop(_6) (the A temporary), as presumed, which gives the output above.

  2. Example:

In this example B is bound to a binding

  • B is created (for the same reason as above)
  • A is created
  • A is not bound and gets dropped
  • B goes out of scope and gets dropped

The corresponding MIR:

fn main() -> () {
    let mut _0: ();                      // return pointer
    scope 1 {
        let _1: B;                       // "b" in scope 1 at <anon>:8:9: 8:10
    }
    let mut _2: *const A;
    let mut _3: *const A;
    let mut _4: &A;
    let mut _5: &A;
    let mut _6: A;
    let mut _7: ();

    bb0: {
        StorageLive(_1);                 // scope 0 at <anon>:8:9: 8:10
        StorageLive(_2);                 // scope 0 at <anon>:8:15: 8:29
        StorageLive(_3);                 // scope 0 at <anon>:8:15: 8:17
        StorageLive(_4);                 // scope 0 at <anon>:8:15: 8:17
        StorageLive(_5);                 // scope 0 at <anon>:8:15: 8:17
        StorageLive(_6);                 // scope 0 at <anon>:8:16: 8:17
        _6 = A::A;                       // scope 0 at <anon>:8:16: 8:17
        _5 = &_6;                        // scope 0 at <anon>:8:15: 8:17
        _4 = &(*_5);                     // scope 0 at <anon>:8:15: 8:17
        _3 = _4 as *const A (Misc);      // scope 0 at <anon>:8:15: 8:17
        _2 = _3;                         // scope 0 at <anon>:8:15: 8:29
        _1 = B::B(_2,);                  // scope 0 at <anon>:8:13: 8:30
        StorageDead(_2);                 // scope 0 at <anon>:8:31: 8:31
        StorageDead(_3);                 // scope 0 at <anon>:8:31: 8:31
        StorageDead(_4);                 // scope 0 at <anon>:8:31: 8:31
        StorageDead(_5);                 // scope 0 at <anon>:8:31: 8:31
        drop(_6) -> [return: bb3, unwind: bb2]; // scope 0 at <anon>:8:31: 8:31
    }

    bb1: {
        resume;                          // scope 0 at <anon>:7:1: 9:2
    }

    bb2: {
        drop(_1) -> bb1;                 // scope 0 at <anon>:9:2: 9:2
    }

    bb3: {
        StorageDead(_6);                 // scope 0 at <anon>:8:31: 8:31
        _0 = ();                         // scope 1 at <anon>:7:11: 9:2
        drop(_1) -> bb4;                 // scope 0 at <anon>:9:2: 9:2
    }

    bb4: {
        StorageDead(_1);                 // scope 0 at <anon>:9:2: 9:2
        return;                          // scope 0 at <anon>:9:2: 9:2
    }
}

As we can see, drop(_6) (the A temporary) does get called before drop(_1) (the B value, now bound to a variable), so we get the behavior you have seen.
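For readers who find MIR hard to follow, the two listings can be approximated in surface Rust by naming the temporary and making the drops explicit. This is a rough hand-written analogy, not what the compiler literally generates; the function names, the thread-local LOG and the record and take_log helpers are hypothetical.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread event log used instead of println!.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn record(event: &'static str) {
    LOG.with(|log| log.borrow_mut().push(event));
}

// Returns the recorded events and leaves the log empty.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

struct A;
impl Drop for A {
    fn drop(&mut self) { record("Drop A."); }
}

#[allow(dead_code)]
struct B(*const A);
impl Drop for B {
    fn drop(&mut self) { record("Drop B."); }
}

// Approximation of the first listing (`let _ = ...`).
fn first_example_by_hand() -> Vec<&'static str> {
    let a = A;                  // _6 = A
    let b = B(&a as *const A);  // _1 = B(_2)
    drop(b);                    // drop(_1)
    drop(a);                    // drop(_6)
    take_log()
}

// Approximation of the second listing (`let _b = ...`).
fn second_example_by_hand() -> Vec<&'static str> {
    {
        let _b;                      // StorageLive(_1)
        {
            let a = A;               // _6 = A
            _b = B(&a as *const A);  // _1 = B(_2)
        }                            // drop(_6): `a` dies with the inner block
        // The raw pointer inside `_b` now dangles, but it is never dereferenced.
    }                                // drop(_1): `_b` dies with the outer block
    take_log()
}

fn main() {
    println!("{:?}", first_example_by_hand());
    println!("{:?}", second_example_by_hand());
}
```

The inner block in the second function plays the role of the `let` statement: everything that was only a temporary in the original code dies when it ends, while the bound value survives until the enclosing scope exits.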