rust - How does Rust know whether to run the destructor during stack unwinding?

Question

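The documentation of mem::uninitialized points out why that function is dangerous/unsafe to use: calling drop on uninitialized memory is undefined behavior.

So I believe this code should be undefined: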

let a: TypeWithDrop = unsafe { mem::uninitialized() };
panic!("=== Testing ==="); // Destructor of `a` will be run (U.B)

However, I wrote this piece of code, which works in safe Rust and does not seem to suffer from undefined behavior:

#![feature(conservative_impl_trait)]

trait T {
    fn disp(&mut self);
}

struct A;
impl T for A {
    fn disp(&mut self) { println!("=== A ==="); }
}
impl Drop for A {
    fn drop(&mut self) { println!("Dropping A"); }
}

struct B;
impl T for B {
    fn disp(&mut self) { println!("=== B ==="); }
}
impl Drop for B {
    fn drop(&mut self) { println!("Dropping B"); }
}

fn foo() -> impl T { return A; }
fn bar() -> impl T { return B; }

fn main() {
    let mut a;
    let mut b;

    let i = 10;
    let t: &mut T = if i % 2 == 0 {
        a = foo();
        &mut a
    } else {
        b = bar();
        &mut b
    };

    t.disp();
    panic!("=== Test ===");
}

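It seems to always execute the right destructor while ignoring the other one. If I try to use a or b (for example a.disp() instead of t.disp()), the compiler correctly errors out, saying that I might be using uninitialized memory. What surprised me is that while panicking, it always runs the right destructor (printing the expected string), no matter what the value of i is.

How does this happen? If the runtime can determine which destructor to run, should the part saying that memory must be initialized for types that implement Drop be removed from the documentation of mem::uninitialized() linked above?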

score 21 · Accepted Answer

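Using drop flags.

Rust (up to and including version 1.12) stores a boolean flag in every value whose type implements Drop (and thus increases that type's size by one byte). That flag decides whether to run the destructor. So when you do b = bar(), it sets the flag for the variable b, and thus only b's destructor runs. The same applies, vice versa, to a.

Note that starting with Rust 1.13 (the beta compiler at the time of writing), that flag is not stored in the type but on the stack, one per variable or temporary. This was made possible by the advent of MIR in the Rust compiler. MIR significantly simplifies the translation of Rust code to machine code, which is what allowed the drop flags to be moved to the stack. Optimizations will usually eliminate the flag if they can work out at compile time which object will be dropped when.

You can "observe" this flag in Rust compilers up to version 1.12 by looking at the size of the type: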

struct A;

struct B;

impl Drop for B {
    fn drop(&mut self) {}
}

fn main() {
    println!("{}", std::mem::size_of::<A>());
    println!("{}", std::mem::size_of::<B>());
}

This prints 0 and 1 before stack flags, and 0 and 0 with stack flags.

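Using mem::uninitialized is still unsafe, however, because the compiler still sees the assignment to the variable a and sets the drop flag. The destructor will therefore be called on uninitialized memory. Note that in your example the Drop impl does not access any memory of your type (except for the drop flag, but that is invisible to you). You are therefore not accessing the uninitialized memory (which is zero bytes in size anyway, since your type is a zero-sized struct). To the best of my knowledge, that means your unsafe { std::mem::uninitialized() } code is actually safe, because no memory unsafety can occur afterwards.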
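The question's experiment relies on a feature gate and trait-object syntax from 2016, so here is a rough sketch of the same idea for a current stable compiler. It assumes the default unwinding panic strategy (not panic=abort). The names `Noisy`, `Log` and `drops_during_unwind` are invented for this illustration. `std::panic::catch_unwind` is used only so that the program can inspect which destructors ran while the stack was unwinding.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};
use std::rc::Rc;

// Shared log recording which destructors ran (hypothetical helper type).
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Initializes exactly one of `a` / `b`, panics, and reports what was dropped
// during unwinding.
fn drops_during_unwind(pick_a: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let inner = log.clone();
    let outcome = panic::catch_unwind(AssertUnwindSafe(move || {
        let a;
        let b;
        let chosen: &Noisy = if pick_a {
            a = Noisy { name: "a", log: inner.clone() };
            &a
        } else {
            b = Noisy { name: "b", log: inner.clone() };
            &b
        };
        assert!(!chosen.name.is_empty());
        panic!("=== Test ===");
    }));
    // The closure always panics, so the result is an error.
    assert!(outcome.is_err());
    let recorded = log.borrow().clone();
    recorded
}

fn main() {
    // The panic messages themselves go to stderr.
    println!("{:?}", drops_during_unwind(true));  // ["a"]
    println!("{:?}", drops_during_unwind(false)); // ["b"]
}
```

In both cases only the variable that was actually assigned has its destructor run during unwinding, which matches what the question observed.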

score 18 · Accepted Answer

There are two questions hidden here:

How does the compiler track which variable is initialized or not?
Why may initializing with mem::uninitialized() lead to Undefined Behavior?

Let's tackle them in order.

How does the compiler track which variable is initialized or not?

The compiler injects so-called "drop flags": for each variable for which Drop must run at the end of the scope, a boolean flag is injected on the stack, stating whether this variable needs to be disposed of.

The flag starts off "no", moves to "yes" if the variable is initialized, and back to "no" if the variable is moved from.

Finally, when the time comes to drop this variable, the flag is checked and the variable is dropped if necessary.

This is unrelated to whether the compiler's flow analysis complains about potentially uninitialized variables: code is only generated once the flow analysis is satisfied.
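As a rough illustration of the flag transitions described above ("no" until initialized, "yes" afterwards, back to "no" once moved from), here is a minimal sketch in safe Rust. The `Noisy` type, the `Log` alias and the two helper functions are made up for this example; they only record the order in which destructors run. Whether an actual runtime flag is emitted is up to the compiler, which may optimize it away.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log recording which destructors ran (hypothetical helper type).
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Only one of `a` / `b` is ever initialized, and only that one is dropped.
fn conditional_init(pick_a: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let a;
        let b;
        let chosen: &Noisy = if pick_a {
            a = Noisy { name: "a", log: log.clone() };
            &a
        } else {
            b = Noisy { name: "b", log: log.clone() };
            &b
        };
        assert!(!chosen.name.is_empty());
    } // scope ends: the initialized variable is dropped, the other is skipped
    let recorded = log.borrow().clone();
    recorded
}

// `x` is always initialized, but may be moved out before the scope ends.
fn conditional_move(move_it: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let x = Noisy { name: "x", log: log.clone() };
        if move_it {
            drop(x); // `x` is moved into `drop`, which runs the destructor now
        }
        log.borrow_mut().push("end of scope");
    } // `x` is dropped here only if it was not moved above
    let recorded = log.borrow().clone();
    recorded
}

fn main() {
    println!("{:?}", conditional_init(true));  // ["a"]
    println!("{:?}", conditional_init(false)); // ["b"]
    println!("{:?}", conditional_move(true));  // ["x", "end of scope"]
    println!("{:?}", conditional_move(false)); // ["end of scope", "x"]
}
```

In `conditional_move`, whether `x` still needs dropping at the end of the scope is only known at run time, which is exactly the situation a drop flag is for.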

Why may initializing with mem::uninitialized() lead to Undefined Behavior?

When using mem::uninitialized() you make a promise to the compiler: don't worry, I'm definitely initializing this.

As far as the compiler is concerned, the variable is therefore fully initialized, and the drop flag is set to "yes" (until you move out of it).

This, in turn, means that Drop will be called.

Using an uninitialized object is Undefined Behavior, and the compiler calling Drop on an uninitialized object on your behalf counts as "using it".
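For comparison, on current Rust mem::uninitialized() is deprecated in favour of std::mem::MaybeUninit, which makes the "promise" explicit: the compiler never runs the destructor of the wrapped value until you call assume_init(). The following is a small sketch of that behaviour, not part of the original answer; the `Tracked` type and both function names are invented for this example.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::rc::Rc;

// Counts how many times its destructor has run (hypothetical helper type).
struct Tracked {
    drops: Rc<Cell<usize>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// The slot is written but never declared initialized: no destructor runs,
// and the value is leaked instead.
fn written_but_not_assumed() -> usize {
    let drops = Rc::new(Cell::new(0usize));
    {
        let mut slot: MaybeUninit<Tracked> = MaybeUninit::uninit();
        slot.write(Tracked { drops: drops.clone() });
    } // dropping a `MaybeUninit<Tracked>` does not call `Tracked::drop`
    drops.get()
}

// After `assume_init` the value is an ordinary `Tracked` and is dropped as usual.
fn written_and_assumed() -> usize {
    let drops = Rc::new(Cell::new(0usize));
    {
        let mut slot: MaybeUninit<Tracked> = MaybeUninit::uninit();
        slot.write(Tracked { drops: drops.clone() });
        // SAFETY: `slot` was fully initialized by the `write` call above.
        let value: Tracked = unsafe { slot.assume_init() };
        assert_eq!(value.drops.get(), 0);
    } // `value` goes out of scope here and its destructor runs
    drops.get()
}

fn main() {
    println!("{}", written_but_not_assumed()); // 0
    println!("{}", written_and_assumed());     // 1
}
```

Calling assume_init() on a slot that was never written would bring back the same Undefined Behavior described above; the wrapper only moves the promise to a single, visible call site.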

Bonus:

In my tests, nothing weird happened!

Note that Undefined Behavior means that anything can happen; anything, unfortunately, also includes "seems to work" (or even "works as intended despite the odds").

In particular, if you do NOT access the object's memory in Drop::drop (just printing), then it's very likely that everything will just work. If you do access it, however, you might see weird integers, pointers pointing into the wild, etc...

And if the optimizer is clever, even without accessing it, it might do weird things! Since we are using LLVM, I invite you to read "What Every C Programmer Should Know About Undefined Behavior" by Chris Lattner (LLVM's father).

score 3 · Accepted Answer

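First, there are drop flags: runtime information for tracking which variables have been initialized. If a variable was never assigned to, drop() will not be executed for it.

In stable, the drop flag is currently stored within the type itself. Writing uninitialized memory to it can lead to undefined behavior as to whether drop() will or will not be called. This will soon be outdated information, because the drop flag has been moved out of the type itself in nightly.

In nightly Rust, if you assign uninitialized memory to a variable, it is safe to assume that drop() will be executed for it. However, any useful implementation of drop() will operate on the value. There is no way to detect within the Drop trait implementation whether the value was properly initialized: it could end up trying to free an invalid pointer or doing any other random thing, depending on the type's Drop implementation. Assigning uninitialized memory to a type that implements Drop is ill-advised anyway.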
