Undefined behavior causing time travel

Posted 2019-01-19 18:33

One example in this article from an MSDN blog puzzled me:

It says that this function:

void unwitting(bool door_is_open)
{
 if (door_is_open) {
  walk_on_in();
 } else {
  ring_bell();

  // wait for the door to open using the fallback value
  fallback = value_or_fallback(nullptr);
  wait_for_door_to_open(fallback);
 }
}

can be optimized into this one:

void unwitting(bool door_is_open)
{
    walk_on_in();
}

Because calling value_or_fallback(nullptr) is undefined behavior (this is proven earlier in the article).

Now what I don’t understand is this: the runtime enters undefined behavior only when it reaches that line. Shouldn’t the happens-before / happens-after concept apply here, in the sense that all observable effects of the preceding statements have to be resolved before the runtime enters UB?

3 answers
三岁会撩人
#2 · 2019-01-19 18:47

It's true that undefined behaviour may happen only at runtime (e.g. dereferencing a pointer which happens to be null). Other times, a program may statically be "ill-formed, no diagnostic required" (e.g. if you add an explicit specialization for a template after it has already been used), which has the same effect, though: You cannot argue from within the language how your program will behave.

Compilers can use UB to "optimize" code generation aggressively. In your case, the compiler sees that the second branch will cause UB (I assume that this is known statically, even though you didn't spell it out), and so it can assume further that that branch is never taken, since that's indistinguishable: If you did enter the second branch, then the behaviour would be undefined, and that includes behaving like you entered the first branch. So the compiler can simply consider the entire code path that leads to UB as dead and remove it.

There's no way for you to prove that something is wrong.
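To make the assumption above concrete, here is a minimal sketch of what the full example might look like. The helper bodies and the event_log vector are hypothetical stand-ins (the question does not show them), and value_or_fallback is assumed to dereference its argument before checking it, which is what would make the nullptr call undefined:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the helpers the question does not show.
std::vector<std::string> event_log;

void walk_on_in() { event_log.push_back("walk_on_in"); }
void ring_bell() { event_log.push_back("ring_bell"); }
void wait_for_door_to_open(int) { event_log.push_back("wait"); }

// Assumed shape: the pointer is dereferenced before it is checked,
// so passing nullptr is undefined behaviour.
int value_or_fallback(const int* p) {
    int value = *p;         // UB when p == nullptr
    return p ? value : 42;  // the check comes too late to help
}

void unwitting(bool door_is_open) {
    if (door_is_open) {
        walk_on_in();
    } else {
        ring_bell();
        // Every execution that reaches this call has undefined behaviour.
        int fallback = value_or_fallback(nullptr);
        wait_for_door_to_open(fallback);
    }
}

// Only the defined path is exercised here. Calling unwitting(false)
// would make the whole execution undefined, ring_bell() included.
void demo_defined_path() {
    event_log.clear();
    unwitting(true);
    assert(event_log.size() == 1 && event_log[0] == "walk_on_in");
}
```

The only call whose outcome you can reason about is unwitting(true). For unwitting(false) nothing is guaranteed, not even that ring_bell() runs first, so a compiler that drops the else branch cannot be caught out by any valid program.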

#3 · 2019-01-19 18:57

There is a flaw in the reasoning.

When a compiler writer says: we use Undefined Behavior to optimize a program, there are two different interpretations:

  • most people hear: we identify Undefined Behavior and decide we can do whatever we want (*)
  • the compiler writer meant: we assume Undefined Behavior does not occur

Thus, in your case:

  • dereferencing a nullptr is Undefined Behavior
  • thus executing value_or_fallback(nullptr) is Undefined Behavior
  • thus executing the else branch is Undefined Behavior
  • thus door_is_open being false is Undefined Behavior

And since Undefined Behavior does not occur (the programmer swears she will follow the terms of use), door_is_open is necessarily true and the compiler can elide the else branch.

(*) I am slightly annoyed that Raymond Chen actually formulated it this way...
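As a rough illustration of "we assume Undefined Behavior does not occur", the sketch below puts an assumed version of value_or_fallback (dereference first, null check afterwards) next to what an optimizer would be permitted to treat it as. The second function name and the checking helper are made up for this example, and no particular compiler is claimed to produce exactly this:

```cpp
#include <cassert>

// As written (assumed shape): dereference first, null check afterwards.
int value_or_fallback(const int* p) {
    int value = *p;         // UB when p == nullptr
    return p ? value : 42;
}

// What an optimizer may treat it as: *p has already been evaluated,
// so p is assumed non-null and the fallback arm is unreachable.
int value_or_fallback_assuming_no_ub(const int* p) {
    return *p;
}

// On every input with defined behaviour the two versions agree,
// so no valid program can tell them apart.
void check_equivalence_on_defined_inputs() {
    const int samples[] = {-1, 0, 7, 42};
    for (const int& x : samples) {
        assert(value_or_fallback(&x) == value_or_fallback_assuming_no_ub(&x));
    }
}
```

The same step repeated one level up gives the list above: the call with nullptr cannot happen, so the else branch cannot happen, so door_is_open must be true.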

干净又极端
#4 · 2019-01-19 18:58

That's what happens when people try to translate common sense to a spec and then interpret the spec without the common sense. In my personal opinion this is completely wrong, but this is what is being done in the course of language standardization.

In my personal opinion, a compiler should not optimize code with undefined behavior. But the current post-modern compilers just optimize it out. And the standard permits both.

The logic behind the particular misbehavior that you mentioned is that the compiler operates on branches: if something is undefined in a branch, it marks the whole branch as having undefined behavior; and if a branch has undefined behavior, it may be replaced by anything.

The worst thing about all this is that new versions of the compiler may break (and do break) existing code -- either by refusing to compile it or by compiling it to nonsense. And "existing code" is typically a really large amount of code.
