Why is `i = ++i + 1` unspecified behavior?

Posted 2019-01-04 23:04

Consider the following C++ Standard ISO/IEC 14882:2003(E) citation (section 5, paragraph 4):

Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified. Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined. [Example:

i = v[i++];  // the behavior is unspecified 
i = 7, i++, i++;  //  i becomes 9 

i = ++i + 1;  // the behavior is unspecified 
i = i + 1;  // the value of i is incremented 

—end example]

I was surprised that i = ++i + 1 gives an undefined value of i. Does anybody know of a compiler implementation which does not give 2 for the following case?

int i = 0;
i = ++i + 1;
std::cout << i << std::endl;

The thing is that operator= has two arguments, and the first one is always a reference to i. The order of evaluation does not matter in this case. I do not see any problem except the C++ Standard's taboo.

Please do not consider cases where the order in which the arguments are evaluated matters. For example, ++i + i is obviously undefined. Please consider only my case, i = ++i + 1.

Why does the C++ Standard prohibit such expressions?

15 answers
#2 · 2019-01-04 23:40

The problem here is that the standard allows a compiler to completely reorder the evaluation within a single statement. It is not, however, allowed to reorder whole statements if doing so would change program behavior. Therefore, the expression i = ++i + 1; may be evaluated in any of the following ways:

++i;        // with i initially 0: i is now 1
i = i + 1;  // i is now 2

or

i = i + 1;  // with i initially 0: i is now 1
++i;        // i is now 2

or

i = i + 1;  ++i;  // run in parallel (using, say, an SSE instruction): i ends up as 1

This gets even worse when you have user defined types thrown in the mix, where the ++ operator can have whatever effect on the type the author of the type wants, in which case the order used in evaluation matters significantly.
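
If one particular ordering is what was intended, spelling it out as separate statements removes the ambiguity, because each statement ends at a sequence point. A minimal sketch of the first ordering (the function name `increment_then_add_one` is made up for illustration):

```cpp
// Hypothetical helper: each modification of i happens in its own
// full expression, so the result no longer depends on the compiler.
int increment_then_add_one(int i) {
    ++i;        // first modification, complete at the end of this statement
    i = i + 1;  // second modification, in a separate full expression
    return i;
}
```

Called with 0, this returns 2 on every conforming compiler.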

仙女界的扛把子
#3 · 2019-01-04 23:41

You make the mistake of thinking of operator= as a two-argument function, where the side effects of the arguments must be completely evaluated before the function begins. If that were the case, then the expression i = ++i + 1 would have multiple sequence points, and ++i would be fully evaluated before the assignment began. That's not the case, though. What's being evaluated is the intrinsic assignment operator, not a user-defined operator. There's only one sequence point in that expression.
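
To illustrate the contrast, here is a sketch of the "function call" situation the question had in mind. `Counter` is a hypothetical class written only for this example. Because its operators are real functions, each call finishes (side effects included) before its result is passed to the next call, so the same-looking expression is well-defined for this type:

```cpp
// Hypothetical wrapper type: every operator here is an ordinary function,
// so each call introduces sequence points of its own.
struct Counter {
    int value;
    explicit Counter(int v) : value(v) {}
    Counter(const Counter&) = default;
    Counter& operator++() { ++value; return *this; }
    Counter& operator=(const Counter& other) { value = other.value; return *this; }
};

Counter operator+(const Counter& lhs, int rhs) { return Counter(lhs.value + rhs); }

int counter_demo() {
    Counter c(0);
    // Equivalent to c.operator=(operator+(c.operator++(), 1)):
    // operator++ completes, then operator+, then operator=.
    c = ++c + 1;
    return c.value;
}
```

Here `counter_demo()` returns 2. With a plain `int`, none of these function-call boundaries exist, which is the point made above.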

The result of ++i is evaluated before the assignment (and before the addition operator), but the side effect is not necessarily applied right away. The result of ++i + 1 is always the same as i + 2, so that's the value that gets assigned to i as part of the assignment operator. The result of ++i is always i + 1, so that's what gets assigned to i as part of the increment operator. There is no sequence point to control which value should get assigned first.

Since the code is violating the rule that "between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression," the behavior is undefined. Practically, though, it's likely that either i + 1 or i + 2 will be assigned first, then the other value will be assigned, and finally the program will continue running as usual — no nasal demons or exploding toilets, and no i + 3, either.
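
The two "practical" outcomes described above can be modeled with ordinary, well-defined statements. This is only a sketch of those two store orders, under the assumption that both stores actually happen one after the other; the function names are invented, and undefined behavior is not limited to these two results:

```cpp
// old_i is the value of i before the expression i = ++i + 1 is evaluated.

// Model 1: the increment's store lands first, the assignment's store last.
int model_assignment_store_last(int old_i) {
    int i = old_i;
    i = old_i + 1;  // side effect of ++i
    i = old_i + 2;  // store performed by the assignment
    return i;
}

// Model 2: the assignment's store lands first, the increment's store last.
int model_increment_store_last(int old_i) {
    int i = old_i;
    i = old_i + 2;  // store performed by the assignment
    i = old_i + 1;  // side effect of ++i
    return i;
}
```

Starting from 0, the first model yields 2 and the second yields 1.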

戒情不戒烟
#4 · 2019-01-04 23:41

The underlying reason is the way the compiler handles reading and writing of values. The compiler is allowed to hold an intermediate value temporarily and only actually commit it at the end of the expression. We read the expression ++i as "increase i by one and return it", but a compiler might see it as "load the value of i, add one, return it, and then commit it back to memory before someone uses it again". The compiler is encouraged to avoid reading and writing the actual memory location as much as possible, because that would slow the program down.

The specific case of i = ++i + 1 is prohibited largely because of the need for consistent behavioral rules. Many compilers will do the 'right thing' in such a situation, but what if one of the i's was actually a pointer pointing to i? Without this rule, the compiler would have to be very careful to perform the loads and stores in the right order. The rule allows for more optimization opportunities.
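
A small sketch of the pointer variant mentioned above. The helper `store_incremented_sum` is hypothetical. Under the C++03 wording quoted in the question, it is well-defined only when `target` does not point to `i`, and the compiler cannot generally tell at compile time whether it does:

```cpp
#include <cassert>

// Hypothetical helper. If target == &i, the single expression below would
// modify i twice without an intervening sequence point (the same problem
// as i = ++i + 1), so that case is excluded as a precondition.
void store_incremented_sum(int* target, int& i) {
    assert(target != &i);  // precondition: target must not alias i
    *target = ++i + 1;
}
```

With distinct objects, for example `int i = 0, out = 0; store_incremented_sum(&out, i);`, the call leaves `i` at 1 and `out` at 2.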

A similar case is the so-called strict-aliasing rule. You can't access an object of one type (say, an int) through an lvalue of an unrelated type (say, a float), with only a few exceptions. This keeps the compiler from having to worry that a store through some float * will change the value of an int, and greatly improves optimization potential.
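
When the bits of one type really do need to be read as another, copying the bytes with `std::memcpy` is one commonly used way to stay within the aliasing rules. A minimal sketch, assuming `float` is 32 bits wide (the helper name `float_bits` is made up):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the object representation of a float as an
// unsigned 32-bit integer without reading the float through an int lvalue.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // byte-wise copy, no aliasing violation
    return bits;
}
```

On a platform using IEEE 754 single precision, `float_bits(1.0f)` is `0x3F800000`.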
