In many discussions about undefined behavior (UB), the point of view has been put forward that the mere presence in a program of any construct that has UB permits a conforming implementation to do just anything (including nothing at all). My question is whether this should be taken in that sense even in those cases where the UB is associated with the execution of code, while the behaviour (otherwise) specified in the standard stipulates that the code in question should not be executed (possibly only for specific input to the program; it might not be decidable at compile time).
Phrased more informally, does the smell of UB permit a conforming implementation to decide that the whole program stinks, and to refuse to execute correctly even the parts of the program for which the behaviour is perfectly well defined? An example program would be
#include <iostream>
int main()
{
    int n = 0;
    if (false)
        n=n++; // Undefined behaviour if it gets executed, which it doesn't
    std::cout << "Hi there.\n";
}
For clarity, I am assuming the program is well-formed (so in particular the UB is not associated with preprocessing). In fact I am willing to restrict myself to UB associated with "evaluations", which clearly are not compile-time entities. The definitions pertinent to the example given are, I think (emphasis mine):
Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread (1.10), which induces a partial order among those evaluations
The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either ... or a value computation using the value of the same scalar object, the behavior is undefined.
It is implicitly clear that the subjects in the final sentence, "side effect" and "value computation", are instances of "evaluation", since that is what the relation "sequenced before" is defined for.
I posit that in the above program, the standard stipulates that no evaluations occur for which the condition in the final sentence is satisfied (unsequenced relative to each other and of the described kind) and that therefore the program does not have UB; it is not erroneous.
In other words I am convinced that the answer to the question of my title is negative. However I would appreciate the (motivated) opinions of other people on this matter.
Maybe an additional question for those who advocate an affirmative answer: would that also permit the proverbial reformatting of your hard drive to occur when an erroneous program is merely compiled?
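The input-dependent case mentioned at the start can be made concrete with a small sketch. The function name bump_if and its parameters are hypothetical, and signed overflow stands in for the n=n++ of the example above: the increment has UB only if it is evaluated while n equals INT_MAX, and whether it is evaluated depends on a value known only at run time.

```cpp
#include <climits>

// Hypothetical helper (not taken from any standard text): increments n only
// on request. The expression ++n overflows, which is UB for signed int, only
// if it is actually evaluated while n == INT_MAX.
int bump_if(int n, bool requested)
{
    if (requested)
        ++n;   // evaluated only when requested is true
    return n;
}
```

Under the reading defended here, bump_if(INT_MAX, false) has fully defined behaviour and returns INT_MAX, because the overflowing evaluation never takes place; only bump_if(INT_MAX, true) would have UB.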
Some related pointers on this site:
- Observable behavior and undefined behavior -- What happens if I don't call a destructor?
- Comments to this answer https://stackoverflow.com/a/24143792/1436796 (I no longer fully stand by my answer itself)
- C++ What is the earliest undefined behavior can manifest itself?
- Difference between Undefined Behavior and Ill-formed, no diagnostic message required and its two answers, which represent opposite points of view
In the context of a safety-critical embedded system, the posted code would be considered defective:
It should be, if not "shall".
Behavior, by definition from ISO C (no corresponding definition found in ISO C++ but it should be still somehow applicable), is:
And UB:
WG21/N4527
Despite "to behaving during translation" above, the word "behavior" used by ISO C++ is mainly about the execution of programs.
WG21/N4527
It is clear that undefined behavior is caused by a specific language construct being used wrongly or in a non-portable way (one that does not conform to the standard). However, the standard mentions nothing about which specific portion of code in a program would cause it. In other words, "having undefined behavior" is a property (concerning conformance) of the whole program being executed, not of any smaller part of it.
The standard could give a stronger guarantee, making the behavior well-defined as long as some specific code is not executed, only if there were a way to map C++ code precisely to the corresponding behavior. This is hard (if not impossible) without a detailed semantic model of execution. In short, the operational semantics given by the abstract machine model above is not enough to achieve the stronger guarantee. But anyway, ISO C++ will never be the JVMS or ECMA-335, and I don't expect a complete set of formal semantics describing the language to appear.
A key problem here is the meaning of "execution". Some people think "executing a program" means running the translated program. This is not quite true. Note that the representation of the program executed by the abstract machine is not specified. (Also note "this International Standard places no requirement on the structure of conforming implementations".) The code being executed can literally be C++ code (not necessarily machine code or some other intermediate form, none of which is specified by the standard at all). This effectively allows the core language to be implemented as an interpreter, an online partial evaluator, or some other monster that translates C++ code on the fly. As a result, without knowledge of the specific implementation there is no way to separate the phases of translation (defined by ISO C++ [lex.phases]) completely from the process of execution. Thus, it is necessary to allow UB to occur during translation when it is too difficult to specify portable well-defined behavior.
Besides the problems above, for most ordinary users one (non-technical) reason is perhaps enough: providing the stronger guarantee is simply unnecessary. It would permit bad code and defeat one of the (probably most important) uses of UB itself: to encourage quickly throwing away (unnecessarily) nonportable, smelly code rather than spending effort on "fixing" it, which would eventually be in vain.
Additional notes:
Some words are copied and reconstructed from one of my replies to this comment.
A C compiler is allowed to do anything it likes as soon as a program enters a state from which there is no defined sequence of events that would allow the program to avoid invoking Undefined Behavior at some point in the future (note that any loop which does not have any side effects, and which does not have an exit condition which a compiler would be required to recognize, invokes Undefined Behavior in and of itself). The compiler's behavior in such cases is bound by the laws of neither time nor causality. In situations where Undefined Behavior occurs in an expression whose result is never used, some compilers won't generate any code for the expression (so it will never "execute"), but that won't prevent compilers from using the Undefined Behavior to make other inferences about program behavior.
For example:
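A sketch of the kind of code the discussion here refers to, reconstructed from the function names it mentions. The bodies of the four external functions are stand-ins (assumptions) that only record the order of calls so that the sketch can be run, and launch_decision is a hypothetical switch. The shape of maybe_launch_missiles() and foo() is inferred from the description and may differ in detail from what the author had in mind. Left-shifting a negative value is UB in C and in C++ before C++20.

```cpp
#include <string>

// Stand-ins for the external functions: each one only appends to a trace.
// These bodies are assumptions made so that the sketch is runnable.
std::string trace;
bool launch_decision = false;   // hypothetical switch read by should_launch_missiles()

int  should_launch_missiles() { trace += "should;"; return launch_decision ? 1 : 0; }
void arm_missiles()           { trace += "arm;"; }
void launch_missiles()        { trace += "launch;"; }
void disarm_missiles()        { trace += "disarm;"; }

void maybe_launch_missiles()
{
    if (should_launch_missiles())
    {
        arm_missiles();
        if (should_launch_missiles())
            launch_missiles();
    }
    disarm_missiles();
}

int foo(int x)
{
    maybe_launch_missiles();
    return x << 1;   // UB for negative x in C and in C++ before C++20
}
```

With launch_decision left false, foo(1) records the calls should;disarm; and returns 2. The sketch deliberately never calls foo(-1).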
Under the current C standard, if the compiler could determine that disarm_missiles() would always return without terminating but the three other external functions called above might terminate, the most efficient standard-compliant replacement for the statement foo(-1); (return value ignored) would be should_launch_missiles(); arm_missiles(); should_launch_missiles(); launch_missiles();.
Program behavior will only be defined if either call to should_launch_missiles() terminates without returning, if the first call returns non-zero and arm_missiles() terminates without returning, or if both calls return non-zero and launch_missiles() terminates without returning. A program which works correctly in those cases will abide by the standard regardless of what it does in any other situation. If returning from maybe_launch_missiles() would cause Undefined Behavior, the compiler would not be required to recognize the possibility that either call to should_launch_missiles() could return zero.
As a consequence, on some modern compilers, the effect of left-shifting a negative number may be worse than anything that could be caused by any kind of Undefined Behavior on a typical C99 compiler on platforms that separate code and data spaces and trap stack overflow. Even if code engaged in Undefined Behavior which could cause random control transfers, there would be no means by which it could cause arm_missiles() and launch_missiles() to be called consecutively without an intervening call to disarm_missiles() unless at least one call to should_launch_missiles() returned a non-zero value. A hyper-modern compiler, however, may negate such protections.
No. Example:
Side effects are changes in the state of the execution environment (1.9/12). A change is a change, not an expression that, if evaluated, would potentially produce a change. If there is no change, there is no side effect. If there is no side effect, then no side effect is unsequenced relative to anything else.
This does not mean that any code which is never executed is UB-free (though I'm pretty sure most of it is). Each occurrence of UB in the standard needs to be examined separately. (The stricken-out text is probably overly cautious; see below.)
The standard also says that
(emphasis mine)
This, as far as I can tell, is the only normative reference that says what the phrase "undefined behavior" means: an undefined operation in a program execution. No execution, no UB.
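A guarded pointer dereference is a familiar instance of this reasoning. In the sketch below (the function name first_or_zero is hypothetical), dereferencing a null pointer would be an undefined operation, but for a null argument that operation is never part of the execution.

```cpp
// Hypothetical helper: returns the pointed-to value, or 0 for a null pointer.
int first_or_zero(const int* p)
{
    if (p != nullptr)
        return *p;   // evaluated only when p is non-null
    return 0;        // the dereference above does not occur on this path
}
```

A call such as first_or_zero(nullptr) therefore has defined behaviour and returns 0, even though the function body contains an expression that would be undefined for that argument if it were evaluated.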
There's a clear divide between inherent undefined behaviour, such as n=n++, and code that can have defined or undefined behaviour depending on the program state at runtime, such as x/y for ints. In the latter case the program is required to work unless y is 0, but in the first case the compiler is asked to generate code that's totally illegitimate. It's within its rights to refuse to compile, or it may simply not be "bullet-proofed" against such code, so that its optimiser state (register allocations, records of which values may have been modified since they were read, etc.) gets corrupted, resulting in bogus machine code for that and the surrounding source code. It may be that early analysis recognised an "a=b++" situation and generated code for the preceding if to jump over a two-byte instruction, but when n=n++ was encountered no instruction was output, so that the if statement jumps somewhere into the following opcodes. Anyway, it's simply game over. Putting an "if" in front, or even wrapping it in a different function, isn't documented as "containing" the undefined behaviour... bits of code aren't tainted with undefined behaviour - the Standard consistently says "the program has undefined behaviour".
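For the runtime-dependent case, the usual way to keep x/y defined is to test the operands before dividing. A minimal sketch, assuming C++17 for std::optional; the name checked_divide is hypothetical. Besides y == 0, it also rejects INT_MIN / -1, whose quotient is not representable in int.

```cpp
#include <climits>
#include <optional>

// Hypothetical helper: returns x / y when the division is well defined,
// and an empty optional otherwise.
std::optional<int> checked_divide(int x, int y)
{
    if (y == 0)
        return std::nullopt;   // division by zero is UB
    if (x == INT_MIN && y == -1)
        return std::nullopt;   // quotient would overflow int
    return x / y;
}
```

A caller can then branch on has_value() instead of relying on the division never being reached with bad operands.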