If I try to compile
for(;;)
{
}
System.out.println("End");
the Java compiler produces an error saying Unreachable statement. But if I add another "unreachable" (according to me) break statement and make it:
for(;;)
{
if(false) break;
}
System.out.println("End");
It compiles. Why does it not produce an error?
The behaviour is defined in the JLS description of unreachable statements:
The then-statement is reachable iff the if-then statement is reachable.
So the compiler determines that the then-statement (break;) is reachable, regardless of the condition in the if.
And a bit further, emphasis mine:
A basic for statement can complete normally iff at least one of the following is true:
- The for statement is reachable, there is a condition expression, and the condition expression is not a constant expression (§15.28) with value true.
- There is a reachable break statement that exits the for statement.
So the for can complete normally because the then-statement contains a break. As you noticed, it would not work if you replaced break with return.
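A minimal sketch of that rule in a form that actually terminates when run. The class and method names (ReachabilityDemo, countUntil) are made up for illustration. Because the break inside if (false) counts as reachable, the compiler considers the loop able to complete normally, so the code after the loop is reachable and the trailing return is in fact required:

```java
// Hypothetical demo class; the names are not from the question.
class ReachabilityDemo {
    static int countUntil(int limit) {
        int i = 0;
        for (;;) {
            if (false) break;          // never taken at runtime, but reachable per JLS 14.21
            if (i >= limit) return i;  // the real exit from the loop
            i++;
        }
        // Reachable only because of the break above. Without that break,
        // javac reports this line as an unreachable statement.
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(countUntil(3));
    }
}
```

Replacing the break with return would make the statement after the loop unreachable again, since a return does not exit the for statement in the sense the rule requires.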
The rationale is explained towards the end of the section. In substance, the if statement gets special treatment to allow constructs such as:
if(DEBUG) { ... }
where DEBUG may be a compile time constant.
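As a rough illustration of the construct this special case is meant to permit, here is a small sketch. The class name DebugFlagDemo, the describe method, and the output format are assumptions made for the example, not anything taken from the JLS:

```java
// Hypothetical example of a compile-time debug flag.
class DebugFlagDemo {
    // A compile-time constant; flipping it to true enables the extra output.
    static final boolean DEBUG = false;

    static String describe(int value) {
        StringBuilder sb = new StringBuilder("value=" + value);
        // Accepted even though DEBUG is constant false: the block under an
        // if statement is treated as reachable regardless of the condition.
        if (DEBUG) {
            sb.append(" hex=").append(Integer.toHexString(value));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(255));
    }
}
```

The same body under while (DEBUG) would be rejected, because the while rules do take a constant false condition into account.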
As explained in my answer to a similar question, the specific construct if(compile-time-false) is exempt from the unreachability rules as an explicit backdoor. In this case, the compiler treats your break as reachable because of that.
From the JLS:
An if-then statement can complete normally if at least one of the following is true:
- The if-then statement is reachable and the condition expression is not a constant expression whose value is true.
- The then-statement can complete normally.
So if(false) is allowed.
This ability to "conditionally compile" has a significant impact on,
and relationship to, binary compatibility. If a set of classes
that use such a "flag" variable are compiled and conditional code is
omitted, it does not suffice later to distribute just a new version of
the class or interface that contains the definition of the flag. A
change to the value of a flag is, therefore, not binary compatible
with pre-existing binaries. (There are other reasons for
such incompatibility as well, such as the use of constants in case
labels in switch statements;)
Basically, unreachable code is detected by analyzing the program statically, without actually running the code, whereas the condition is only checked at runtime. So when this analysis takes place, it does not look into the condition at all; it just checks that break; is reachable via the if.
The core reason Java doesn't detect all unreachable statements is that, in general, it is impossible to decide whether a piece of code is reachable. This follows from the fact that the halting problem is undecidable for Turing machines.
So it's clear that not all unreachable statements can be detected, but why not try evaluating conditions? Imagine now that the condition used is not just false but something like ~x == x. For example, all of these statements will print true for every int x (source).
System.out.println((x + x & 1) == 0);
System.out.println((x + -x & 1) == 0);
System.out.println((-x & 1) == (x & 1));
System.out.println(((-x ^ x) & 1) == 0);
System.out.println((x * 0x80 & 0x56) == 0);
System.out.println((x << 1 ^ 0x1765) != 0);
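For anyone who wants to check those identities without reasoning through the bit tricks, here is a small harness. It is only a spot check over edge cases and random samples, not a proof, and the class and method names are made up for this sketch:

```java
import java.util.Random;

// Hypothetical harness that samples int values and evaluates the six expressions.
class AlwaysTrueDemo {
    // True when all six expressions from the list above hold for x.
    static boolean allHold(int x) {
        return (x + x & 1) == 0
            && (x + -x & 1) == 0
            && (-x & 1) == (x & 1)
            && ((-x ^ x) & 1) == 0
            && (x * 0x80 & 0x56) == 0
            && (x << 1 ^ 0x1765) != 0;
    }

    // Checks a few edge cases plus a fixed-seed random sample.
    static boolean holdsOnSamples() {
        int[] edgeCases = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : edgeCases) {
            if (!allHold(x)) return false;
        }
        Random rng = new Random(42);
        for (int i = 0; i < 1_000_000; i++) {
            if (!allHold(rng.nextInt())) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsOnSamples());
    }
}
```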
Such conditions can be rather complicated, and resolving them takes time. Doing so would significantly increase build time and still would not detect all unreachable statements. The compiler was designed to make some effort here, but not to spend too much time on it.
The only remaining question is: where to stop resolving conditions? The cut-off doesn't seem to have a mathematical justification; it is based on usage scenarios. The rationale for your particular case is given in JLS 14.21.
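To make the cut-off concrete: as far as I can tell from the JLS rules, loop conditions are evaluated only when they are constant expressions (§15.28), and anything else is treated as unknown. The sketch below contrasts the two cases; the class and method names are invented, and both methods assume n > 0:

```java
// Hypothetical demo contrasting a constant condition with a merely always-true one.
class ConstantConditionDemo {
    // 1 + 1 == 2 is a constant expression with value true, so the compiler
    // knows the loop cannot complete normally and no trailing return is needed.
    static int firstMultipleConstant(int n, int start) {
        int i = start;
        while (1 + 1 == 2) {
            if (i % n == 0) return i;
            i++;
        }
    }

    // (i + i & 1) == 0 is always true but not a constant expression, so the
    // compiler assumes the loop may end and insists on the trailing return.
    static int firstMultipleNonConstant(int n, int start) {
        int i = start;
        while ((i + i & 1) == 0) {
            if (i % n == 0) return i;
            i++;
        }
        return -1; // never executed, yet required for compilation
    }

    public static void main(String[] args) {
        System.out.println(firstMultipleConstant(7, 10));
        System.out.println(firstMultipleNonConstant(7, 10));
    }
}
```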