Could a final variable be reassigned in catch, eve

2019-01-13 07:04发布

问题:

I am quite convinced that here

final int i;
try { i = calculateIndex(); }
catch (Exception e) { i = 1; }

i cannot possibly have already been assigned if control reaches the catch-block. However, Java compiler disagrees and claims the final local variable i may already have been assigned.

Is there still some subtlety I am missing here, or is this just a weakness of the model used by the Java Language Specification to identify potential reassignments? My main worry are things like Thread.stop(), which may result in an exception being thrown "out of thin air," but I still don't see how it could be thrown after the assignment, which is apparently the very last action within the try-block.

The idiom above, if allowed, would make many of my methods simpler. Note that this use case has first-class support in languages, such as Scala, which consistently employ the Maybe monad:

final int i = calculateIndex().getOrElse(1);

I think this use case serves as a quite good motivation to allow that one special case where i is definitely unassigned within the catch-block.

UPDATE

After some thought I am even more certain that this is just a weakness of the JLS model: if I declare the axiom "in the presented example, i is definitely unassigned when control reaches the catch-block", it will not conflict with any other axiom or theorem. The compiler will not allow any reading of i before it is assigned in the catch-block, so the fact whether i has been assigned to or not cannot be observed.

回答1:

JLS hunting:

It is a compile-time error if a final variable is assigned to unless it is definitely unassigned (§16) immediately prior to the assignment.

Quoth chapter 16:

V is definitely unassigned before a catch block iff all of the following conditions hold:

V is definitely unassigned after the try block.
V is definitely unassigned before every return statement that belongs to the try block.
V is definitely unassigned after e in every statement of the form throw e that belongs to the try block.
V is definitely unassigned after every assert statement that occurs in the try block.
V is definitely unassigned before every break statement that belongs to the try block and whose break target contains (or is) the try statement.
V is definitely unassigned before every continue statement that belongs to the try block and whose continue target contains the try statement.

Bold is mine. After the try block it is unclear whether i is assigned.

Furthermore in the example

final int i;
try {
    i = foo();
    bar();
}
catch(Exception e) { // e might come from bar
    i = 1;
}

The bold text is the only condition preventing the actual erroneous assignment i=1 from being illegal. So this is sufficient to prove that a finer condition of "definitely unassigned" is necessary to allow the code in your original post.

If the spec were revised to replace this condition with

V is definitely unassigned after the try block, if the catch block catches an unchecked exception.
V is definitely unassigned before the last statement capable of throwing an exception of a type caught by the catch block, if the catch block catches an unchecked exception.

Then I believe your code would be legal. (To the best of my ad-hoc analysis.)

I submitted a JSR for this, which I expect to be ignored but I was curious to see how these are handled. Technically fax number is a required field, I hope it won't do too much damage if I entered +1-000-000-000 there.



回答2:

I think the JVM is, sadly, correct. While intuitively correct from looking at the code, it makes sense in the context of looking at the IL. I created a simple run() method that mostly mimics your case (simplified comments here):

0: aload_0
1: invokevirtual  #5; // calculateIndex
4: istore_1
5: goto  17
// here's the catch block
17: // is after the catch

So, while you can't easily write code to test this, because it won't compile, the invoke of the method, the store the value, and the skip to after the catch are three separate operations. You could (however unlikely that may be) have an exception occur (Thread.interrupt() seems to be the best example) between step 4 and step 5. This would result in entering into the catch block after i has been set.

I'm not sure you could intentionally make that happen with a ton of threads and interrupts (and the compiler won't let you write that code anyway), but it is thus theoretically possible that i could be set, and you could enter in the exception handling block, even with this simple code.



回答3:

Not quite as clean (and I suspect what you are already doing). But this only adds 1 extra line.

final int i;
int temp;
try { temp = calculateIndex(); }
catch (IOException e) { temp = 1; }
i = temp;


回答4:

This is a summary of the strongest arguments in favor of the thesis that the current rules for definite assignment cannot be relaxed without breaking consistency (A), followed by my counterarguments (B):

  • A: on the bytecode level the write to the variable is not the last instruction within the try-block: for example, the last instruction will typically be a goto jump over the exception handling code;

  • B: but if the rules state that i is definitely unassigned within the catch-block, its value may not be observed. An unobservable value is as good as no value;

  • A: even if the compiler declares i as definitely unassigned, a debug tool could still see the value;

  • B: in fact, a debug tool could always access an uninitialized local variable, which will on a typical implementation have any arbitrary value. There is no essential difference between an uninitialized variable and a variable whose initialization completed abruptly after the actual write having occurred. Regardless of the special case under consideration here, the tool must always use additional metadata to know for each local variable the range of instructions where that variable is definitely assigned and only allow its value to be observed while execution finds itself within the range.

Final Conclusion:

The specification could consistently receive more fine-grained rules which would allow my posted example to compile.



回答5:

You are correct that if the assignment is the very last operation in the try block, we know that upon entering the catch block the variable will not have been assigned. However, formalizing the notion of "very last operation" would add significant complexity to the spec. Consider:

try {
    foo = bar();
    if (foo) {
        i = 4;
    } else {
        i = 7;
    }
}

Would that feature be useful? I don't think so, because a final variable must be assigned exactly once, not at most once. In your case, the variable would be unassigned if an Error is thrown. You may not care about that if the variable runs out of scope anyway, but such is not always the case (there could be another catch block catching the Error, in the same or a surrounding try statement). For instance, consider:

final int i;
try {
    try {
        i = foo();
    } catch (Exception e) {
        bar();
        i = 1;
    }
} catch (Throwable t) {
    i = 0;
}

That is correct, but wouldn't be if the call to bar() occured after assigning i (such as in the finally clause), or we use a try-with-resources statement with a resource whose close method throws an exception.

Accounting for that would add even more complexity to the spec.

Finally, there is a simple work around:

final int i = calculateIndex();

and

int calculateIndex() {
    try {
        // calculate it
        return calculatedIndex;
    } catch (Exception e) {
        return 0;
    }
}

that makes it obvious that i is assigned.

In short, I think that adding this feature would add significant complexity to the spec for little benefit.



回答6:

1   final int i;
2   try { i = calculateIndex(); }
3   catch (Exception e) { 
4       i = 1; 
5   }

OP already remarks that at line 4 i may already have been assigned. For example through Thread.stop(), which is an asynchronous exception, see http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html#jvms-2.5

Now set a breakpoint at line 4 and you can observe the state of the variable i before 1 is assignd. So loosening the observed behaviour would go against the Java™ Virtual Machine Tool Interface



回答7:

Browsing the javadoc, it seems no subclass of Exception could be thrown just after i is assigned. From a JLS theoretical perspective, it seems Error could be thrown just after i is assigned (e.g. VirtualMachineError).

Seems there's no JLS requirement for compiler to determine whether i could be previously set when catch block is reached, by distinguishing whether you're catching Exception or Error/Throwable, implying it's a weakness of JLS model.

Why not try the following? (have compiled & tested)

(Integer Wrapper Type + finally + "Elvis" operator to test whether null):

import myUtils.ExpressionUtil;
....
Integer i0 = null; 
final int i;
try { i0 = calculateIndex(); }   // method may return int - autoboxed to Integer!
catch (Exception e) {} 
finally { i = nvl(i0,1); }       


package myUtils;
class ExpressionUtil {
    // Custom-made, because shorthand Elvis operator left out of Java 7
    Integer nvl(Integer i0, Integer i1) { return (i0 == null) ? i1 : i0;}
}


回答8:

I think there is one situation where this model act as life saver. Consider the code given below:

final Integer i;
try
{
    i = new Integer(10);----->(1)
}catch(Exception ex)
{
    i = new Integer(20);
}

Now Consider the line (1). Most of the JIT compilers creates object in following sequence(psuedo code):

mem = allocate();   //Allocate memory 
ctorInteger(instance);//Invoke constructor for Singleton passing instance.
i = mem;        //Make instance i non-null

But, some JIT compilers does out of order writes. And above steps is reordered as follows:

mem = allocate();   //Allocate memory 
i = mem;        //Make instance i non-null
ctorInteger(instance);  //Invoke constructor for Singleton passing instance.

Now suppose, the JIT performs out of order writes while creating the object in line (1). And suppose an exception is thrown while executing the constructor. In that case, the catch block will have i which is not null . If JVM doesn't follow this modal then in this case final variable is allowed to be assigned twice!!!



回答9:

But i may be assigned twice

    int i;
    try {
        i = calculateIndex();  // suppose func returns true
        System.out.println("i=" + i);
        throw new IOException();
    } catch (IOException e) {
        i = 1;
        System.out.println("i=" + i);
    }

output

i=0
i=1

and it means it cannot be final



回答10:

EDITING RESPONSE BASED UPON QUESTIONS FROM OP

This is really in response to the comment:

All you have done is written up a clear-cut example of a straw man argument: you are vicariously introducing the tacit assumption that there must always be one and only one default value, valid for all call sites

I believe that we are approaching the entire question from opposite ends. It seems that you are looking at it from the bottom up - literally from the bytecode and going up to the Java. If this is not true, you are looking at it from the "code" compliance to the spec.

Approaching this from the opposite direction, from the "design" down, I see problems. I think it was M. Fowler who collected various "bad smells" into the book: "Refactoring: Improving the Design of Existing Code". Here (and probably many, many other places) the "Extract Method" refactoring is described.

Thus, if I imagine a made-up version of your code without the 'calculateIndex' method, I might have something like this:

public void someMethod() {
    final int i;
    try {
        int intermediateVal = 35;
        intermediateVal += 56;
        i = intermediateVal*3;
    } catch (Exception e) {
        // would like to be able to set i = 1 here;
    }
}

Now, the above COULD have been refactored as originally posted with a 'calculateIndex' method. However, if the 'Extract Method' Refactoring defined by Fowler is completely applied, then one gets this [note: dropping the 'e' is intentional to differentiate from your method.]

public void someMethod() {
    final int i =  calculateIndx();
}

private int calculateIndx() {
    try {
        int intermediateVal = 35;
        intermediateVal += 56;
        return intermediateVal*3;
    } catch (Exception e) {
        return 1;  // or other default values or other way of setting
    }
}

So from the 'design' perspective the problem is the code you have. Your 'calculateIndex' method does NOT calculate the index. It only does sometimes. The rest of the time, the exception handler does the calculation.

Furthermore, this refactoring is far more accommodating to changes. For instance, if you have to change what I assumed was the default value of '1' to a '2', no big deal. However, as pointed out by the OP reply quoted, one cannot assume that there is only one default value. If the logic to set this grows to be only slightly complex it could still easily reside in the encapsulated exception handler. However, at some point, it too may need to be refactored into it's own method. Both cases still allow the encapsulated method to perform it's function and truly calculate the index.

In summary, when I get here and look at what I believe is the correct code, then there is no compiler issue for discussion. (I am most certain you will not agree: that is fine, I just want to be clearer about my viewpoint.) As for compiler warnings that come up for incorrect code, those help me realize in the first place that something is wrong. In this case, that refactoring is needed.



回答11:

As per specs JLS hunting done by "djechlin", specs tells when is the variable definitely unassigned. So spec says that in those scenarios it is safe to allow the assignment.There can be scenarios other than the one mentioned in the specs in which case variable can still be unassigned and it will depend on compiler to make that intelligent decision if it can detect and allow an assignment.

Spec in no way mentions in the scenario specified by you, that compiler should flag an error. So it depends on compiler implementation of spec if it is intelligent enough to detect such scenarios.

Reference: Java Language Specification Definite Assignment section "16.2.15 try Statements"



回答12:

I faced EXACTLY the same problem Mario, and read this very interresting discussion. I just solved my issue by that:

private final int i;

public Byte(String hex) {
    int calc;
    try {
        calc = Integer.parseInt(hex, 16);
    } catch (NumberFormatException e) {
        calc = 0;
    }
    finally {
      i = calc;
    }
}

@Joeg, I must admit that I liked a lot your post about design, especially that sentence: calculateIndx() calculates sometimes the index, but could we say the same about parseInt() ? Isn't that also the role of calculateIndex() to throw and thus not calculate the index when it is not possible, and then making it returning a wrong value (1 is arbitrary in your refactoring) is imho bad.

@Marko, I didn't understand your reply to Joeg about the AFTER line 4 and BEFORE line 5... I'm not strong enough yet in java world (25y of c++ but only 1 in java...), but I thing this case is one where the compiler is right : i could be initialized twice in Joeg's case.

[All what I'm saying is a very very humble opinion]