ASM Try/Catch Block with an output value

2019-05-26 17:12发布

问题:

I am currently trying make my custom compiler allow using try/catch as an expression, i.e. leaving a value on the stack. The type checker and the backend already support this, but the problem seems to be ASM's COMPUTE_FRAMES. With the below code for instrumentation:

private void write(MethodWriter writer, boolean expression)
{
    org.objectweb.asm.Label tryStart = new org.objectweb.asm.Label();
    org.objectweb.asm.Label tryEnd = new org.objectweb.asm.Label();
    org.objectweb.asm.Label endLabel = new org.objectweb.asm.Label();

    boolean hasFinally = this.finallyBlock != null;

    writer.writeLabel(tryStart);
    if (this.action != null)
    {
        if (expression && !hasFinally)
        {
            this.action.writeExpression(writer);
        }
        else
        {
            this.action.writeStatement(writer);
        }
        writer.writeJumpInsn(Opcodes.GOTO, endLabel);
    }
    writer.writeLabel(tryEnd);

    for (int i = 0; i < this.catchBlockCount; i++)
    {
        CatchBlock block = this.catchBlocks[i];
        org.objectweb.asm.Label handlerLabel = new org.objectweb.asm.Label();

        // Check if the block's variable is actually used
        if (block.variable != null)
        {
            // If yes register a new local variable for the exception and
            // store it.
            int localCount = writer.registerLocal();

            writer.writeLabel(handlerLabel);
            writer.writeVarInsn(Opcodes.ASTORE, localCount);
            block.variable.index = localCount;
            if (expression && !hasFinally)
            {
                block.action.writeExpression(writer);
            }
            else
            {
                block.action.writeStatement(writer);
            }
            writer.resetLocals(localCount);
        }
        // Otherwise pop the exception from the stack
        else
        {
            writer.writeLabel(handlerLabel);
            writer.writeInsn(Opcodes.POP);
            if (expression && !hasFinally)
            {
                block.action.writeExpression(writer);
            }
            else
            {
                block.action.writeStatement(writer);
            }
        }

        writer.writeTryCatchBlock(tryStart, tryEnd, handlerLabel, block.type.getInternalName());
        writer.writeJumpInsn(Opcodes.GOTO, endLabel);
    }

    if (hasFinally)
    {
        org.objectweb.asm.Label finallyLabel = new org.objectweb.asm.Label();

        writer.writeLabel(finallyLabel);
        writer.writeInsn(Opcodes.POP);
        writer.writeLabel(endLabel);
        if (expression)
        {
            this.finallyBlock.writeExpression(writer);
        }
        else
        {
            this.finallyBlock.writeStatement(writer);
        }
        writer.writeFinallyBlock(tryStart, tryEnd, finallyLabel);
    }
    else
    {
        writer.writeLabel(endLabel);
    }
}

Compiling this code:

System.out.println(try Integer.parseInt("10") catch (Throwable t) 10)

I get the following VerifyError upon class loading:

java.lang.VerifyError: Inconsistent stackmap frames at branch target 17
Exception Details:
  Location:
    dyvil/test/Main.main([Ljava/lang/String;)V @14: goto
  Reason:
    Current frame's stack size doesn't match stackmap.
  Current Frame:
    bci: @14
    flags: { }
    locals: { '[Ljava/lang/String;' }
    stack: { integer }
  Stackmap Frame:
    bci: @17
    flags: { }
    locals: { '[Ljava/lang/String;' }
    stack: { top, integer }
  Bytecode:
    0000000: b200 1412 16b8 001c a700 0957 100a a700
    0000010: 03b6 0024 b1                           
  Exception Handler Table:
    bci [3, 11] => handler: 11
  Stackmap Table:
    same_locals_1_stack_item_frame(@11,Object[#30])
    full_frame(@17,{Object[#38]},{Top,Integer})

Since I don't think that ASM has a problem computing the stack frames for try/catch blocks with an output value, is there a problem with my instrumentation code? (Note that ClassWriter.getCommonSuperclass, although it is not needed here, is correctly implemented.)

回答1:

Obviously, ASM can calculate stackmap frames for correct code only as no stackmap can fix broken code. We can learn what went wrong when we analyze the exception.

java.lang.VerifyError: Inconsistent stackmap frames at branch target 17

there is a branch targeting byte code position 17.

Exception Details:
  Location:
    dyvil/test/Main.main([Ljava/lang/String;)V @14: goto

the source of the branch is a goto instruction at position 14

  Reason:
    Current frame's stack size doesn't match stackmap.

quite self explaining. The only thing you have to consider that non-matching frames don’t necessarily indicate a wrong stackmap calculation; it might be that the bytecode itself is violates the constraints and the calculated stackmap just reflects that.

  Current Frame:
    bci: @14
    flags: { }
    locals: { '[Ljava/lang/String;' }
    stack: { integer }

at 14, the source of the branch (the location of the goto instruction), the stack contains one int value.

  Stackmap Frame:
    bci: @17
    flags: { }
    locals: { '[Ljava/lang/String;' }
    stack: { top, integer }

at 17, the target of the branch, are two values on the stack.

  Bytecode:
    0000000: b200 1412 16b8 001c a700 0957 100a a700
    0000010: 03b6 0024 b1                           

well, the bytecode isn’t disassembled here, but you can’t say the exception message was too brief up to this point. Manual disassembling the bytecode yields:

 0: getstatic     0x0014
 3: ldc           0x16
 5: invokestatic  0x001c
 8: goto          +9 (=>17)
11: pop
12: bipush        #10
14: goto          +3 (=>17)
17: invokevirtual 0x0024
20: return

 

  Exception Handler Table:
    bci [3, 11] => handler: 11

What we can see here is that there are two ways of reaching location 17, one is the ordinary execution of getstatic, ldc, invokestatic the other is the exception handler, starting at 11, performing pop bipush. We can deduce for the latter that it has indeed one int value on the stack as it pops the exception and pushes one int constant.

For the former, there is not enough information here, i.e. I don’t know the signature of the invoked method, however, since the verifier didn’t reject the goto from 8 to 17, it’s safe to assume that the stack indeed holds two values before the branch. Since getstatic, ldc produces two values, the static method must have either a void () or a value (value) signature. This implies that the value of the very first getstatic instruction is not used before the branch.

→After reading your comment, the error becomes apparent: that first getstatic instruction reads System.out which you want to use at the end of the method to invoke println, however, when an exception occurred, the stack is flushed and no PrintWriter is on the stack but the exception handler tries to recover and join the code path at the place where the PrintWriter is required for invoking println. It is important to understand that exception handlers always start with an operand stack consisting of a single element, the exception. None of the values you might have pushed before the exception occurred will persist. So if you want to prefetch a field value (like System.out) before the guarded code and use it regardless of whether an exception occurred, you have to store it in a local variable and retrieve afterwards.

It seems that ASM derived the stackmap frame for location @17 from the state before the first branch and when joining it with the frame of the state before the second branch, it only cared for the types but not the different depth, which is a pity as it’s an error that is easy to spot. But it’s only a missing feature (as COMPUTE_FRAMES is not specified to do error checking), not a bug.