Determining in the bytecode where is the super() m

Posted 2019-04-07 04:56

Question:

I was wondering whether there is an obvious, quick way, when analyzing a constructor's bytecode, to determine where the super() call ends.

More concretely, and in sharp contrast to Java, where a call in the constructor to a super() constructor is optional (or rather, implicit when not present), in the bytecode world it is always needed.

For black magic purposes I need to know, by bytecode analysis alone and by the simplest method available, which INVOKESPECIAL call corresponds to the Java world's super() call.

I'll leave you here with a hard example:

public static class A {
    public A(Object o, Object b) {
    }
}

public static class B extends A {
    public B() {
        //the super call below is the one I'm looking for
        super(new A(new Object(), new Integer(2)), new Integer(1));
        System.out.println(new A(new Object(), new Integer(2)));
    }
}

with the corresponding bytecode:

Answer 1:

Actually, the rules for bytecode constructors are much more lax than Java's rules.

The only rule is that exactly one constructor must be called on any path that returns normally, and if a constructor call throws an exception, then you must throw an exception too.

Among other things, this means that a constructor may contain multiple calls to other constructors or none at all.

Anyway, the only guaranteed way to determine whether a given invokespecial call is initializing the current object is to do a dataflow analysis, since it's possible to initialize other objects of the same class, which would confuse a naive detector.

Edit: Here is an example of a perfectly valid class (using the Krakatau assembler syntax), showing some of the issues you could run into. Among other things, it has calls to other constructors in the same class, recursive invocation of constructors, and constructing other objects of the same class inside the constructor.

.class public ctors
.super java/lang/Object

; A normal constructor
.method public <init> : ()V
    .limit locals 1
    .limit stack 1

    aload_0
    invokespecial java/lang/Object <init> ()V
    return
.end method

; A weird constructor
.method public <init> : (I)V
    .limit locals 2
    .limit stack 5

    iload_1
    ifne LREST
        aload_0
        invokespecial ctors <init> ()V
        return

LREST:
    aload_0
    new ctors
    iinc 1 -1
    iload_1
LFAKE_START:
    invokespecial ctors <init> (I)V
LFAKE_END:
    iconst_0
    invokespecial ctors <init> (I)V
    return

.catch [0] from LFAKE_START to LFAKE_END using LCATCH
LCATCH:
    aload_0
    invokespecial java/lang/Object <init> ()V
    return
.end method

.method public static main : ([Ljava/lang/String;)V
    .limit locals 1
    .limit stack 2

    new ctors
    iconst_5
    invokespecial ctors <init> (I)V
    return
.end method


Answer 2:

A simple solution is to count the number of new A instructions and the number of A.<init> calls. When there are more <init> calls than new instructions, you have reached the super constructor call. You have to do the same check for new B and B.<init>, in case this(...) is called.
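The counting idea above can be sketched as follows. This is only an illustration under stated assumptions: Insn is a hypothetical, simplified instruction model (not the API of any real bytecode library), and the instruction list in exampleFromQuestion() was written by hand from the Java source of B, assuming the code shape javac usually emits. The scan is purely linear, so it ignores branches and exception handlers.

```java
import java.util.Arrays;
import java.util.List;

final class SuperCallCounter {

    // Hypothetical, simplified instruction model (not a real bytecode library).
    static final class Insn {
        final String opcode; // e.g. "new", "dup", "invokespecial"
        final String owner;  // class named by the instruction, or null
        final String name;   // method name for invokes, or null

        Insn(String opcode, String owner, String name) {
            this.opcode = opcode;
            this.owner = owner;
            this.name = name;
        }
    }

    // Returns the index of the first invokespecial <init> on targetClass that
    // is not matched by an earlier "new targetClass", or -1 if there is none.
    static int findUnmatchedInit(List<Insn> code, String targetClass) {
        int pendingNews = 0; // "new targetClass" seen but not yet initialized
        for (int i = 0; i < code.size(); i++) {
            Insn insn = code.get(i);
            if (!targetClass.equals(insn.owner)) {
                continue; // instruction does not name the class we track
            }
            if ("new".equals(insn.opcode)) {
                pendingNews++;
            } else if ("invokespecial".equals(insn.opcode)
                    && "<init>".equals(insn.name)) {
                if (pendingNews == 0) {
                    return i; // more <init> calls than new instructions
                }
                pendingNews--;
            }
        }
        return -1;
    }

    // Assumed instruction sequence for B() up to the super(...) call.
    static List<Insn> exampleFromQuestion() {
        return Arrays.asList(
            new Insn("aload_0", null, null),
            new Insn("new", "A", null),
            new Insn("dup", null, null),
            new Insn("new", "java/lang/Object", null),
            new Insn("dup", null, null),
            new Insn("invokespecial", "java/lang/Object", "<init>"),
            new Insn("new", "java/lang/Integer", null),
            new Insn("dup", null, null),
            new Insn("iconst_2", null, null),
            new Insn("invokespecial", "java/lang/Integer", "<init>"),
            new Insn("invokespecial", "A", "<init>"),
            new Insn("new", "java/lang/Integer", null),
            new Insn("dup", null, null),
            new Insn("iconst_1", null, null),
            new Insn("invokespecial", "java/lang/Integer", "<init>"),
            new Insn("invokespecial", "A", "<init>"));
    }

    public static void main(String[] args) {
        // Index in the list above of the candidate super(...) call.
        System.out.println(findUnmatchedInit(exampleFromQuestion(), "A"));
    }
}
```

For a constructor of B you would run the check once with "A" (for super(...)) and once with "B" (for this(...)). Because the count follows instruction order rather than control flow, it can be fooled by bytecode with branches around constructor calls.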



Answer 3:

You have to find the invoke opcode at which the operand stack holds the this reference as the receiver (the implicit first argument). For this you just need to know the effect each opcode has on the operand stack. In your example you start with aload_0 (which pushes the this reference), then a lot happens on top of that reference (the operand stack is updated all the time). After a while you reach the invoke opcode you are looking for, which consumes the this reference (and the references for its arguments). That is the super call.
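A minimal sketch of this stack tracking, under loud assumptions: Op and Kind are hypothetical names for a toy instruction model that records only each instruction's effect on the stack, the code handles straight-line code only (no branches, no exception handlers, no stores of this into locals), and the sample sequence was written by hand from the Java source of B, assuming javac's usual output. It is not a full dataflow analysis.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class ThisTracker {

    // Hypothetical instruction kinds: only what the stack tracking needs.
    enum Kind { LOAD_THIS, PUSH, DUP, INVOKE_INIT }

    static final class Op {
        final Kind kind;
        final int argCount; // number of arguments; used by INVOKE_INIT only

        Op(Kind kind, int argCount) {
            this.kind = kind;
            this.argCount = argCount;
        }
    }

    // Returns the index of the first <init> call whose receiver is 'this',
    // or -1 if no such call is found.
    static int findInitOnThis(List<Op> code) {
        // Each stack slot records whether it holds the 'this' reference.
        Deque<Boolean> stack = new ArrayDeque<>();
        for (int i = 0; i < code.size(); i++) {
            Op op = code.get(i);
            switch (op.kind) {
                case LOAD_THIS: // aload_0 in an instance method
                    stack.push(true);
                    break;
                case PUSH:      // new, iconst_*, and similar
                    stack.push(false);
                    break;
                case DUP:       // copies the top slot, marker included
                    stack.push(stack.peek());
                    break;
                case INVOKE_INIT:
                    for (int a = 0; a < op.argCount; a++) {
                        stack.pop(); // discard the arguments
                    }
                    boolean receiverIsThis = stack.pop();
                    if (receiverIsThis) {
                        return i;
                    }
                    break;
            }
        }
        return -1;
    }

    // Assumed instruction sequence for B() up to the super(...) call.
    static List<Op> exampleFromQuestion() {
        return Arrays.asList(
            new Op(Kind.LOAD_THIS, 0),   // aload_0
            new Op(Kind.PUSH, 0),        // new A
            new Op(Kind.DUP, 0),         // dup
            new Op(Kind.PUSH, 0),        // new Object
            new Op(Kind.DUP, 0),         // dup
            new Op(Kind.INVOKE_INIT, 0), // Object.<init>()
            new Op(Kind.PUSH, 0),        // new Integer
            new Op(Kind.DUP, 0),         // dup
            new Op(Kind.PUSH, 0),        // iconst_2
            new Op(Kind.INVOKE_INIT, 1), // Integer.<init>(int)
            new Op(Kind.INVOKE_INIT, 2), // A.<init>(Object, Object) on the new A
            new Op(Kind.PUSH, 0),        // new Integer
            new Op(Kind.DUP, 0),         // dup
            new Op(Kind.PUSH, 0),        // iconst_1
            new Op(Kind.INVOKE_INIT, 1), // Integer.<init>(int)
            new Op(Kind.INVOKE_INIT, 2)); // A.<init>(Object, Object) on this
    }

    public static void main(String[] args) {
        // Index in the list above of the <init> call that consumes 'this'.
        System.out.println(findInitOnThis(exampleFromQuestion()));
    }
}
```

Unlike counting, this approach identifies the receiver of each <init> call directly, so constructing other objects of the same class does not confuse it. Handling branches would require merging stack states at join points, which this sketch deliberately leaves out.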



Answer 4:

The super() invocation is the instruction at offset 31.

I found it easily with Eclipse's Class File Editor. Have a look at the screenshot attached below.

One thing to remember here: the prefix 'a' means that the opcode manipulates an object reference, and the prefix 'i' means that it manipulates an integer.

So, the line-by-line explanation is as follows:

12  new java.lang.Integer //Create a new java.lang.Integer
15  dup //Make an extra reference to the same Integer
16  iconst_2 //Push the int constant 2, the argument of Integer(2)
17  invokespecial java.lang.Integer(int) //Integer(2) is invoked
20  invokespecial A(java.lang.Object, java.lang.Object) //new A(new Object(), new Integer(2)) is invoked
23  new java.lang.Integer //Create a new java.lang.Integer
26  dup //Make an extra reference to the same Integer
27  iconst_1 //Push the int constant 1, the argument of Integer(1)
28  invokespecial java.lang.Integer(int) //Integer(1) is invoked
31  invokespecial A(java.lang.Object, java.lang.Object) //super(new A(new Object(), new Integer(2)), new Integer(1)) is invoked

I hope the rest is easy to comprehend. :)

56 - this invoke is the A(Object, Object) invocation that belongs to the System.out.println statement.



Tags: java bytecode