Which class hierarchy differences can exist compar

2020-04-21 02:12发布

问题:

I'm currently generating some ASM code in a maven post compilation task. In Java 6 a StackMapTable was introduced representing the data types on the stack, which is mandatory in later versions. So I'm automatically determining the most specific class which can be on the stack. Now I ran into the problem that in my VM ThaiBuddhistDate and HijrahDate inherit from ChronoLocalDateImpl, so it would create this type in the StackMapTable, which would obviously crash in other VMs (maybe even versions). So I thought, maybe I should change the calculation to the minimum mandatory, which can cause (theoretically) similar problems for classes and interfaces. Now I'm trying to find a solution for my problem, so I have to figure out, which differences might occur.

Can an additional class only occur everywhere in the inheritance hierarchy? Assuming the JavaDoc has an inheritance hierarchy like:

Object - Foo - Bar - FooBar

Can I have additional classes in the inheritance structure everywhere?

Object - Baz - Foo - Bar - FooBar

Object - Foo - Baz - Bar - FooBar

Object - Foo - Bar - Baz - FooBar

Analogously for interfaces: Can interfaces also inherit from other interfaces, which are not defined in the documentation or can a class "only" have additional independent interfaces or interfaces, which are based on the defined ones or not even any?

回答1:

It seems you are using the COMPUTE_FRAMES option which will cause the ASM library to merge the types encountered through the possible code paths via getCommonSuperClass similar to what the old verifier did and somewhat perverting the concept of stack map tables.

As you have already noted, ASM’s implementation of getCommonSuperClass may return an actually inaccessible type, like a JRE internal base class, and ignores interface relationships. The bigger problem is that you can’t fix this with a different implementation of this method, as the information passed to this method is not sufficient to determine the right type.

The right type is what will be subsequently needed, which, of course, should also be compatible with what is provided via all possible code paths, which is what the verifier will/ought to check. If your code generator is designed in a way that it produces valid code, specifying the subsequently required types should be sufficient to create a valid stack map table entry, but the incoming types passed to getCommonSuperClass are not sufficient to tell you what will be the required type.

To illustrate the issue, consider the following example class

class Example {
    public static CharSequence problematicMethod() {
        return Math.random()>0.5? new StringBuilder("x"): new StringBuffer("y");
    }
}

and the following code analyzing the compiled (e.g. by javac) class and what ASM would generate by default when being told to recalculate stack map frames from scratch:

static void printFrame(int nLocal, Object[] local, int nStack, Object[] stack) {
    StringBuilder sb = decode(new StringBuilder().append("Locals: "), local, nLocal);
    System.out.println(decode(sb.append(", Stack: "), stack, nStack));
}
private static StringBuilder decode(StringBuilder sb, Object[] array, int num) {
    if(num==0) return sb.append("[]");
    sb.append('[');
    for(int ix = 0; ix<num; ix++) {
        Object o = array[ix];
        if(o==Opcodes.UNINITIALIZED_THIS) sb.append("this <uninit>");
        else if(o==Opcodes.INTEGER) sb.append("int");
        else if(o==Opcodes.FLOAT) sb.append("float");
        else if(o==Opcodes.DOUBLE) sb.append("double");
        else if(o==Opcodes.LONG) sb.append("long");
        else if(o==Opcodes.NULL) sb.append("null");
        else if(o==Opcodes.TOP) sb.append("-");
        else sb.append(Type.getObjectType(o.toString()).getClassName());
        sb.append(",");
    }
    sb.setCharAt(sb.length()-1, ']');
    return sb;
}
public static void main(String[] args) throws IOException {
    final MethodVisitor printFramesMV = new MethodVisitor(Opcodes.ASM5) {
        @Override public void visitFrame(int type, int nLocal, Object[] local, int nStack, Object[] stack) {
            printFrame(nLocal, local, nStack, stack);
        }
    };
    final ClassVisitor printFrames = new ClassVisitor(Opcodes.ASM5) {
        @Override
        public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
            return name.equals("problematicMethod")? printFramesMV: null;
        }
    };
    ClassReader cr = new ClassReader(Example.class.getName());
    System.out.println("##original");
    cr.accept(printFrames, ClassReader.EXPAND_FRAMES);
    ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);
    cr.accept(cw, ClassReader.SKIP_FRAMES);
    System.out.println("##from ASM");
    new ClassReader(cw.toByteArray()).accept(printFrames, ClassReader.EXPAND_FRAMES);
}

This will print

##original
Locals: [], Stack: []
Locals: [], Stack: [java.lang.CharSequence]
##from ASM
Locals: [], Stack: []
Locals: [], Stack: [java.lang.AbstractStringBuilder]

This shows the same problem you have explained in your question, ASM will generate a frame referring to an implementation specific class. The code generated by javac refers to the required type, which is compatible with the method’s return type. You could study StringBuilder and StringBuffer in getCommonSuperClass and find out that both implement CharSequence, but this is not sufficient to understand that CharSequence is the right type here, as we could simply change the example to

class Example {
    public static Appendable problematicMethod() {
        return Math.random()>0.5? new StringBuilder("x"): new StringBuffer("y");
    }
}

and get

##original
Locals: [], Stack: []
Locals: [], Stack: [java.lang.Appendable]
##from ASM
Locals: [], Stack: []
Locals: [], Stack: [java.lang.AbstractStringBuilder]

Since the incoming classes implement both interfaces, you can’t find out whether CharSequence or Appendable is the right merge type, just by looking at the incoming StringBuilder and StringBuffer types.

To evaluate this issue further, look at

class Example {
    public static Comparable problematicMethod() {
        return Math.random()>0.5? BigInteger.valueOf(123): Double.valueOf(1.23);
    }
}

which produces

##original
Locals: [], Stack: []
Locals: [], Stack: [java.lang.Comparable]
##from ASM
Locals: [], Stack: []
Locals: [], Stack: [java.lang.Number]

Here, ASM’s result is a public type, but this common base class doesn’t implement the required Comparable, so this code is actually broken.


It’s a great luck for all code generators using ASM’s COMPUTE_FRAMES option, that HotSpot’s verifier has a great tolerance towards interface types or, in other words, that it doesn’t verify the correctness of assignments at all (this includes the receivers of method invocations) when at least one of the two types is an interface.

If you want to generate code that survives verifiers strictly doing their job even for interfaces, you shouldn’t use that option and start to generate stack map frames yourself, by not using the COMPUTE_FRAMES option and emitting the right visitFrame calls (or inserting the appropriate nodes if you’re using the tree API).

There seem to be a widespread fear of doing that, but it’s not that complicated. As said before, it basically implies stating what your code generator already knows. It’s actually not about trying to find a common type, it’s about specifying what you will use afterwards and if your code generator is correct, that’s already it, but if not, ASM’s calculation can’t fix the code either.

To stay at your specific example, when dealing with ThaiBuddhistDate and HijrahDate you already know that you are processing them as ChronoLocalDate after the branch merge point (I suppose), whereas ASM ends up at an implementation specific non-public type, but if that type didn’t exist, ASM just used java.lang.Object as it doesn’t consider interfaces. If ASM considered interfaces, it had to decide between ChronoLocalDate and Serializable, neither being more specific than the other. It’s simply not solvable with this design.

To illustrate further, how different the outcome between “merge the incoming types” and “what will be used” can be, look at

class Example {
    public static void problematicMethod() {
        if(Math.random()>0.5) {
            java.awt.ScrollPane b = new java.awt.ScrollPane();
        }
        else {
            javax.swing.JTabbedPane t = new javax.swing.JTabbedPane();
        }
    }
}
##original
Locals: [], Stack: []
Locals: [], Stack: []
##from ASM
Locals: [], Stack: []
Locals: [java.awt.Container], Stack: []

Here, ASM wasted resources to find out the common base class in a deep class hierarchy tree, whereas just stating “drop the variable” would be sufficient…