Dynamic Java Bytecode Manipulation Framework Compa

2019-01-31 01:52发布

问题:

There are some frameworks out there for dynamic bytecode generation, manipulation and weaving (BCEL, CGLIB, javassist, ASM, MPS). I want to learn about them, but since I don't have much time to know all the details about all of them, I would like to see a sort of comparison chart saying the advantages and disadvantages of one versus the others and an explanation of why.

Here in SO, I found a lot of questions asking something similar, and the answers normally said "you can use cglib or ASM", or "javassist is better than cglib", or "BCEL is old and is dying" or "ASM is the best because it gives X and Y". These answers are useful, but does not fully answer the question in the scope that I want, comparing them more deeply and giving the advantages and disadvantages of each one.

回答1:

If your interest in bytecode generation is only to use it, the comparison chart becomes rather simple :

Do you need to understand bytecode?

for javassist : no

for all others : yes

Of course, even with javassist you may be confronted with bytecode concepts at some point. Likewise, some of the other libraries (such as ASM) have a more high-level api and/or tool support to shield you from many of the bytecode details.

What really distinguishes javassist though, is the inclusion of a basic java compiler. This makes it very easy to write complex class transformations : you only have to put a java fragment in a String and use the library to insert it at specific points in the program. The included compiler will build the equivalent bytecode, which is then inserted into the existing classes.



回答2:

First of all it all depends on your task. Do you want to generate the new code or analyze existing bytecode and how complex analysis you may need. Also how much time you want to invest into learning Java bytecode. You can break down bytecode framework into ones that provide a high level API, that allows you to get away from learning low level opcodes and JVM internals (e.g, javaassist and CGLIB) and low level frameworks when you need to understand JVM or use some bytecode generation tools (ASM and BCEL). For analyzis BCEL historically evolved a bit more, but ASM provides a decent functionality that is easy to extend. Also note, ASM is probably the only framework that provides the most advanced support for STACK_MAP information required by the new bytecode verifier enabled by default in Java 7.



回答3:

Analysis of bytecode libraries

As I can tell from the answers you got here and ones in the questions that you have looked at, these answers do not formally address the question in the explicit manner you have stated. You asked for a comparison, meanwhile these answers have vaguely stated what one might want based on what your target is (e.g. Do you need to know bytecode? [y/n]), or are too narrow.

This answer is a short analysis of each bytecode framework, and provides a quick comparison at the end.

Javassist

  • Tiny (javassist.jar (3.21.0) is ~707KB / javassist-rel_3_22_0_cr1.zip is ~1.5MB)
  • High(/Low)-level
  • Straightforward
  • Feature-complete
  • Requires minimal to no class file format knowledge
  • Requires moderate Java instruction set knowledge
  • Minimal learning effort
  • Has some quirks in the single-line/multi-line compile-and-insert-bytecode methods

I personally prefer Javassist simply because of how quickly you can get to using it and building and manipulating classes with it. The tutorial is straightforward and easy to follow. The jar file is a tiny 707KB, so it is nice and portable; makes it suitable for standalone applications.


ASM

  • Large (asm-6.0_ALPHA-bin.zip is ~2.9MB / asm-svn-latest.tar.gz (10/15/2016) is ~41MB)
  • Low(/High)-level
  • Comprehensive
  • Feature-complete
  • Recommend a proficient knowledge of class file format
  • Requires proficiency with Java instruction set
  • Moderate learning effort (somewhat complex)

ASM by ObjectWeb is a very comprehensive library which lacks nothing related to building, generating, and loading classes. In fact, it even has class analysis tools with predefined analyzers. It is said to be the industry standard for bytecode manipulation. It is also the reason why I steer clear away from it.

When I see examples of ASM, it seems like a cumbersome beast of a task with the number of lines it takes to modify or load a class. Even some of the parameters to some methods seem a bit cryptic and out of place for Java. With things like ACC_PUBLIC, and plenty of method calls with null everywhere, it honestly does look like it is better suited for a low-level language like C. Why not simply just pass a String literal like "public", or an enum Modifier.PUBLIC? It's more friendly and easy to use. That is my opinion, however.

For reference, here is an ASM (4.0) tutorial: https://www.javacodegeeks.com/2012/02/manipulating-java-class-files-with-asm.html


BCEL

  • Small (bcel-6.0-bin.zip is 7.3MB / bcel-6.0-src.zip is 1.4MB)
  • Low-level
  • Adequate
  • Gets the job done
  • Requires proficiency with Java instruction set
  • Easy to learn

From what I have seen, this library is your basic class library that lets you do everything you need to—if you can spare a few months or years.

Here is a BCEL tutorial that really spells it out: http://www.geekyarticles.com/2011/08/manipulating-java-class-files-with-bcel.html?m=1


cglib

  • Very tiny (cglib-3.2.5.jar is 295KB/source code)
  • Depends on ASM
  • High-level
  • Feature-complete (Bytecode Generation)
  • Little or no Java bytecode knowledge needed
  • Easy to learn
  • Esoteric Library

Despite the fact that you can read information from classes, and that you can transform classes, the library seems tailored to proxies. The tutorial is all about beans for the proxies, and it even mentions it is used by "data access frameworks to generate dynamic proxy objects and intercept field access." Still, I see no reason why you can't use it for the more simple purpose of bytecode manipulation instead of proxies.


ByteBuddy

  • Small bin/"Huge" src (by comparison) (byte-buddy-dep-1.8.12.jar is ~2.72 MB / 1.8.12 (zip) is 124.537 MB (exact))
  • Depends on ASM
  • High-level
  • Feature-complete
  • Personally, a peculiar name for a Service Pattern class (ByteBuddy.class)
  • Little or no Java byte code knowledge needed
  • Easy to learn

Long story short, where BCEL is lacking, ByteBuddy is abundant. It uses a primary class called ByteBuddy using the Service Design Pattern. You create a new instance of ByteBuddy, and this represents a class that you want to modify. When you are done with your modifications, you can then make a DynamicType with make().

On their website is a full tutorial with API documentation. The purpose seems to be for rather high-level modifications. When it comes to methods, there does not appear to be anything in the official tutorial, or any 3rd party tutorial, about creating a method from scratch, apart from delegating a method (EDITME if you know where this is explained).

Their tutorial can be found here on their website. Some examples can be found here.

Warning: opinion ahead

This is an excellent library. The only nitpick I have is creating instances of a class called ByteBuddy to modify or create classes. Personally, I would redesign this to have a class called ByteBuddy as a static utility class that can be instantiated, and have a class called ClassData or EmptyClass that you give to a ByteBuddy instance for modification so that you can separate the functionality of ByteBuddy from the data it modifies or creates. I did this with jCLA using ClassBuilder as a means to modify ClassDefinitions (immutable data-only class) except that ClassBuilder copies all the data from a ClassDefinition to modify it, and the build() method creates a new, immutable ClassDefinition when you are finished with your modifications, and to update the old class, you must explicitly invoke ClassPool#update(String) where String is the fully qualified class name (my.package.MyClass).

Therefore, instead of:

DynamicType.Unloaded<?> dynamicType = new ByteBuddy()
  .subclass(Object.class)
  .name("example.Type")
  .make();

You would have:

ByteBuddy service = new ByteBuddy();
DynamicType.Unloaded<?> dynamicType = service.editing(ByteBuddy.emptyClass())
  .subclass(Object.class)
  .name("example.Type")
  .getCurrentClass();

but that's just me.


Java Class Assistant (jCLA)

I have my own bytecode library that I am building, which will be called Java Class Assistant, or jCLA for short, because of another project I am working on and because of said quirks with Javassist, but I will not be releasing it to GitHub until it is finished but the project is currently available to browse on GitHub and give feedback on as it is currently in alpha, but still workable enough to be a basic class library (currently working on the compilers; please help me if you can! It will be released a lot sooner!).

It will be quite bare bones with the ability to read and write class files to and from a JAR file, as well as the ability to compile and decompile bytecode to and from source code and class files.

The overall usage pattern makes it rather easy to work with jCLA, though it may take some getting used to and is apparently quite similar to ByteBuddy in its style of methods and method parameters for class modifications:

import jcla.ClassPool;
import jcla.ClassBuilder;
import jcla.ClassDefinition;
import jcla.MethodBuilder;
import jcla.FieldBuilder;

import jcla.jar.JavaArchive;

import jcla.classfile.ClassFile;

import jcla.io.ClassFileOutputStream;

public class JCLADemo {

    public static void main(String... args) {
        // get the class pool for this JVM instance
        ClassPool classes = ClassPool.getLocal();
        // get a class that is loaded in the JVM
        ClassDefinition classDefinition = classes.get("my.package.MyNumberPrinter");
        // create a class builder to modify the class
        ClassBuilder clMyNumberPrinter= new ClassBuilder(classDefinition);

        // create a new method with name printNumber
        MethodBuilder printNumber = new MethodBuilder("printNumber");
        // add access modifiers (use modifiers() for convenience)
        printNumber.modifier(Modifier.PUBLIC);
        // set return type (void)
        printNumber.returns("void");
        // add a parameter (use parameters() for convenience)
        printNumber.parameter("int", "number");
        // set the body of the method (compiled to bytecode)
        // use body(byte[]) or insert(byte[]) for bytecode
        // insert(String) also compiles to bytecode
        printNumber.body("System.out.println(\"the number is: \" + number\");");
        // add the method to the class
        // you can use method(MethodDefinition) or method(MethodBuilder)
        clMyNumberPrinter.method(printNumber.build());

        // add a field to the class
        FieldBuilder HELLO = new FieldBuilder("HELLO");
        // set the modifiers for hello; convenience method example
        HELLO.modifiers(Modifier.PRIVATE, Modifier.STATIC, Modifier.FINAL);
        // set the type of this field
        HELLO.type("java.lang.String");
        // set the actual value of this field
        // this overloaded method expects a VariableInitializer production
        HELLO.value("\"Hello from \" + getClass().getSimpleName() + \"!\"");

        // add the field to the class (same overloads as clMyNumberPrinter.method())
        clMyNumberPrinter.field(HELLO.build());

        // redefine
        classDefinition = clMyNumberPrinter.build();
        // update the class definition in the JVM's ClassPool
        // (this updates the actual JVM's loaded class)
        classes.update(classDefinition);

        // write to disk
        JavaArchive archive = new JavaArchive("myjar.jar");
        ClassFile classFile = new ClassFile(classDefinition);
        ClassFileOutputStream stream = new ClassFileOutputStream(archive);

        try {
            stream.write(classFile);
        } catch(IOException e) {
            // print to System.out
        } finally {
            stream.close();
        }
    }

}

(VariableInitializer production specification for your convenience.)

As may be implied from the above snippet, each ClassDefinition is immutable. This makes jCLA more secure, thread-safe, network-safe, and easy to use. The system revolves primarily around ClassDefinitions as the object of choice for querying information about a class in a high-level manner, and the system is built in such a way that ClassDefinition is converted to and from target types such as ClassBuilder and ClassFile.

jCLA uses a tiered system for class data. At the bottom, you have the immutable ClassFile: a struct or software representation of a class file. Then you have immutable ClassDefinitions which are converted from ClassFiles into something less cryptic and more manageable and useful to the programmer who is modifying or reading data from the class, and is comparable to information accessed through java.lang.Class. Finally, you have mutable ClassBuilders. The ClassBuilder is how classes are modified or created. It allows that you can create a ClassDefinition directly from the builder from its current state. Creating a new builder for each class is not necessary as the reset() method will clear the variables.

(Analysis of this library will be available as soon as it is ready for release.)

But until then, as of today:

  • Small (src: 227.704 KB exact, 6/2/2018)
  • Self-sufficient (no dependencies except Java's shipped library)
  • High-level
  • No required knowledge of java bytecode or class files (for tier 1 API, e.g. ClassBuilder, ClassDefinition, etc.)
  • Easy to learn (even easier if coming from ByteBuddy)

I still recommend learning about java bytecode however. It will make debugging easier.


Comparison

Considering all of these analyses (excluding jCLA for now), the broadest framework is ASM, the easiest to use is Javassist, the simplest implementation is BCEL, and the most high-level for bytecode generation and proxies is cglib.

Note: if I missed a Library, please edit this answer or mention it in a comment.