How to correct a messy Java .class file set or gene

Posted 2019-09-03 08:36

Question:

Background

I have to work with different kinds of Java projects that use various build systems. Sometimes the directory structure differs from the package hierarchy, which makes packaging difficult.

Build systems such as Maven and Gradle have their own functions for packing a .jar, but they need a good internet connection and a very large local repository. Hence I usually build the libraries I need on the desktop in my office. However, I spend more time on my laptop, which has neither a good connection nor much storage.

Question

Is there a stable, compatible and secure way to make a .jar package? Or to move those .class files to the appropriate directories? (An official, mature open-source tool would be best.)

I am a beginner in the Java language, so any pointer may be helpful to me.

To make it clear, here are some examples.

Case 1: If there is a tool that can move (or copy) the .class files into the correct directories, it should behave as below:

Before operation:

from_root/
├── a.class
├── b.class
├── c.class
├── d.class
├── e.class
└── f.class

Note that the source file (.java) of each .class file contains a package declaration (like package org.hello.world; or package org.hello;).

After typing the_tool ./from_root ./to_root in the shell,

it will become:

to_root/
└── org
    └── hello
        ├── a.class
        ├── b.class
        └── world
            ├── c.class
            ├── d.class
            ├── e.class
            └── f.class

Case 2: If there is a tool that can pack a correct .jar file from a messy .class file directory, it should behave as below:

Before operation:

from_root/
├── a.class
├── b.class
├── c.class
├── d.class
├── e.class
└── f.class

After typing pack_tool ./from_root ./to_root/pack.jar in the shell,

a pack.jar will be generated in ./to_root.

When it is added to the Java Build Path in Eclipse, it should be imported and called properly, instead of showing a wrong namespace hierarchy.

What I have tried

For some well-known third-party libraries (e.g. Apache Tika):

It is built with Maven, so I typed cd /path/to/tika-1.18-src/tika-1.18/ and then mvn package.

Unfortunately, it failed.

However, when I typed mvn compile, everything went right.

Then, in order to build the .jar package, I typed jar cvf ./build/tika-1.18.jar $(find ./ -name org | grep target) in the terminal.

However, the structure inside the .jar package was totally wrong. I couldn't use it in my project.

Then I tried find ./ -name org | grep target | parallel cp {} ./build/ -R -f, then cd ./build, and finally jar cvf ./tika-1.18.jar org.

It worked. However, this method has some disadvantages.

Shortcomings:

  1. If the name of a source file (.java file) includes 'target', it will cause big trouble.

  2. If different subprojects contain the same file, it causes a conflict. For example, here is the conflict output:
     cp: cannot create directory './build/org/apache/tika/batch': File exists
     cp: cannot create directory './build/org/apache/tika/language/translate': File exists

  3. This method can only handle the situation in which the paths of the .class files are partly correct. If all the .class files are located flat in the same directory, it fails.
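
Shortcomings 1 and 2 could be avoided by selecting directories by their exact trailing path instead of grepping for 'target', and by merging them file by file. The following is only a minimal sketch, assuming the Maven default layout in which each subproject writes its classes to target/classes. The names ClassDirMerger and merge are hypothetical and not part of any existing tool. Shortcoming 3 (a completely flat directory) is not addressed here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not from the original post.
final class ClassDirMerger {
    /**
     * Copies the contents of every directory ending in target/classes
     * (assumed Maven layout) below projectRoot into dest.
     * Returns the relative paths that occurred in more than one subproject.
     */
    static List<Path> merge(Path projectRoot, Path dest) throws IOException {
        List<Path> classRoots;
        try (Stream<Path> walk = Files.walk(projectRoot)) {
            // Match the last two path elements exactly, not any path containing "target".
            classRoots = walk.filter(Files::isDirectory)
                    .filter(p -> p.endsWith("target/classes"))
                    .collect(Collectors.toList());
        }
        List<Path> conflicts = new ArrayList<>();
        for (Path root : classRoots) {
            List<Path> files;
            try (Stream<Path> walk = Files.walk(root)) {
                files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            for (Path file : files) {
                Path relative = root.relativize(file);
                Path target = dest.resolve(relative);
                if (Files.exists(target)) {
                    conflicts.add(relative); // keep the first copy, report the clash
                    continue;
                }
                // Existing directories are reused, so shared packages merge cleanly.
                Files.createDirectories(target.getParent());
                Files.copy(file, target);
            }
        }
        return conflicts;
    }
}
```

Calling ClassDirMerger.merge(Paths.get("."), Paths.get("build")) from the project root would fill ./build with one merged package tree. Only files that appear twice are reported, so two subprojects sharing a package such as org/apache/tika/batch is no longer an error.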

Another attempt

I tried to put the .class files in the correct directories according to their binary content.

For example, org.apache.tika.detect.EncodingDetector:

$ hexdump -C /path/to/here/EncodingDetector.class
00000000  ca fe ba be 00 00 00 33  00 0e 07 00 0a 07 00 0b  |.......3........|
00000010  07 00 0c 01 00 06 64 65  74 65 63 74 01 00 54 28  |......detect..T(|
00000020  4c 6a 61 76 61 2f 69 6f  2f 49 6e 70 75 74 53 74  |Ljava/io/InputSt|
00000030  72 65 61 6d 3b 4c 6f 72  67 2f 61 70 61 63 68 65  |ream;Lorg/apache|
00000040  2f 74 69 6b 61 2f 6d 65  74 61 64 61 74 61 2f 4d  |/tika/metadata/M|
00000050  65 74 61 64 61 74 61 3b  29 4c 6a 61 76 61 2f 6e  |etadata;)Ljava/n|
00000060  69 6f 2f 63 68 61 72 73  65 74 2f 43 68 61 72 73  |io/charset/Chars|
00000070  65 74 3b 01 00 0a 45 78  63 65 70 74 69 6f 6e 73  |et;...Exceptions|
00000080  07 00 0d 01 00 0a 53 6f  75 72 63 65 46 69 6c 65  |......SourceFile|
00000090  01 00 15 45 6e 63 6f 64  69 6e 67 44 65 74 65 63  |...EncodingDetec|
000000a0  74 6f 72 2e 6a 61 76 61  01 00 27 6f 72 67 2f 61  |tor.java..'org/a|
000000b0  70 61 63 68 65 2f 74 69  6b 61 2f 64 65 74 65 63  |pache/tika/detec|
000000c0  74 2f 45 6e 63 6f 64 69  6e 67 44 65 74 65 63 74  |t/EncodingDetect|
000000d0  6f 72 01 00 10 6a 61 76  61 2f 6c 61 6e 67 2f 4f  |or...java/lang/O|
000000e0  62 6a 65 63 74 01 00 14  6a 61 76 61 2f 69 6f 2f  |bject...java/io/|
000000f0  53 65 72 69 61 6c 69 7a  61 62 6c 65 01 00 13 6a  |Serializable...j|
00000100  61 76 61 2f 69 6f 2f 49  4f 45 78 63 65 70 74 69  |ava/io/IOExcepti|
00000110  6f 6e 06 01 00 01 00 02  00 01 00 03 00 00 00 01  |on..............|
00000120  04 01 00 04 00 05 00 01  00 06 00 00 00 04 00 01  |................|
00000130  00 07 00 01 00 08 00 00  00 02 00 09              |............|
0000013c

For comparison, I used another .class file, org.apache.tika.embedder.Embedder:

$ hexdump -C ./Embedder.class 
00000000  ca fe ba be 00 00 00 33  00 14 07 00 0f 07 00 10  |.......3........|
00000010  07 00 11 01 00 16 67 65  74 53 75 70 70 6f 72 74  |......getSupport|
00000020  65 64 45 6d 62 65 64 54  79 70 65 73 01 00 36 28  |edEmbedTypes..6(|
00000030  4c 6f 72 67 2f 61 70 61  63 68 65 2f 74 69 6b 61  |Lorg/apache/tika|
00000040  2f 70 61 72 73 65 72 2f  50 61 72 73 65 43 6f 6e  |/parser/ParseCon|
00000050  74 65 78 74 3b 29 4c 6a  61 76 61 2f 75 74 69 6c  |text;)Ljava/util|
00000060  2f 53 65 74 3b 01 00 09  53 69 67 6e 61 74 75 72  |/Set;...Signatur|
00000070  65 01 00 58 28 4c 6f 72  67 2f 61 70 61 63 68 65  |e..X(Lorg/apache|
00000080  2f 74 69 6b 61 2f 70 61  72 73 65 72 2f 50 61 72  |/tika/parser/Par|
00000090  73 65 43 6f 6e 74 65 78  74 3b 29 4c 6a 61 76 61  |seContext;)Ljava|
000000a0  2f 75 74 69 6c 2f 53 65  74 3c 4c 6f 72 67 2f 61  |/util/Set<Lorg/a|
000000b0  70 61 63 68 65 2f 74 69  6b 61 2f 6d 69 6d 65 2f  |pache/tika/mime/|
000000c0  4d 65 64 69 61 54 79 70  65 3b 3e 3b 01 00 05 65  |MediaType;>;...e|
000000d0  6d 62 65 64 01 00 76 28  4c 6f 72 67 2f 61 70 61  |mbed..v(Lorg/apa|
000000e0  63 68 65 2f 74 69 6b 61  2f 6d 65 74 61 64 61 74  |che/tika/metadat|
000000f0  61 2f 4d 65 74 61 64 61  74 61 3b 4c 6a 61 76 61  |a/Metadata;Ljava|
00000100  2f 69 6f 2f 49 6e 70 75  74 53 74 72 65 61 6d 3b  |/io/InputStream;|
00000110  4c 6a 61 76 61 2f 69 6f  2f 4f 75 74 70 75 74 53  |Ljava/io/OutputS|
00000120  74 72 65 61 6d 3b 4c 6f  72 67 2f 61 70 61 63 68  |tream;Lorg/apach|
00000130  65 2f 74 69 6b 61 2f 70  61 72 73 65 72 2f 50 61  |e/tika/parser/Pa|
00000140  72 73 65 43 6f 6e 74 65  78 74 3b 29 56 01 00 0a  |rseContext;)V...|
00000150  45 78 63 65 70 74 69 6f  6e 73 07 00 12 07 00 13  |Exceptions......|
00000160  01 00 0a 53 6f 75 72 63  65 46 69 6c 65 01 00 0d  |...SourceFile...|
00000170  45 6d 62 65 64 64 65 72  2e 6a 61 76 61 01 00 21  |Embedder.java..!|
00000180  6f 72 67 2f 61 70 61 63  68 65 2f 74 69 6b 61 2f  |org/apache/tika/|
00000190  65 6d 62 65 64 64 65 72  2f 45 6d 62 65 64 64 65  |embedder/Embedde|
000001a0  72 01 00 10 6a 61 76 61  2f 6c 61 6e 67 2f 4f 62  |r...java/lang/Ob|
000001b0  6a 65 63 74 01 00 14 6a  61 76 61 2f 69 6f 2f 53  |ject...java/io/S|
000001c0  65 72 69 61 6c 69 7a 61  62 6c 65 01 00 13 6a 61  |erializable...ja|
000001d0  76 61 2f 69 6f 2f 49 4f  45 78 63 65 70 74 69 6f  |va/io/IOExceptio|
000001e0  6e 01 00 27 6f 72 67 2f  61 70 61 63 68 65 2f 74  |n..'org/apache/t|
000001f0  69 6b 61 2f 65 78 63 65  70 74 69 6f 6e 2f 54 69  |ika/exception/Ti|
00000200  6b 61 45 78 63 65 70 74  69 6f 6e 06 01 00 01 00  |kaException.....|
00000210  02 00 01 00 03 00 00 00  02 04 01 00 04 00 05 00  |................|
00000220  01 00 06 00 00 00 02 00  07 04 01 00 08 00 09 00  |................|
00000230  01 00 0a 00 00 00 06 00  02 00 0b 00 0c 00 01 00  |................|
00000240  0d 00 00 00 02 00 0e                              |.......|
00000247

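What's striking is that the string shortly after "...SourceFile..." (right after the source file name) is the fully qualified name of the class, with '/' as the separator, so it reveals the package location of the class.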

It is possible to write a program that scans every .class file and determines its location in the directory tree. However, Java 9, Java 10 and Java 11 are coming soon, and different Java versions may produce different binary content in .class files. It might also differ between JDK and OpenJDK. So this approach may not be compatible enough. On the other hand, it shows that it is possible to determine the package location of a given .class file without any other information.
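
On the compatibility concern: the layout of a .class file is defined by the Java Virtual Machine Specification rather than by a particular JDK vendor, and every file records its own format version right after the magic number. A minimal sketch of reading that header follows. The class name ClassFileHeader is hypothetical, and the sample bytes are the first eight bytes of the hexdumps above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not from the original post.
final class ClassFileHeader {
    /** Returns the major class file version, e.g. 51 for Java 7 or 52 for Java 8. */
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in); // reads big-endian values
        if (data.readInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        data.readUnsignedShort();        // minor_version, skipped
        return data.readUnsignedShort(); // major_version
    }

    public static void main(String[] args) throws IOException {
        // ca fe ba be 00 00 00 33, as in the dumps above
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x33};
        System.out.println(majorVersion(new ByteArrayInputStream(header))); // prints 51
    }
}
```

Newer Java versions have added new constant pool entry types, so a parser has to know their sizes to skip them, but the header and the position of the class name reference have stayed the same.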

I hope someone can provide some ideas. Thank you sincerely!

Answer 1:

Generally, it is better to fix the build system issues so that the correct directory structure is generated in the first place, rather than trying to fix it after the fact. One problem I see is that classes from different packages may have the same simple name, so if their class files are written to the same flat directory, one of them will overwrite the other, and this data loss cannot be fixed afterwards.

Generally, the constant pool at the beginning of the class file contains the qualified class name, so it is possible to extract it, but you need to understand the class file structure to pick the right string. The following method will parse a class file and extract the name (in its internal form):

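// Requires: import java.nio.ByteBuffer; import java.nio.ByteOrder;
// Returns the internal name (e.g. org/apache/tika/detect/EncodingDetector) of the class in buf.
static String getClassName(ByteBuffer buf) {
    if(buf.order(ByteOrder.BIG_ENDIAN).getInt()!=0xCAFEBABE) {
        throw new IllegalArgumentException("not a valid class file");
    }
    // getChar() reads an unsigned 16-bit value
    int minor=buf.getChar(), ver=buf.getChar(), poolSize=buf.getChar();
    // pool[ix] holds the buffer position of a Utf8 entry, or the index referenced by a Class-like entry
    int[] pool = new int[poolSize];
    //System.out.println("version "+ver+'.'+minor);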
    for(int ix=1; ix<poolSize; ix++) {
        String s; int index1=-1, index2=-1;
        byte tag = buf.get();
        switch(tag) {
            default: throw new UnsupportedOperationException(
                    "unknown pool item type "+buf.get(buf.position()-1));
            case CONSTANT_Utf8:
                buf.position((pool[ix]=buf.position())+buf.getChar()+2); continue;
            case CONSTANT_Module: case CONSTANT_Package: case CONSTANT_Class:
            case CONSTANT_String: case CONSTANT_MethodType:
                pool[ix]=buf.getChar(); break;
            case CONSTANT_FieldRef: case CONSTANT_MethodRef:
            case CONSTANT_InterfaceMethodRef: case CONSTANT_NameAndType:
            case CONSTANT_InvokeDynamic: case CONSTANT_Dynamic:
            case CONSTANT_Integer: case CONSTANT_Float:
                buf.position(buf.position()+4); break;
            case CONSTANT_Double: case CONSTANT_Long:
                buf.position(buf.position()+8); ix++; break;
            case CONSTANT_MethodHandle: buf.position(buf.position()+3); break;
        }
    }
    int access = buf.getChar(), thisClass = buf.getChar();
    // this_class -> CONSTANT_Class entry -> name index -> position of the Utf8 entry
    buf.position(pool[pool[thisClass]]);
    return decodeString(buf);
}
// Decodes the length-prefixed (modified UTF-8) string at the current buffer position.
private static String decodeString(ByteBuffer buf) {
    int size=buf.getChar(), oldLimit=buf.limit();
    buf.limit(buf.position()+size);
    StringBuilder sb=new StringBuilder(size+(size>>1));
    while(buf.hasRemaining()) {
        byte b=buf.get();
        if(b>0) sb.append((char)b);
        else {
            int b2 = buf.get();
            if((b&0xf0)!=0xe0)
                sb.append((char)((b&0x1F)<<6 | b2&0x3F));
            else {
                int b3 = buf.get();
                sb.append((char)((b&0x0F)<<12 | (b2&0x3F)<<6 | b3&0x3F));
            }
        }
    }
    buf.limit(oldLimit);
    return sb.toString();
}
private static final byte CONSTANT_Utf8 = 1, CONSTANT_Integer = 3,
    CONSTANT_Float = 4, CONSTANT_Long = 5, CONSTANT_Double = 6,
    CONSTANT_Class = 7, CONSTANT_String = 8, CONSTANT_FieldRef = 9,
    CONSTANT_MethodRef = 10, CONSTANT_InterfaceMethodRef = 11,
    CONSTANT_NameAndType = 12, CONSTANT_MethodHandle = 15,
    CONSTANT_MethodType = 16, CONSTANT_Dynamic = 17, CONSTANT_InvokeDynamic = 18,
    CONSTANT_Module = 19, CONSTANT_Package = 20;

This can be used to fix a wrong file location like this:

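// Requires: import java.io.IOException; import java.nio.ByteBuffer;
//           import java.nio.channels.FileChannel; import java.nio.file.*;
static void checkAndMoveClassFile(Path path) throws IOException {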
    ByteBuffer bb;
    try(FileChannel ch=FileChannel.open(path, StandardOpenOption.READ)) {
        bb=ByteBuffer.allocate((int)ch.size());
        while(bb.hasRemaining()) ch.read(bb);
        bb.flip();
    }
    String name = getClassName(bb);
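    // Resolved against the file's current directory, so that directory is treated as the package root.
    Path newPath = path.resolveSibling(name+".class");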
    if(!path.equals(newPath)) {
        System.out.println("moving "+path+" to "+newPath);
        Files.createDirectories(newPath.getParent());
        Files.move(path, newPath);
    }
}

which you can easily run over a directory:

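// Requires: import java.io.UncheckedIOException; import java.util.stream.Stream;
// The stream returned by Files.list holds an open directory and must be closed.
try(Stream<Path> files = Files.list(dirPath)) {
    files.filter(p -> p.getFileName().toString().endsWith(".class"))
         .forEach(p -> {
             try { checkAndMoveClassFile(p); }
             catch (IOException ex) { throw new UncheckedIOException(ex); }
         });
}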
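
Once the files sit in their package directories, the JDK's own jar tool can pack them with entry names relative to the class root, for example jar cf pack.jar -C to_root . (the -C option changes into the given directory before adding files). If the packing should happen in the same program, the following is a minimal sketch using java.util.jar.JarOutputStream. The class name JarPacker is hypothetical, no manifest is written, and the jar file is assumed to lie outside the class root.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not from the original answer.
final class JarPacker {
    /** Writes every regular file below classRoot into jarFile, named relative to classRoot. */
    static void pack(Path classRoot, Path jarFile) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(classRoot)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(jarFile);
             JarOutputStream jar = new JarOutputStream(out)) {
            for (Path file : files) {
                // Jar entry names use '/' regardless of the platform separator.
                String name = classRoot.relativize(file).toString()
                        .replace(File.separatorChar, '/');
                jar.putNextEntry(new JarEntry(name));
                Files.copy(file, jar);
                jar.closeEntry();
            }
        }
    }
}
```

With the layout from Case 1, JarPacker.pack(Paths.get("to_root"), Paths.get("pack.jar")) would produce entries such as org/hello/a.class, which is the structure a class loader expects.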