How to decompress an XZ file faster in Java?

Posted 2019-06-28 02:14

Question:

My SQLite database file is 85 MB; compressed in XZ format, it shrinks to 16 MB. I use the following code (and the JAR provided by XZ for Java) to decompress it on Android Jelly Bean:

try { 
    FileInputStream fin = new FileInputStream(path + "myFile.xz");
    BufferedInputStream in = new BufferedInputStream(fin);
    FileOutputStream out = new FileOutputStream(des + "myDecompressed");
    XZInputStream xzIn = new XZInputStream(in);
    final byte[] buffer = new byte[8192];
    int n = 0;
    while (-1 != (n = xzIn.read(buffer))) {
        out.write(buffer, 0, n);
    } 
    out.close();
    xzIn.close();
}
catch(Exception e) { 
    Log.e("Decompress", "unzip", e); 
}

The decompression succeeds, but it takes more than two minutes to complete. That seems very long, given that the compressed file is only 16 MB and the uncompressed file only 85 MB.

I wonder whether I have done something wrong in the code, or whether there is a way to speed up the decompression.

Answer 1:

I think there is little you can do to make this faster. If it takes 2 minutes to decompress 16 MB to 85 MB, chances are that most of that time is spent in the actual decompression, and a significant part of the rest in the file I/O itself, at the physical level.

Certainly, there is nothing obviously inefficient about your code. You are reading through a BufferedInputStream and decoding / writing with a large buffer, so the I/O syscalls are already being made efficiently. (Adding a BufferedOutputStream won't make any difference, because you are already doing large writes from an 8192-byte buffer.)


The best I can suggest is that you profile your code to see where the hotspots really are. But I suspect that you won't find anything that can be improved enough to make a difference.
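
Short of a full profiler run, a crude stopwatch test can already separate decoding time from file-writing time. The sketch below is only an illustration: `DecodeTimer` and `drain` are hypothetical names, not part of XZ for Java. It reads a stream to the end, discards the bytes, and reports how many bytes were read and how long it took.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of any library): measures how long it
// takes to read a stream to the end when nothing is written to disk.
final class DecodeTimer {

    /**
     * Reads the stream until EOF and discards the data.
     * Returns { totalBytesRead, elapsedNanoseconds }.
     */
    static long[] drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192]; // same chunk size as in the question
        long total = 0;
        long start = System.nanoTime();
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n; // count the bytes, but do not write them anywhere
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { total, elapsed };
    }
}
```

Passing the same `XZInputStream` that the question constructs to `DecodeTimer.drain(...)` gives the cost of reading plus decoding alone. If that figure is already close to the full two minutes, the output side is not the bottleneck.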


"I want to go for XZ because it has the best compression level in my case, which somewhat saves the downloading time... (with zip, the unzipping of this file takes only about 15 seconds!)"

Well, extra CPU time in decompression is the price that you pay for using a compression algorithm cranked up to the max. You need to decide which is more important to your users: faster downloads, or faster decompression (installation?) of the database.

FWIW, ZIP decompression is probably implemented in a native library, not in pure Java. It certainly is for Oracle / OpenJDK JVMs.



Answer 2:

The least you should do is wrap the FileOutputStream in a BufferedOutputStream; there are very few cases where you should not use BufferedInputStream/BufferedOutputStream. Try it and see how long it takes now.
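
A minimal sketch of that change, for anyone who wants to measure it. `BufferedCopy` and `copyToFile` are hypothetical names, and the 64 KB buffer size is an assumption chosen for illustration, not a tuned value. Whether it helps at all has to be measured on the device.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper (names are illustrative): copies any InputStream
// to a file through a BufferedOutputStream.
final class BufferedCopy {

    /** Copies the whole stream to destPath and returns the bytes written. */
    static long copyToFile(InputStream in, String destPath) throws IOException {
        // 64 KB is an assumed buffer size; small writes are collected in
        // memory and passed to the FileOutputStream in larger batches.
        OutputStream out =
                new BufferedOutputStream(new FileOutputStream(destPath), 64 * 1024);
        try {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            return total;
        } finally {
            out.close(); // flushes buffered bytes, then closes the file
        }
    }
}
```

In the question's code, the `xzIn` stream would be passed as `in` and `des + "myDecompressed"` as `destPath`. The `finally` block also closes the output file when decoding throws, which the original snippet does not do.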