How can I easily compress and decompress Strings t

Posted 2019-01-22 22:47

I have some strings that are roughly 10K characters each. There is plenty of repetition in them. They are serialized JSON objects. I'd like to easily compress them into a byte array, and uncompress them from a byte array.

How can I most easily do this? I'm looking for methods so I can do the following:

String original = "....long string here with 10K characters...";
byte[] compressed = StringCompressor.compress(original);
String decompressed = StringCompressor.decompress(compressed);
assert original.equals(decompressed);

3 answers
\"骚年 ilove
Reply #2 · 2019-01-22 23:04

I made a library for compressing generic Strings (especially short ones). It tries several encodings (plain UTF-8, 5-bit encoding for Latin letters, Huffman encoding, gzip for long Strings) and picks the one with the shortest result. In the worst case it falls back to plain UTF-8, so you never risk losing space.

I hope it is useful. Here's the link: https://github.com/lithedream/lithestring

EDIT: I realized that your Strings are always "long". My library defaults to gzip at those sizes, so I fear it cannot do better for you.
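
The "try several encodings and keep the shortest" idea described above can be sketched with just two candidates, plain UTF-8 and deflate. This is not lithestring's API or implementation: the class name `ShortestEncoding`, the one-byte tag, and the choice of candidates are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, NOT part of lithestring: keeps whichever encoding is shorter.
final class ShortestEncoding {
    private static final byte RAW = 0;      // payload is plain UTF-8
    private static final byte DEFLATED = 1; // payload is a zlib/deflate stream

    private ShortestEncoding() {}

    public static byte[] encode(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        byte[] deflated = deflate(raw);
        // Fall back to raw UTF-8 when compression does not help (typical for short inputs).
        boolean useDeflate = deflated.length < raw.length;
        byte[] payload = useDeflate ? deflated : raw;
        byte[] result = new byte[payload.length + 1];
        result[0] = useDeflate ? DEFLATED : RAW; // first byte records which encoding was chosen
        System.arraycopy(payload, 0, result, 1, payload.length);
        return result;
    }

    public static String decode(byte[] data) {
        byte[] payload = Arrays.copyOfRange(data, 1, data.length);
        byte[] raw = (data[0] == DEFLATED) ? inflate(payload) : payload;
        return new String(raw, StandardCharsets.UTF_8);
    }

    private static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    private static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and not finished means the stream is incomplete.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt input", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

With this layout the output is never more than one byte longer than the plain UTF-8 form, which is the "never lose space" property in miniature. For 10K-character repetitive JSON the deflate branch would be expected to win.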

混吃等死
Reply #3 · 2019-01-22 23:08

Peter Lawrey's answer can be improved a bit by using this simpler code for the decompress function:

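    // Needs java.io.*, java.nio.charset.StandardCharsets and java.util.zip.InflaterOutputStream.
    public static String decompress(byte[] bytes) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // InflaterOutputStream decompresses whatever is written to it;
        // closing it (via try-with-resources) flushes the remaining output.
        try (OutputStream out = new InflaterOutputStream(baos)) {
            out.write(bytes);
        } catch (IOException e) {
            // Corrupt input surfaces as a ZipException, so this can really happen.
            throw new UncheckedIOException(e);
        }
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }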
Lonely孤独者°
Reply #4 · 2019-01-22 23:10

You can try:

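import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Enum with no constants: a utility class that cannot be instantiated.
enum StringCompressor {
    ;
    public static byte[] compress(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // Closing the stream finishes the deflate output before the bytes are read back.
        try (OutputStream out = new DeflaterOutputStream(baos)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Writing to an in-memory stream should never fail.
            throw new AssertionError(e);
        }
        return baos.toByteArray();
    }

    public static String decompress(byte[] bytes) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buffer = new byte[8192];
            int len;
            // read() returns -1 at end of stream.
            while ((len = in.read(buffer)) != -1) {
                baos.write(buffer, 0, len);
            }
        } catch (IOException e) {
            // Corrupt input surfaces as a ZipException, so this can really happen.
            throw new UncheckedIOException(e);
        }
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }
}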
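
To check the round trip from the question against this approach, a standalone sketch like the following can be used. It carries condensed copies of the two methods so that it compiles on its own. The class name `RoundTripCheck` and the sample JSON are invented for illustration, and the compression ratio you see will depend on your real data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical demo class; the sample data is made up.
final class RoundTripCheck {
    static byte[] compress(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (OutputStream out = new DeflaterOutputStream(baos)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    static String decompress(byte[] bytes) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buffer = new byte[8192];
            int len;
            while ((len = in.read(buffer)) != -1) {
                baos.write(buffer, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }

    // Builds a repetitive JSON array of at least minChars characters.
    static String sampleJson(int minChars) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; sb.length() < minChars; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"id\":").append(i)
              .append(",\"status\":\"active\",\"tags\":[\"a\",\"b\"]}");
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        String original = sampleJson(10_000);
        byte[] compressed = compress(original);
        String decompressed = decompress(compressed);
        System.out.println("chars: " + original.length()
                + ", compressed bytes: " + compressed.length);
        // Explicit check instead of `assert`, which is disabled unless the JVM runs with -ea.
        if (!original.equals(decompressed)) {
            throw new IllegalStateException("round trip failed");
        }
    }
}
```

The explicit `if` is used because a plain `assert` statement does nothing unless assertions are enabled with `-ea`.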