Compressing strings for client/server transport in

Posted 2019-02-11 06:37

I work with a proprietary client/server message format that restricts what I can send over the wire. I can't send a serialized object; I have to store the data in the message as a String. The data I am sending are large comma-separated values, and I want to compress the data before I pack it into the message as a String.

I attempted to use Deflater/Inflater to achieve this, but somewhere along the line I am getting stuck.

I am using the two methods below to deflate/inflate. However, passing the result of the compressString() method to decompressString() returns a null result.

public String compressString(String data) {
  Deflater deflater = new Deflater();
  byte[] target = new byte[100];
  try {
   deflater.setInput(data.getBytes(UTF8_CHARSET));
   deflater.finish();
   int deflateLength = deflater.deflate(target);
   return new String(target);
  } catch (UnsupportedEncodingException e) {
   //TODO
  }

  return data;
 }

 public String decompressString(String data) {

  String result = null;
  try {
   byte[] input = data.getBytes();

   Inflater inflater = new Inflater();
   int inputLength = input.length;
   inflater.setInput(input, 0, inputLength);

   byte[] output = new byte[100];
   int resultLength = inflater.inflate(output);
   inflater.end();

   result = new String(output, 0, resultLength, UTF8_CHARSET);
  } catch (DataFormatException e) {
   // TODO Auto-generated catch block
   e.printStackTrace();
  } catch (UnsupportedEncodingException e) {
   // TODO Auto-generated catch block
   e.printStackTrace();
  }

  return result;
 }

6 answers
你好瞎i
Reply #2 · 2019-02-11 06:45

If you have a piece of code which seems to be silently failing, perhaps you shouldn't catch and swallow Exceptions:

catch (UnsupportedEncodingException e) {
    //TODO
}

But the real reason decompressString() returns null is that your exception handling doesn't specify what to do with result when an exception is caught, so result is left as null. Are you checking the output to see whether any exceptions are occurring?

If I run your decompressString() on a badly formatted String, Inflater throws this DataFormatException:

java.util.zip.DataFormatException: incorrect header check
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:223)
    at java.util.zip.Inflater.inflate(Inflater.java:240)
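
A minimal sketch of the point above, assuming the goal is simply to make the failure visible: instead of catching DataFormatException and returning null, the method declares the exception and lets the caller decide what to do. The class name StrictInflate is hypothetical, and the 100-byte buffer only mirrors the question's code; it is too small for real data.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical helper: the name and the 100-byte buffer are assumptions for illustration.
class StrictInflate {

    // Propagates DataFormatException instead of swallowing it and returning null.
    static String decompress(byte[] input) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(input);
            byte[] output = new byte[100]; // same small buffer as in the question
            int length = inflater.inflate(output);
            return new String(output, 0, length, StandardCharsets.UTF_8);
        } finally {
            inflater.end(); // release native resources even when inflate() throws
        }
    }

    public static void main(String[] args) {
        try {
            decompress("not compressed data".getBytes(StandardCharsets.UTF_8));
        } catch (DataFormatException e) {
            // The caller now sees the failure instead of a silent null.
            System.out.println("Decompression failed: " + e.getMessage());
        }
    }
}
```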
Summer. ? 凉城
Reply #3 · 2019-02-11 06:47

From what I can tell, your current approach is:

  1. Convert String to byte array using getBytes("UTF-8").
  2. Compress byte array
  3. Convert compressed byte array to String using new String(bytes, ..., "UTF-8").
  4. Transmit compressed string
  5. Receive compressed string
  6. Convert compressed string to byte array using getBytes("UTF-8").
  7. Decompress byte array
  8. Convert decompressed byte array to String using new String(bytes, ..., "UTF-8").

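The problem with this approach is in step 3. When you compress the byte array, you create a sequence of bytes which may no longer be valid UTF-8. In Java, new String(bytes, "UTF-8") does not throw on malformed input; it silently substitutes the replacement character U+FFFD for each invalid sequence. The compressed data is therefore corrupted in step 3, and the failure only shows up later as a DataFormatException in step 7.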
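
To see the step 3 problem in isolation, here is a small sketch that compresses a short string and then pushes the compressed bytes through a String and back, both as UTF-8. The class and method names are made up for the demonstration, and the 100-byte buffer assumes a very short input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;

// Hypothetical demo class; names are assumptions, not from the question.
class Utf8RoundTripDemo {

    // Compresses text with Deflater; assumes the result fits in 100 bytes.
    static byte[] deflate(String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        byte[] buffer = new byte[100];
        int length = deflater.deflate(buffer);
        deflater.end();
        return Arrays.copyOf(buffer, length);
    }

    // Steps 3 and 6 from the list: bytes -> String -> bytes, both as UTF-8.
    static byte[] throughString(byte[] bytes) {
        String asText = new String(bytes, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] compressed = deflate("a,b,c,d,e,f");
        byte[] roundTripped = throughString(compressed);
        System.out.println("Bytes survive the round trip: "
                + Arrays.equals(compressed, roundTripped));
    }
}
```

With the default compression level the zlib header is 0x78 0x9C, and a lone 0x9C is not valid UTF-8, so the second byte is already replaced during decoding and the arrays differ.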

The solution is to use a "bytes to characters" encoding scheme like Base64 to turn the compressed bytes into a transmissible string. In other words, replace step 3 with a call to a Base64 encode function, and step 6 with a call to a Base64 decode function.
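
A sketch of that replacement, assuming Java 8 or later so that java.util.Base64 is available (on older JVMs a library such as Apache Commons Codec would be needed instead). The class and method names are hypothetical. It uses DeflaterOutputStream and InflaterInputStream rather than raw Deflater/Inflater calls so that no fixed-size buffer limits the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper; names are assumptions for illustration.
class Base64Transport {

    // Steps 1-3: UTF-8 bytes -> deflate -> Base64 text that is safe to put in a String.
    static String compressToString(String data) throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (OutputStream deflate = new DeflaterOutputStream(compressed)) {
            deflate.write(data.getBytes(StandardCharsets.UTF_8));
        } // closing the stream finishes the compressed data
        return Base64.getEncoder().encodeToString(compressed.toByteArray());
    }

    // Steps 6-8: Base64 text -> inflate -> UTF-8 string.
    static String decompressFromString(String encoded) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        try (InputStream inflate = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = inflate.read(buffer)) != -1) {
                restored.write(buffer, 0, n);
            }
        }
        return new String(restored.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Usage would be along the lines of String wire = Base64Transport.compressToString(csv) on the sender and Base64Transport.decompressFromString(wire) on the receiver.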

Notes:

  1. For small strings, compressing and encoding is likely to actually increase the size of the transmitted string.
  2. If the compacted String is going to be incorporated into a URL, you may want to pick a different encoding to Base64 that avoids characters that need to be URL escaped.
  3. Depending on the nature of the data you are transmitting, you may find that a domain specific compression works better than a generic one. Consider compressing the data before creating the comma-separated string. Consider alternatives to comma-separated strings.
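
Notes 1 and 2 can be checked with a short experiment. This is only a sketch assuming Java 8+ (java.util.Base64); the class name and the sample strings are made up. It compares the original length with the encoded length for a tiny and a repetitive input, and shows the URL-safe encoder as one possible alternative alphabet.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.Deflater;

// Hypothetical demo class; names and sample data are assumptions.
class EncodedSizeDemo {

    static byte[] deflate(String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        while (!deflater.finished()) {
            out.write(chunk, 0, deflater.deflate(chunk));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Standard Base64 alphabet (may contain '+', '/' and '=').
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(deflate(text));
    }

    // URL-safe alphabet ('-' and '_' instead of '+' and '/'), without '=' padding.
    static String encodeForUrl(String text) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(deflate(text));
    }

    public static void main(String[] args) {
        String small = "a,b";
        StringBuilder large = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            large.append("1,2,3,4,5,");
        }
        System.out.println("small: " + small.length() + " chars -> " + encode(small).length());
        System.out.println("large: " + large.length() + " chars -> " + encode(large.toString()).length());
    }
}
```

The receiving side of the URL-safe variant would use Base64.getUrlDecoder(), which accepts input without padding.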
Animai°情兽
Reply #4 · 2019-02-11 06:52

I was facing a similar issue, which was resolved by Base64-decoding the input.
That is, instead of

data.getBytes(UTF8_CHARSET)  

I tried

Base64.decodeBase64(data)  

and it worked.

forever°为你锁心
Reply #5 · 2019-02-11 06:57

Inflater/Deflater is not the right solution for compressing a string. I think GZIPInputStream and GZIPOutputStream are the proper tools for compressing the string.
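
For reference, a minimal sketch of the GZIP stream approach. The class and method names are hypothetical. Note that the output is still binary, so it needs a text-safe encoding before it can be stored in a String message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; names are assumptions for illustration.
class GzipBytes {

    // Compresses the UTF-8 bytes of the text into GZIP format.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } // closing writes the GZIP trailer
        return bytes.toByteArray();
    }

    // Reads the GZIP data back in chunks, so the output size is not limited.
    static String gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```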

Explosion°爆炸
Reply #6 · 2019-02-11 07:00

The problem is that you convert the compressed bytes to a String, which corrupts the data. Your compressString and decompressString should work on byte[].

EDIT: Here is a revised version. It works.

EDIT2: About Base64: you're sending bytes, not strings, so you don't need Base64.

public static void main(String[] args) {
    String input = "Test input";
    byte[] data = new byte[100];

    int len = compressString(input, data, data.length);

    String output = decompressString(data, len);

    if (!input.equals(output)) {
        System.out.println("Test failed");
    }

    System.out.println(input + " " + output);
}

public static int compressString(String data, byte[] output, int len) {
    Deflater deflater = new Deflater();
    deflater.setInput(data.getBytes(Charset.forName("utf-8")));
    deflater.finish();
    return deflater.deflate(output, 0, len);
}

public static String decompressString(byte[] input, int len) {

    String result = null;
    try {
        Inflater inflater = new Inflater();
        inflater.setInput(input, 0, len);

        byte[] output = new byte[100]; // TODO: may overflow, find a better solution
        int resultLength = inflater.inflate(output);
        inflater.end();

        result = new String(output, 0, resultLength, Charset.forName("utf-8"));
    } catch (DataFormatException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

    return result;
}
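
One way to address the overflow TODO, sketched under the assumption that the whole payload fits in memory: call deflate()/inflate() in a loop and collect the chunks in a ByteArrayOutputStream. The class name and the chunk size are arbitrary choices for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper; the name and the 100-byte chunk size are assumptions.
class GrowingBuffers {

    static byte[] compress(String data) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data.getBytes(StandardCharsets.UTF_8));
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[100];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk); // may take several calls for large input
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    static String decompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[100];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && !inflater.finished()) {
                    // No progress: the input is truncated or needs a preset dictionary.
                    throw new DataFormatException("Incomplete compressed data");
                }
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            inflater.end();
        }
    }
}
```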
Rolldiameter
Reply #7 · 2019-02-11 07:07

To me, writing a compression algorithm myself is difficult, but converting binary to a string is not. So if I were you, I would serialize the object normally, zip it with compression (as provided by ZipFile), and then convert it to a string using something like Base64 encode/decode.

I actually have Base64 encode/decode functions. If you want, I can post them here.
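
A rough sketch of that idea, with two substitutions that are my assumptions rather than the answer's exact tools: GZIPOutputStream stands in for ZipFile (which reads zip archives from disk rather than compressing in memory), and java.util.Base64 from Java 8+ stands in for hand-written Base64 functions. The class name is hypothetical. Java deserialization should only be applied to data from a trusted sender.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; names and the choice of GZIP are assumptions.
class SerializedTransport {

    // Serialize -> GZIP -> Base64 text.
    static String encode(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(value);
        } // closing flushes the object stream and finishes the GZIP data
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Base64 text -> GZIP -> deserialize. Only use on trusted input.
    static Object decode(String text) throws IOException, ClassNotFoundException {
        byte[] bytes = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(bytes)))) {
            return in.readObject();
        }
    }
}
```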
