Best compression technique for binary data? [closed]

Posted 2019-03-19 06:15

Question:

I have a large binary file that represents the alpha channel for each pixel in an image - 0 for transparent, 1 for anything else. This binary data needs to be dynamically loaded from a text file, and it would be useful to get the maximum possible compression in it. De-compression times aren't majorly important (unless we're talking a jump of say a minute to an hour), but the files need to be as small as possible.

So far we have tried two approaches: run-length encoding, followed by Huffman coding, followed by converting the binary data to base64; and run-length encoding that distinguishes zeros from ones by using numeric values for ones and their alphabetical equivalents for zeros (this seems to give the best results). However, we're wondering whether there's a better solution than either of these, since we've been approaching the problem from a logical standpoint rather than looking at all possible methods.

Answer 1:

As external libraries were out of the question, I created a custom solution for this. The system used run-length encoding to compress the data, then the RLE-encoded data was represented in base32 (32 characters for the zeros, and a matching set of 32 for the ones). This allowed us to represent files approximately 5MB in size with only around 30KB, without any loss.
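The answer does not spell out the exact encoding, so the following is only a minimal sketch of one way such a scheme could work. It assumes the mask is held as a string of `0` and `1` characters, and the two 32-character alphabets (`ZERO_DIGITS`, `ONE_DIGITS`) and the function names are hypothetical choices made for illustration. Each run length is written in base 32 using the alphabet that matches the run's value; because adjacent runs always alternate between 0 and 1, a switch of alphabet marks the boundary between runs.

```python
from itertools import groupby

# Hypothetical alphabets: 32 symbols for runs of zeros, 32 different ones for runs of ones.
ZERO_DIGITS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
ONE_DIGITS = "abcdefghijklmnopqrstuvwxyz0189-_"


def _to_base32(n, digits):
    """Write a positive integer in base 32 using the given 32-symbol alphabet."""
    out = []
    while True:
        n, r = divmod(n, 32)
        out.append(digits[r])
        if n == 0:
            break
    return "".join(reversed(out))


def encode(bits):
    """Run-length encode a string of '0'/'1' characters into text."""
    parts = []
    for value, run in groupby(bits):
        length = sum(1 for _ in run)
        digits = ONE_DIGITS if value == "1" else ZERO_DIGITS
        parts.append(_to_base32(length, digits))
    return "".join(parts)


def decode(text):
    """Invert encode(): a change of alphabet marks the start of a new run."""
    out = []
    for is_one, chunk in groupby(text, key=lambda c: c in ONE_DIGITS):
        digits = ONE_DIGITS if is_one else ZERO_DIGITS
        length = 0
        for c in chunk:
            length = length * 32 + digits.index(c)
        out.append(("1" if is_one else "0") * length)
    return "".join(out)
```

For example, `encode("0" * 40 + "111" + "0")` gives `"BIdB"`, and `decode` restores the original string. Long runs, which are typical of alpha masks, cost only a few characters each.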



Answer 2:

I agree, you would be best off using an existing, proven image format. If you must do it yourself, you will probably still end up with something very close to some existing technology.

I would store how many times each following value is repeated, as count/value pairs: |10|1|1|0|3|1|5|0

This would produce

1111111111011100000

But if you look at this and optimize it at the byte level, you will soon see that it is almost exactly what RLE compression does. So, to make a long answer short: take a look at RLE ;)
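The count/value notation above can be sketched in a few lines. This is only an illustration, assuming the data is a string of `0` and `1` characters; the function names are hypothetical.

```python
from itertools import groupby


def rle_pairs(bits):
    """Turn a string of '0'/'1' characters into (count, value) pairs."""
    return [(len(list(run)), value) for value, run in groupby(bits)]


def to_pipe_notation(pairs):
    """Render the pairs in the |count|value notation used above."""
    return "".join(f"|{count}|{value}" for count, value in pairs)


def expand(pairs):
    """Rebuild the original bit string from (count, value) pairs."""
    return "".join(value * count for count, value in pairs)


pairs = rle_pairs("1111111111011100000")
print(to_pipe_notation(pairs))  # |10|1|1|0|3|1|5|0
print(expand(pairs))            # 1111111111011100000
```

Since the values strictly alternate, a real encoder only needs to store the first value and the list of counts.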

Good luck!



Answer 3:

Check out 7-Zip. It has very good compression ratios, often a tenth the size of zip, and has language bindings for many programming languages.

http://www.7-zip.org/sdk.html
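As a rough way to try this idea without the 7-Zip SDK itself, the sketch below uses Python's standard `lzma` module, which implements the LZMA family of algorithms that 7-Zip uses. It assumes the mask is a string of `0` and `1` characters; the function names, the 4-byte length header, and the base64 step (added because the question says the data is loaded from a text file) are all choices made for this example, not something the answer specifies.

```python
import base64
import lzma


def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, 8 bits per byte."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def compress_mask(bits):
    """Pack the bits, compress them with LZMA, and return base64 text."""
    # Store the bit count first so the padding can be dropped when decoding.
    payload = len(bits).to_bytes(4, "big") + pack_bits(bits)
    return base64.b64encode(lzma.compress(payload, preset=9)).decode("ascii")


def decompress_mask(text):
    """Invert compress_mask() and return the original bit string."""
    payload = lzma.decompress(base64.b64decode(text))
    bit_count = int.from_bytes(payload[:4], "big")
    bits = "".join(f"{byte:08b}" for byte in payload[4:])
    return bits[:bit_count]
```

Packing eight pixels per byte before compressing already shrinks the input eightfold. The container format adds a fixed overhead, so for very small masks a plain run-length scheme may still come out smaller.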



Answer 4:

There are some comparative tests of lossless archivers for photo images. You may look at one of them at: http://qlic.altervista.org/LPCB.html

You see that there are dozens of such archivers. For everyday use I'd recommend 7-zip.



Tags: compression