Zlib compression enlarging file

Posted 2019-09-21 02:49

Question:

I'm trying to use zlib in an iPhone app to compress a text file into a gzip file as a test. I am using the following code:

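// Build the output path by swapping the file extension for "gz".
const char *s = [[[Path stringByDeletingPathExtension] stringByAppendingPathExtension:@"gz"] UTF8String];
gzFile fi = gzopen(s, "wb");  // gzFile is already a pointer type; no extra '*' or cast
const char *c = readFile(Path.UTF8String);
if (fi != NULL && c != NULL) {
    gzwrite(fi, c, (unsigned)strlen(c));  // strlen() stops at the first NUL byte
}
if (fi != NULL) gzclose(fi);  // flushes the remaining data and writes the gzip trailer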

where readFile() returns a const char* that was read from the file using the fgets() function. The problem is that the resulting gzip file is larger than the original file instead of smaller. For example, I have a text file that is 90 bytes, and after using this method the gzip file is 98 bytes. Why isn't the gzip file smaller than the original?

Answer 1:

The gzip format adds a fixed-size header and trailer (at least 18 bytes in total). Because you are compressing so little data, that overhead is larger than the space you are saving.

90 bytes is generally not worth compressing.

http://www.gzip.org/zlib/rfc-gzip.html#header-trailer
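
To put numbers on the overhead: per RFC 1952, a gzip member with no optional fields has a 10-byte header and an 8-byte trailer (CRC-32 plus the uncompressed length). Below is a minimal sketch in C of that arithmetic, assuming gzopen() wrote no optional header fields such as a file name. The helper names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed gzip framing per RFC 1952, assuming no optional header fields
   (no file name, comment, or extra field). */
enum {
    GZIP_HEADER_BYTES  = 10, /* magic, method, flags, mtime, xfl, os */
    GZIP_TRAILER_BYTES = 8   /* CRC-32 + uncompressed size (ISIZE) */
};

/* Hypothetical helper: bytes of the .gz file that are framing, not data. */
size_t gzip_framing_bytes(void) {
    return GZIP_HEADER_BYTES + GZIP_TRAILER_BYTES;
}

/* Hypothetical helper: size of the DEFLATE stream inside a .gz file. */
size_t deflate_payload_bytes(size_t gz_file_size) {
    assert(gz_file_size >= gzip_framing_bytes());
    return gz_file_size - gzip_framing_bytes();
}
```

With the sizes from the question, deflate_payload_bytes(98) is 80. Under that assumption the 90 bytes of text did shrink to 80 bytes of DEFLATE data, and the 18 bytes of framing pushed the total past the original size.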



Answer 2:

Regardless of the compression algorithm used, there is always a chance that the output will be larger than the input; otherwise it would not be possible to encode every combination of input bit patterns.

As already stated, in your particular case the problem seems to be a very small file relative to the header overhead.

Nevertheless, it is worth keeping in mind that there is never a guarantee that the "compressed" file will be smaller.
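
The counting argument behind this can be checked directly. There are 2^n distinct bit strings of length n, but only 2^n - 1 strings that are strictly shorter (lengths 0 through n-1), so no lossless scheme can map every n-bit input to a shorter output. A small sketch in C follows; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct bit strings of exactly n bits: 2^n. */
uint64_t strings_of_length(unsigned n) {
    assert(n < 64); /* a shift by 64 or more would be undefined */
    return (uint64_t)1 << n;
}

/* Number of distinct bit strings strictly shorter than n bits:
   2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1. */
uint64_t strings_shorter_than(unsigned n) {
    uint64_t total = 0;
    for (unsigned len = 0; len < n; ++len) {
        total += strings_of_length(len);
    }
    return total;
}
```

For any n checked here, strings_shorter_than(n) is exactly one less than strings_of_length(n), so at least one input of each length cannot get shorter.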



Answer 3:

  1. The data you are trying to compress is too small and there is not a lot of redundancy, so there is nothing left to compress. Compression algorithms work, to put it very simply, by eliminating repeating sequences in data. In 90 bytes, you probably don't have much redundancy, unless it's text like "aaaaaaa....".
  2. Fixed header overhead, as already mentioned.

Try a bigger data file.
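
To illustrate the point about redundancy, here is a toy run-length encoder in C. It is not DEFLATE (zlib's actual algorithm is considerably more sophisticated); it is a stand-in that only shows how output size depends on repetition in the input. The function name rle_encoded_size is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy run-length encoding: each run of identical bytes becomes a
   (count, byte) pair, with the count capped at 255. Returns the number
   of bytes the encoded form would occupy. This is NOT what zlib does. */
size_t rle_encoded_size(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    size_t out = 0;
    size_t i = 0;
    while (i < len) {
        size_t run = 1;
        while (i + run < len && data[i + run] == data[i] && run < 255) {
            ++run;
        }
        out += 2; /* one count byte + one value byte per run */
        i += run;
    }
    return out;
}
```

For 90 bytes of a single repeated character this returns 2, while for 90 bytes in which no two neighbouring bytes match it returns 180, twice the input size.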



Tags: c++ ios c zlib