Is there any way to check whether an InputStream has been gzipped? Here's the code:
public static InputStream decompressStream(InputStream input) {
    try {
        GZIPInputStream gs = new GZIPInputStream(input);
        return gs;
    } catch (IOException e) {
        logger.info("Input stream not in the GZIP format, using standard format");
        return input;
    }
}
I tried this way but it doesn't work as expected - values read from the stream are invalid. EDIT: Added the method I use to compress data:
public static byte[] compress(byte[] content) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try {
        GZIPOutputStream gs = new GZIPOutputStream(baos);
        gs.write(content);
        gs.close();
    } catch (IOException e) {
        logger.error("Fatal error occurred while compressing data");
        throw new RuntimeException(e);
    }
    double ratio = (1.0f * content.length / baos.size());
    if (ratio > 1) {
        logger.info("Compression ratio equals " + ratio);
        return baos.toByteArray();
    }
    logger.info("Compression not needed");
    return content;
}
Building on the answer by @biziclop - this version uses the GZIP_MAGIC header and additionally is safe for empty or single byte data streams.
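The code for this version did not survive in the thread, so here is a minimal sketch of what it could look like, not the original poster's code. It assumes a PushbackInputStream is acceptable for peeking at the first two bytes; the class name GzipSniffer is made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class; the name is not from the original answer.
public class GzipSniffer {

    /**
     * Returns a GZIPInputStream if the data starts with the gzip signature,
     * otherwise a stream that yields the original bytes unchanged.
     */
    public static InputStream decompressStream(InputStream input) throws IOException {
        // Pushback buffer of 2 bytes so the peeked signature can be put back.
        PushbackInputStream pb = new PushbackInputStream(input, 2);
        byte[] signature = new byte[2];

        // A single read() may return fewer bytes than requested, so loop until
        // two bytes are read or the stream ends (empty or single-byte input).
        int len = 0;
        while (len < 2) {
            int n = pb.read(signature, len, 2 - len);
            if (n < 0) {
                break;
            }
            len += n;
        }
        if (len > 0) {
            pb.unread(signature, 0, len); // restore whatever was consumed
        }

        // GZIP_MAGIC is 0x8b1f: the first byte is the low byte, the second the high byte.
        int header = (signature[0] & 0xff) | ((signature[1] & 0xff) << 8);
        if (len == 2 && header == GZIPInputStream.GZIP_MAGIC) {
            return new GZIPInputStream(pb);
        }
        return pb;
    }
}
```

Because the peeked bytes are pushed back before either branch returns, the caller always sees the stream from its first byte, which is what the code in the question fails to guarantee.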
It's not foolproof but it's probably the easiest and doesn't rely on any external data. Like all decent formats, GZip too begins with a magic number which can be quickly checked without reading the entire stream.
(Source for the magic number: GZip file format specification)
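As a rough illustration of the magic-number check, here is a sketch that tests an in-memory byte array for the two signature bytes 0x1f 0x8b given in the GZip specification. The class and method names are hypothetical, and as noted above the check is not foolproof: uncompressed data can happen to start with the same two bytes.

```java
// Hypothetical helper; only the two signature bytes come from the gzip specification.
public class GzipMagic {

    /** Returns true if the data starts with the two-byte gzip signature 0x1f 0x8b. */
    public static boolean isGzipped(byte[] data) {
        // Java bytes are signed, so mask with 0xff before comparing.
        return data != null
                && data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }
}
```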
Update: I've just discovered that there is also a constant called GZIP_MAGIC in GZIPInputStream which contains this value, so if you really want to, you can use the lower two bytes of it.
I found a useful example that provides a clean implementation of isCompressed() and tested it with success.
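Wrap the original stream in a BufferedInputStream and call "mark" on it, then wrap that in a GZIPInputStream. If the constructor succeeds, the stream starts with a valid gzip header. (For a zip archive you would use a ZipInputStream instead and try to extract a ZipEntry; if that works, it's a zip file.) Then you can use "reset" on the BufferedInputStream to return to the initial position in the stream after your check.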
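A minimal sketch of the mark/reset idea for the gzip case follows. The class name and the READ_LIMIT value are assumptions for illustration. It also shows why the code in the question misbehaves: the GZIPInputStream constructor consumes header bytes before it fails, so the stream has to be rewound before falling back to it.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class illustrating mark/reset around the gzip header check.
public class MarkResetGzip {

    // Assumed upper bound on the bytes the header check may consume before failing.
    private static final int READ_LIMIT = 512;

    public static InputStream decompressStream(InputStream input) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(input);
        buffered.mark(READ_LIMIT);
        try {
            // The constructor reads and validates the gzip header.
            return new GZIPInputStream(buffered);
        } catch (IOException e) {
            // Not gzip (or too short): rewind so the caller sees the data from byte 0.
            // Note that this also catches genuine I/O failures, in which case
            // reset() or a later read is expected to fail again.
            buffered.reset();
            return buffered;
        }
    }
}
```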
In that case you need to check whether the HTTP Content-Encoding response header equals gzip. This is all clearly specified in the HTTP spec.
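A sketch of the header-based approach, assuming the response body and the Content-Encoding value have already been obtained from whatever HTTP client is in use. The HttpGzip class and decodeBody method are hypothetical names, and only a single "gzip" value is handled, not a comma-separated list of encodings.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper; decides on decompression from the header, not from the bytes.
public class HttpGzip {

    /**
     * Wraps the body in a GZIPInputStream when the Content-Encoding header
     * value is "gzip" (compared case-insensitively), otherwise returns it as-is.
     */
    public static InputStream decodeBody(String contentEncoding, InputStream body)
            throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(body);
        }
        return body;
    }
}
```

With java.net.URLConnection, for example, the call could look like decodeBody(connection.getContentEncoding(), connection.getInputStream()).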
Update: regarding the way you compressed the source of the stream: this ratio check is pretty... insane. Get rid of it. The same length does not necessarily mean that the bytes are the same, and with the check in place the reader cannot know whether it is getting gzipped or raw bytes. Let it always return the gzipped stream so that you can always expect a gzipped stream and just apply GZIPInputStream without nasty checks.
SimpleMagic is a Java library for resolving content types.