Checking if a stream is a zip file

Posted 2019-03-30 04:52

We need to determine whether an incoming InputStream carries zip data (for example, a zip file). We have no reference to the underlying source of the stream. The goal is to copy the contents of this stream into an OutputStream directed at an alternate location.

I tried reading the stream with ZipInputStream and extracting a ZipEntry. As expected, the ZipEntry is null if the stream is a regular file. However, in checking for a ZipEntry I lose the first few bytes of the stream, so by the time I know that it is a regular stream, the initial data is already gone.

Any thoughts around how to check if the InputStream is an archive without data loss would be helpful.

Thanks.

Tags: java stream zip
5 answers
干净又极端
#2 · 2019-03-30 05:17

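You can check the first bytes of the stream for the ZIP local file header signature (PK 0x03 0x04, i.e. the bytes 0x50 0x4B 0x03 0x04); that is enough for most cases. If you need more precision, take the last ~100 bytes and check for the end of central directory record (signature PK 0x05 0x06), though with a plain InputStream that means reading the whole stream first.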
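
A minimal sketch of the signature check in Java, peeking at the first four bytes with mark/reset so nothing is lost. The class name ZipSignature and its two methods are hypothetical names chosen for illustration. Note that it only tests the local file header signature, so an empty archive (which starts with PK 0x05 0x06) would not match.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of any library.
final class ZipSignature {
    // "PK\003\004": signature of a ZIP local file header.
    private static final byte[] LOCAL_HEADER = {0x50, 0x4B, 0x03, 0x04};

    /** Returns a stream that supports mark/reset, wrapping only if necessary. */
    static InputStream markable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /**
     * Peeks at the first four bytes and rewinds, so the caller can still
     * read the stream from the beginning. The stream must support mark/reset.
     */
    static boolean startsWithLocalHeader(InputStream in) throws IOException {
        in.mark(LOCAL_HEADER.length);
        try {
            byte[] head = new byte[LOCAL_HEADER.length];
            int total = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return total == head.length && Arrays.equals(head, LOCAL_HEADER);
        } finally {
            // Rewind to the marked position whether or not the signature matched.
            in.reset();
        }
    }
}
```

Call markable() once, keep using the stream it returns, and after startsWithLocalHeader() either hand that stream to a ZipInputStream or copy it unchanged to the OutputStream.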

Animai°情兽
#3 · 2019-03-30 05:21

This is how I did it.

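I use mark/reset to restore the stream if the GZIPInputStream constructor rejects the data (it throws a ZipException when the header is not in GZIP format). Note that this detects GZIP data rather than ZIP archives; for ZIP the same mark/reset pattern applies with a different check.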

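import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.ZipException;

/**
 * Wraps the input stream with GZIPInputStream if it holds GZIP data.
 *
 * @param inputStream the stream to inspect
 * @return a decompressing stream for GZIP data, otherwise the original data rewound to its start
 * @throws IOException if reading the header fails
 */
private InputStream wrapIfZip(InputStream inputStream) throws IOException {
    // mark/reset is needed to rewind after the header check.
    if (!inputStream.markSupported()) {
        inputStream = new BufferedInputStream(inputStream);
    }
    inputStream.mark(1000);
    try {
        // The constructor reads the GZIP header and throws if it is invalid.
        return new GZIPInputStream(inputStream);
    } catch (ZipException | EOFException e) {
        // Not GZIP (or too short to hold a header): rewind and return the data as-is.
        inputStream.reset();
        return inputStream;
    }
}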
可以哭但决不认输i
#4 · 2019-03-30 05:25

You have described a java.io.PushbackInputStream: in addition to read(), it has an unread(byte[]) method that lets you push bytes back to the front of the stream so that they can be read again. Construct it with a pushback buffer large enough for the bytes you inspect, since the default size is a single byte.

It has been in java.io since JDK 1.0 (though I admit I hadn't seen a use for it until today).
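
A minimal sketch of the pushback approach, assuming a four-byte signature check is sufficient. PushbackSniffer, wrap() and looksLikeZip() are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, not part of any library.
final class PushbackSniffer {
    private static final int SIGNATURE_LENGTH = 4;

    /** Wraps the source with a pushback buffer large enough for the signature. */
    static PushbackInputStream wrap(InputStream source) {
        // The one-argument constructor would only allow one byte to be unread.
        return new PushbackInputStream(source, SIGNATURE_LENGTH);
    }

    /** Reads up to four bytes, pushes them back, and reports whether they are "PK\003\004". */
    static boolean looksLikeZip(PushbackInputStream in) throws IOException {
        byte[] head = new byte[SIGNATURE_LENGTH];
        int total = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (total < head.length) {
            int n = in.read(head, total, head.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        if (total > 0) {
            // Push back exactly what was consumed; the next read() returns head[0].
            in.unread(head, 0, total);
        }
        return total == SIGNATURE_LENGTH
                && head[0] == 'P' && head[1] == 'K' && head[2] == 3 && head[3] == 4;
    }
}
```

After the check, the PushbackInputStream returned by wrap() yields the complete original data, so it can be passed to a ZipInputStream or copied straight to the OutputStream.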

对你真心纯属浪费
#5 · 2019-03-30 05:28

It sounds a bit like a hack, but you could implement a proxy java.io.InputStream to sit between ZipInputStream and the stream you originally passed to ZipInputStream's constructor. Your proxy would stream to a buffer until you know whether it's a ZIP file or not. If not, then the buffer saves your day.

Rolldiameter
#6 · 2019-03-30 05:40

Assuming your original input stream is not buffered, I would wrap it in a BufferedInputStream before wrapping that in a ZipInputStream for the check. You can then use mark() and reset() on the BufferedInputStream to return to the initial position in the stream after the check.
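
This idea could be sketched roughly as follows. ZipProbe and hasZipEntry() are hypothetical names, and READ_LIMIT is an assumption: it is derived from the local file header layout (30 fixed bytes plus a name and an extra field, each with a 16-bit length) on the expectation that ZipInputStream consumes no more than that to return the first entry.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
final class ZipProbe {
    // Assumed bound on the bytes consumed while reading the first local file header.
    private static final int READ_LIMIT = 30 + 2 * 0xFFFF;

    /**
     * Asks ZipInputStream for the first entry, then rewinds the buffered
     * stream so that the caller sees the data from the beginning again.
     */
    static boolean hasZipEntry(BufferedInputStream in) throws IOException {
        in.mark(READ_LIMIT);
        try {
            // Deliberately not closed: closing the probe would also close 'in'.
            ZipInputStream probe = new ZipInputStream(in);
            return probe.getNextEntry() != null;
        } finally {
            in.reset();
        }
    }
}
```

The probe stream is discarded after the check; if the data turns out to be a zip, create a fresh ZipInputStream over the rewound BufferedInputStream.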
