I would like to write a method that read several XML files inside a ZIP, from a single InputStream.
The method would open a ZipInputStream, and on each xml file, get the corresponding InputStream, and give it to my XML parser. Here is the skeleton of the method :
private void readZip(InputStream is) throws IOException {
ZipInputStream zis = new ZipInputStream(is);
ZipEntry entry = zis.getNextEntry();
while (entry != null) {
if (entry.getName().endsWith(".xml")) {
// READ THE STREAM
}
entry = zis.getNextEntry();
}
}
The problematic part is the "// READ THE STREAM". I have a working solution, which consist to create a ByteArrayInputStream, and feed my parser with it. But it uses a buffer, and for large files I get an OutOfMemoryError. Here is the code, if someone is still interested :
int count;
byte buffer[] = new byte[2048];
ByteArrayOutputStream out = new ByteArrayOutputStream();
while ((count = zis.read(buffer)) != -1) { out.write(buffer, 0, count); }
InputStream is = new ByteArrayInputStream(out.toByteArray());
The ideal solution would be to feed the parser with the original ZipInputStream. It should works, because it works if I just print the entry content with a Scanner :
Scanner sc = new Scanner(zis);
while (sc.hasNextLine())
{
System.out.println(sc.nextLine());
}
But... The parser I'm currently using (jdom2, but I also tried with javax.xml.parsers.DocumentBuilderFactory) closes the stream after parsing the data :/ . So I'm unable to get the next entry and continue.
So finally the question is :
- Does anybody know a DOM parser that doesn't close its stream ?
- Is there another way to have an InputStream from a ZipEntry ?
Thanks.
You could wrap the ZipInputStream and intercept the call to
close()
.A small improvement on Tim's solution: The problem with having to call allowToBeClosed() before close() is that it makes closing the ZipInputStream properly when handling exceptions tricky and will break Java 7's try-with-resources statement.
I suggest creating a wrapper class as follows:
which can then safely be used as follows:
Thanks to halfbit, I ended up with my own ZipInputStream class, which overrides the close method :