When I try to open a .xlsx file in POI, I get an exception:
java.lang.IllegalArgumentException: The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry. Is it really an excel file?
at org.apache.poi.hssf.usermodel.HSSFWorkbook.getWorkbookDirEntryName(HSSFWorkbook.java:223)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:245)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:188)
I notice that the code is considering it a .xls file even though the name is .xlsx, and I am using WorkbookFactory.create(fileInputStream);
to open the file.
I tried renaming the file to .zip and opening in WinZip and I get an error - invalid zip file.
The file does open in Excel, and if I save it (without making a single change), then it opens correctly in POI as well.
From what you say, which is that you have a .xlsx
file but it's turning up in HSSF with no Workbook entry, I'm going to deduce that you have an Encrypted .xlsx file
There are basically 4 ways of handling a .xls or .xlsx file:
- Unencrypted .xls file - OLE2 storage, works with HSSF
- Encrypted .xlsx file - OLE2 storage, some records encrypted, works with HSSF if you supply the password
- Unencrypted .xlsx file - OOXML (zip of xml) storage, works with XSSF
- Encrypted .xlsx file - Stored in OLE2, which holds encrypted part, which wraps OOXML, which ..... which finally works with XSSF (it's nasty)
So, what I think is happening is that you're checking the type of the file, seeing it's OLE2, then passing that to HSSF. HSSF looks for the bits it works with, can't see them, and gives up
What you need to do is follow the instructions on the POI Encrypted Documents page. Basically, your code needs to be something like:
EncryptionInfo info = new EncryptionInfo(filesystem);
Decryptor d = Decryptor.getInstance(info);
try {
if (!d.verifyPassword(password)) {
throw new RuntimeException("Unable to process: document is encrypted");
}
InputStream dataStream = d.getDataStream(filesystem);
XSSFWorkbook wb = new XSSFWorkbook(dataStream);
// Process XLSX file here
} catch (GeneralSecurityException ex) {
throw new RuntimeException("Unable to process encrypted document", ex);
}
If you use the latest version of POI, it'll now throw an EncryptedDocumentException
if you give an encrypted .xlsx file to HSSF, so you can work out more easily where you went wrong