POI cannot open a workbook that opens in Excel

2020-04-02 06:12发布

问题:

When I try to open a .xlsx file in POI, I get an exception:

java.lang.IllegalArgumentException: The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry. Is it really an excel file?
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.getWorkbookDirEntryName(HSSFWorkbook.java:223)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:245)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:188)

I notice that the code is considering it a .xls file even though the name is .xlsx, and I am using WorkbookFactory.create(fileInputStream); to open the file.

I tried renaming the file to .zip and opening in WinZip and I get an error - invalid zip file.

The file does open in Excel, and if I save it (without making a single change), then it opens correctly in POI as well.

回答1:

From what you say, which is that you have a .xlsx file but it's turning up in HSSF with no Workbook entry, I'm going to deduce that you have an Encrypted .xlsx file

There are basically 4 ways of handling a .xls or .xlsx file:

  • Unencrypted .xls file - OLE2 storage, works with HSSF
  • Encrypted .xlsx file - OLE2 storage, some records encrypted, works with HSSF if you supply the password
  • Unencrypted .xlsx file - OOXML (zip of xml) storage, works with XSSF
  • Encrypted .xlsx file - Stored in OLE2, which holds encrypted part, which wraps OOXML, which ..... which finally works with XSSF (it's nasty)

So, what I think is happening is that you're checking the type of the file, seeing it's OLE2, then passing that to HSSF. HSSF looks for the bits it works with, can't see them, and gives up

What you need to do is follow the instructions on the POI Encrypted Documents page. Basically, your code needs to be something like:

EncryptionInfo info = new EncryptionInfo(filesystem);
Decryptor d = Decryptor.getInstance(info);

try {
    if (!d.verifyPassword(password)) {
        throw new RuntimeException("Unable to process: document is encrypted");
    }

    InputStream dataStream = d.getDataStream(filesystem);
    XSSFWorkbook wb = new XSSFWorkbook(dataStream);

    // Process XLSX file here
} catch (GeneralSecurityException ex) {
    throw new RuntimeException("Unable to process encrypted document", ex);
}

If you use the latest version of POI, it'll now throw an EncryptedDocumentException if you give an encrypted .xlsx file to HSSF, so you can work out more easily where you went wrong