Exception when reading XLSM workbook via Apache POI

Posted 2019-08-04 08:37

Question:

I'm trying to read an .xlsm file using the Apache POI library (3.8-beta5):

Workbook wb = null;
try {
    wb = WorkbookFactory.create(isXLSFile);
} catch (IOException e) {
...

Nothing complicated. Most documents are read without problems, but one document throws an exception:

Caused by: java.lang.IllegalStateException: A sheet hyperlink must either have a location, or a relationship. Found:
<xml-fragment ref="H13" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"/>
    at org.apache.poi.xssf.usermodel.XSSFHyperlink.<init>(XSSFHyperlink.java:72)
    at org.apache.poi.xssf.usermodel.XSSFSheet.initHyperlinks(XSSFSheet.java:250)
    at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:203)
    at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:175)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:260)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:174)
    at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:67)

Interestingly, if I open the file in LibreOffice (I have no MS Office on my machine) and re-save it in the same format, the document is read fine. As I understand it, the problem lies in the validity of the document structure, so it is a data problem rather than an error in the library code (or is it?). Is there a way to suppress such errors?
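One possible way to get past such a file without re-saving it by hand is to pre-process it before POI sees it. Going by the exception message, the offending element is a sheet hyperlink with neither a location attribute nor a relationship id (r:id), so a rough sketch is to rewrite the .xlsm (a zip archive) and drop those elements from the worksheet XML. Everything here is an assumption for illustration: the class HyperlinkSanitizer and its methods are hypothetical and not part of Apache POI, worksheet parts are assumed to live under xl/worksheets/ (the usual layout, though the package relationships actually decide this), and readAllBytes() needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Apache POI. Requires Java 9+ (readAllBytes).
final class HyperlinkSanitizer {
    static final String MAIN_NS =
            "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    static final String REL_NS =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Removes <hyperlink> elements that have neither a location nor an r:id.
    static byte[] stripDanglingHyperlinks(byte[] sheetXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(sheetXml));

        // The NodeList is live, so walk it backwards while removing nodes.
        NodeList links = doc.getElementsByTagNameNS(MAIN_NS, "hyperlink");
        for (int i = links.getLength() - 1; i >= 0; i--) {
            Element link = (Element) links.item(i);
            if (!link.hasAttribute("location") && !link.hasAttributeNS(REL_NS, "id")) {
                link.getParentNode().removeChild(link);
            }
        }

        // Drop any <hyperlinks> container that was left without children.
        NodeList containers = doc.getElementsByTagNameNS(MAIN_NS, "hyperlinks");
        for (int i = containers.getLength() - 1; i >= 0; i--) {
            Element container = (Element) containers.item(i);
            if (container.getElementsByTagNameNS(MAIN_NS, "hyperlink").getLength() == 0) {
                container.getParentNode().removeChild(container);
            }
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Copies the archive entry by entry, sanitizing worksheet parts on the way.
    // Both streams are closed when the copy finishes.
    static void sanitize(InputStream in, OutputStream out) throws Exception {
        try (ZipInputStream zin = new ZipInputStream(in);
             ZipOutputStream zout = new ZipOutputStream(out)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] data = zin.readAllBytes(); // reads up to the end of this entry
                String name = entry.getName();
                // Assumption: worksheet parts use the conventional location.
                if (name.startsWith("xl/worksheets/") && name.endsWith(".xml")) {
                    data = stripDanglingHyperlinks(data);
                }
                zout.putNextEntry(new ZipEntry(name));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }
}
```

The sanitized bytes (for example collected in a ByteArrayOutputStream) could then be wrapped in a ByteArrayInputStream and passed to WorkbookFactory.create as before. This only hides the symptom; upgrading to a POI release that contains the fix is the cleaner route.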

[UPDATE] The fix mentioned in the comments on the accepted answer was integrated into Apache POI 3.8, released on March 26th, 2012.

Answer 1:

I just got a file like this from a user in a Google Refine bug report. My conclusion is that it's an Apache POI bug and they're being overly strict by complaining about a reference which isn't used. Additionally, throwing a RuntimeException instead of a declared (checked) exception is kind of rude. I've filed this bug report: https://issues.apache.org/bugzilla/show_bug.cgi?id=52716
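Because the IllegalStateException is unchecked, a handler that only catches IOException (as in the question's snippet) lets it escape. Until a fixed POI version is in use, one option is to funnel unchecked failures into the checked exception the caller already handles. A minimal sketch, assuming a hypothetical helper class SafeLoad that is not part of POI and that uses Java 8 lambdas:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Apache POI.
final class SafeLoad {
    // Runs the loader and reports every failure as an IOException,
    // so a single catch block covers checked and unchecked errors alike.
    static <T> T load(Callable<T> loader) throws IOException {
        try {
            return loader.call();
        } catch (IOException e) {
            throw e; // already the type the caller expects
        } catch (RuntimeException e) {
            // e.g. the IllegalStateException raised for the dangling hyperlink
            throw new IOException("Workbook could not be parsed: " + e.getMessage(), e);
        } catch (Exception e) {
            // any other checked exception declared by the loader
            throw new IOException(e);
        }
    }
}
```

With this in place, the call from the question would become something like SafeLoad.load(() -> WorkbookFactory.create(isXLSFile)), and the original cause stays available through getCause().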