Error reading Excel .XLSX with Apache POI

2020-07-10 06:44发布

问题:

I am using Apache POI 3.8 libraries to read an XLSX file in a web application. The following code works perfectly fine from a Java console app:

InputStream inputFS = new FileInputStream("test.xlsx");
Workbook workbook = new XSSFWorkbook(inputFS); // below exception is thrown on this line
Sheet sheet = workbook.getSheetAt(0);

but throws a "read error" when used in the web application. A relevant extract of the stack trace is pasted below:

java.io.IOException: Read error
at java.io.FileInputStream.readBytes(Native Method) ~[na:1.6.0_31]
at java.io.FileInputStream.read(Unknown Source) ~[na:1.6.0_31]
at java.io.FilterInputStream.read(Unknown Source) ~[na:1.6.0_31]
at java.io.PushbackInputStream.read(Unknown Source) ~[na:1.6.0_31]
at java.util.zip.ZipInputStream.readFully(Unknown Source) ~[na:1.6.0_31]
at java.util.zip.ZipInputStream.readLOC(Unknown Source) ~[na:1.6.0_31]
at java.util.zip.ZipInputStream.getNextEntry(Unknown Source) ~[na:1.6.0_31]
at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:51) ~[poi-ooxml-3.8-20120326.jar:3.8]
at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:83) ~[poi-ooxml-3.8-20120326.jar:3.8]
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:228) ~[poi-ooxml-3.8-20120326.jar:3.8]
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39) ~[poi-ooxml-3.8-20120326.jar:3.8]
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:187) ~[poi-ooxml-3.8-20120326.jar:3.8]
at com.corp.ReportManager.parseExcelReport(ReportManager.java:575) [ReportManager.class:na]

The following JARs are included in the classpath (in the same order):

poi-3.8-20120326.jar
poi-ooxml-3.8-20120326.jar
poi-ooxml-schemas-3.8-20120326.jar
xbean.jar
dom4j-1.6.1.jar

There does not seem to be an memory related issues since I gathered some heap utilization stats just before invoking the above code. The XLSX file is 1.15 MB in size.

##### Heap utilization statistics [MB] #####
Used Memory:13 MB
Free Memory:9 MB
Total Memory:23 MB
Max Memory:247 MB

回答1:

The method using the above code has a single parameter - FileInputStream. The first line in the code snippet is very much part of the code, but part of the invoking method. Since the method in question did not have knowledge of the Excel format or even a file extension to make an educated guess, I decided that I would first try to read the FileInputStream using HSSF API as below:

Sheet sheet = null;
try {

    POIFSFileSystem poifs = new POIFSFileSystem(inputFS);
    Workbook workbook = new HSSFWorkbook(poifs);
    sheet = workbook.getSheetAt(0);
}
catch (Exception e) {
}

if (sheet == null) {

    try {

        Workbook workbook = new XSSFWorkbook(inputFS);
        sheet = workbook.getSheetAt(0);
    }
    catch (Exception e) {
    }
}

The problem with the above code is that the state of the inputFS object during the second attempt of opening it via the XSSF API is unknown. And this yielded a read error. I replaced the above with the following code, which works fine and the issue appears to be resolved:

Sheet sheet = null;
try {

    Workbook workbook = WorkbookFactory.create(inputFS);
    sheet = workbook.getSheetAt(0);
}
catch (Exception e) {
}

I tested this with both XLS (older, binary) and XLSX (newer, XML-based) formats and it works. Thanks for everyone's help and input!



回答2:

The exception indicates that there's something up with your InputStream. However, if you have a file, then pass that in to POI directly!. Using an InputStream requires buffering of everything into memory, which eats up space. Since you don't need to do that buffering, don't! Avoiding that buffering ought to fix your problem anyway

If you're running with the latest nightly builds of POI, then it's very easy. Your code becomes:

File file = new File("test.xlsx");
OPCPackage opcPackage = OPCPackage.open(file);
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);

Otherwise, it's very similar:

File file = new File("test.xlsx");
OPCPackage opcPackage = OPCPackage.open(file.getAbsolutePath());
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);

If you're not sure if your file is a HSSFWorkbook or XSSFWorkbook, then you can use the WorkbookFactory to open the appropriate one for you:

File file = new File("test.xlsx");
Workbook workbook = WorkbookFactory.create(file);


回答3:

it looks like you need to use their XSSF API



回答4:

I have same error, I have just updated the pom dependencies with same version. It worked.
        <!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>4.1.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>4.1.0</version>
        </dependency>


回答5:

use this jar

<dependency>
    <groupId>org.apache.xmlbeans</groupId>
    <artifactId>xmlbeans</artifactId>
    <version>2.3.0</version>
</dependency>