Invalid header reading xls file

2019-02-16 21:47发布

问题:

I am reading one excel file on my local system. I am using POI jar Version 3.7, but getting error Invalid header signature; read -2300849302551019537 or in Hex 0xE011BDBFEFBDBFEF , expected -2226271756974174256 or in Hex 0xE11AB1A1E011CFD0.

Opening the xls file with Excel works fine.

The codeblock where it happens: Anybody an idea ?

/**
 * create a new HeaderBlockReader from an InputStream
 *
 * @param stream the source InputStream
 *
 * @exception IOException on errors or bad data
 */
public HeaderBlockReader(InputStream stream) throws IOException {
    // At this point, we don't know how big our
    //  block sizes are
    // So, read the first 32 bytes to check, then
    //  read the rest of the block
    byte[] blockStart = new byte[32];
    int bsCount = IOUtils.readFully(stream, blockStart);
    if(bsCount != 32) {
        throw alertShortRead(bsCount, 32);
    }

    // verify signature
    long signature = LittleEndian.getLong(blockStart, _signature_offset);

    if (signature != _signature) {
        // Is it one of the usual suspects?
        byte[] OOXML_FILE_HEADER = POIFSConstants.OOXML_FILE_HEADER;
        if(blockStart[0] == OOXML_FILE_HEADER[0] &&
            blockStart[1] == OOXML_FILE_HEADER[1] &&
            blockStart[2] == OOXML_FILE_HEADER[2] &&
            blockStart[3] == OOXML_FILE_HEADER[3]) {
            throw new OfficeXmlFileException("The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)");
        }
        if ((signature & 0xFF8FFFFFFFFFFFFFL) == 0x0010000200040009L) {
            // BIFF2 raw stream starts with BOF (sid=0x0009, size=0x0004, data=0x00t0)
            throw new IllegalArgumentException("The supplied data appears to be in BIFF2 format.  "
                    + "POI only supports BIFF8 format");
        }

        // Give a generic error
        throw new IOException("Invalid header signature; read "
                              + longToHex(signature) + ", expected "
                              + longToHex(_signature));
    }

回答1:

Just an idee, if you using maven make sure in the resource tag filtering is set to false. Otherwise maven tends to corrupt xls files in the copying phase



回答2:

That exception is telling you that your file isn't a valid OLE2-based .xls file.

Being able to open the file in Excel is no real guide - Excel will happily open any file it knows about no matter what the extension is on it. If you take a .csv file and rename it to .xls, Excel will still open it, but the renaming hasn't magically made it be in the .xls format so POI won't open it for you.

If you open the file in Excel and do Save-As, it'll let you write it out as a real Excel file. If you want to know what file it really is, try using Apache Tika - the Tika CLI with --detect ought to be able to tell you

.

How can I be sure it's not a valid file? If you look at the OLE2 file format specification doc from Microsoft, and head to section 2.2 you'll see the following:

Header Signature (8 bytes): Identification signature for the compound file structure, and MUST be set to the value 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1.

Flip those bytes round (OLE2 is little endian) and you get 0xE11AB1A1E011CFD0, the magic number from the exception. Your file doesn't start with that magic number, as so really isn't a valid OLE2 document, and hence POI gives you that exception.



回答3:

If your project is maven project, the following code may help:

/**
 * Get input stream of excel.
 * <p>
 *     Get excel from src dir instead of target dir to avoid causing POI header exception.
 * </p>
 * @param fileName file in dir PROJECT_PATH/src/test/resources/excel/ , proceeding '/' is not needed.
 * @return
 */
private static InputStream getExcelInputStream(String fileName){
    InputStream inputStream = null;
    try{
        inputStream = new FileInputStream(getProjectPath() + "/src/test/resources/excel/" + fileName);
    }catch (URISyntaxException uriE){
        uriE.printStackTrace();
    }catch (FileNotFoundException fileE){
        fileE.printStackTrace();
    }
    return inputStream;
}

private static String getProjectPath() throws URISyntaxException{
    URL url = YourServiceImplTest.class.getResource("/");
    Path path = Paths.get(url.toURI());
    Path subPath = path.subpath(0, path.getNameCount() -2);
    return "/" + subPath.toString();
}