Invalid header signature; IOException with Apache

Posted 2019-01-20 08:16

Question:

I'm getting:

java.io.IOException: Invalid header signature; read 0x000201060000FFFE, expected 0xE11AB1A1E011CFD0

when trying to add some custom properties to an Excel document using Apache POI HPSF.

I'm completely sure the file is Excel OLE2 (not HTML, XML or something else that Excel doesn't complain about).

This is a relevant part of my code:

try {
    final POIFSFileSystem poifs = new POIFSFileSystem(event.getStream());
    final DirectoryEntry dir = poifs.getRoot();
    final DocumentEntry dsiEntry = (DocumentEntry)
            dir.getEntry(DocumentSummaryInformation.DEFAULT_STREAM_NAME);

    final DocumentInputStream dis = new DocumentInputStream(dsiEntry);
    final PropertySet props = new PropertySet(dis);
    dis.close();
    dsi = new DocumentSummaryInformation(props);
} catch (Exception ex) {
    throw new RuntimeException(
            "Cannot create POI SummaryInformation for event: " + event +
            ", path:" + event.getPath() +
            ", name:" + event.getName() +
            ", cause:" + ex);
}

I get the same error when trying with Word and PowerPoint files (also OLE2).

I'm completely out of ideas so any help/pointers are greatly appreciated :)

Answer 1:

If you flip the signature number round, you'll see the bytes of the start of your file:

0x000201060000FFFE -> 0xFE 0xFF 0x00 0x00 0x06 0x01 0x02 0x00

The first two bytes look like a Unicode BOM: 0xFE 0xFF is the UTF-16 big-endian mark. After that come some low control bytes, which as 16-bit little-endian values read 0, then 262 (0x0106), then 2, so maybe it isn't a text file after all.
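The "flipping" described above can be reproduced with a few lines of plain Java. This is only a minimal sketch, assuming the signature in the exception message is an 8-byte little-endian value; the class and method names (`SignatureBytes`, `toFileOrder`, `hex`) are made up for illustration and are not part of POI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SignatureBytes {
    /** Returns the 8 bytes of a header signature in on-disk (little-endian) order. */
    public static byte[] toFileOrder(long signature) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(signature)
                .array();
    }

    /** Formats bytes as space-separated two-digit upper-case hex. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two values quoted in the exception message from the question.
        System.out.println("read:     " + hex(toFileOrder(0x000201060000FFFEL)));
        System.out.println("expected: " + hex(toFileOrder(0xE11AB1A1E011CFD0L)));
    }
}
```

Comparing the two printed lines shows directly how the start of the stream differs from the byte sequence an OLE2 container begins with.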

That file really isn't an OLE2 file, and POI is right to give you the error. I don't know what it is, but I'm guessing that it might be part of an OLE2 file without its outer OLE2 wrapper. If you can open it with Office, do a save-as and POI should be able to open the result. As it stands, that header isn't an OLE2 file header, so POI can't open it for you.



Answer 2:

In my case, the file was a CSV file saved with the .xls extension. Excel was able to open it without a problem, but POI was not.
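Since the file extension proves nothing, one way to catch this case early is to look at the first bytes before handing the file to POI. The following is a rough sketch using only the JDK, not POI's own detection logic; `HeaderSniffer`, `Kind` and `classify` are hypothetical names, and the caller is assumed to have read the first 8 bytes of the file itself.

```java
public class HeaderSniffer {
    /** Coarse guess at the container format, based only on the leading bytes. */
    public enum Kind { OLE2, ZIP, OTHER }

    // The OLE2 signature from the exception message, in on-disk byte order.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header; .xlsx/.docx/.pptx files are ZIP archives.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 3, 4 };

    /** Classifies the leading bytes of a file; short input is treated as OTHER. */
    public static Kind classify(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(head, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        // CSV, HTML, a bare property stream, or anything else.
        return Kind.OTHER;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] csv = "id,name\n1,foo\n".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        byte[] fromQuestion = { (byte) 0xFE, (byte) 0xFF, 0, 0, 6, 1, 2, 0 };
        System.out.println("CSV text:            " + classify(csv));
        System.out.println("bytes from question: " + classify(fromQuestion));
    }
}
```

Only a result of `OLE2` is worth passing to `POIFSFileSystem`. A `ZIP` result suggests one of the newer Office formats, and `OTHER` covers text formats such as CSV or HTML that Excel opens without complaint.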

If I find a better/more general solution, I'll come back and write it up here.



Answer 3:

Try saving it directly as a CSV file and use opencsv for your operations.
See the following link to learn about opencsv:
http://opencsv.sourceforge.net/#what-is-opencsv

Excel can open a CSV file, an XLS file, or even an HTML table saved with the .xls extension.

So you can save the file as file_name.csv and use opencsv to read it in your code.

Alternatively, open the file once in Excel and use Save As to store it as an Excel 97-2003 workbook.

After that, POI itself can read the file :-)



Answer 4:

This happens because the file was saved by Excel 2013. Use Save As to store it in the Excel 97-2003 format.



Answer 5:

I had the same problem with an .xls file generated by other software. I have to re-save such files with Excel (in the same format) before Apache POI can read them.