Invalid Header Signature - Opening XLS with Apache

Posted 2019-07-17 08:01

Question:

I'm trying to convert an XLS file into a CSV file in Java using Apache POI 3.9, but I'm running into a problem. When I try to convert the file I need, I get the following error:

java.io.IOException: Invalid header signature; read 0x0010000000080209, expected 0xE11AB1A1E011CFD0
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:140)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
    at ExtractExcelToCSV.convertExcelToCsv(ExtractExcelToCSV.java:26)
    at ExtractExcelToCSV.main(ExtractExcelToCSV.java:60)

I believe the code itself is correct, since it works with other files. I think the problem is the XLS file: when I open it in MS Excel, it also shows a warning about the file type (it says it is an MS Excel 3 Worksheet). Is there any way to open these files using POI?

public static void convertExcelToCsv() throws IOException {
    try {
        // cellGrid is a field declared elsewhere in the class
        cellGrid = new ArrayList<List<HSSFCell>>();
        FileInputStream myInput = new FileInputStream("D:\\...\\filename.xls");

        // The IOException in the stack trace above is thrown here (line 26)
        POIFSFileSystem myFileSystem = new POIFSFileSystem(myInput);
        HSSFWorkbook myWorkBook = new HSSFWorkbook(myFileSystem);
        HSSFSheet mySheet = myWorkBook.getSheetAt(0);
        Iterator<?> rowIter = mySheet.rowIterator();

        // Collect every cell of every row into cellGrid
        while (rowIter.hasNext()) {
            HSSFRow myRow = (HSSFRow) rowIter.next();
            Iterator<?> cellIter = myRow.cellIterator();
            List<HSSFCell> cellRowList = new ArrayList<HSSFCell>();
            while (cellIter.hasNext()) {
                HSSFCell myCell = (HSSFCell) cellIter.next();
                cellRowList.add(myCell);
            }
            cellGrid.add(cellRowList);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}

Answer 1:

I had a similar issue. Even though the file had the extension .xls, it was NOT an Excel file! As a comment here suggested, doing "Save As" in Excel may tell you what the format actually is. In my case it was a tab-delimited file, so I parsed it without using Apache POI. Hope this helps.
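
If Excel is not at hand, the same check can be done in code before handing the file to POI. The following is only a rough sketch using plain Java, not anything from POI itself: the class name XlsFormatCheck and its methods are made up for illustration. It reads the first 8 bytes of the file and compares them with the OLE2 signature that the exception mentions (0xE11AB1A1E011CFD0 is those bytes read as a little-endian long, so on disk they are D0 CF 11 E0 A1 B1 1A E1). If the signature does not match, it assumes the file is UTF-8 tab-delimited text, as in my case, and prints it as CSV. That assumption will not hold for every file; the bytes in the question's error, for example, start with 09 02, which would be consistent with the old Excel 3 binary format that Excel warned about rather than with text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.stream.Collectors;

public class XlsFormatCheck {

    // First 8 bytes of an OLE2 container, in on-disk order.
    private static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if the given bytes start with the OLE2 signature. */
    public static boolean isOle2(byte[] header) {
        return header.length >= OLE2_SIGNATURE.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_SIGNATURE.length), OLE2_SIGNATURE);
    }

    /** Reads up to the first 8 bytes of the file (fewer if the file is shorter). */
    public static byte[] readHeader(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[OLE2_SIGNATURE.length];
            int total = 0;
            while (total < buf.length) {
                int n = in.read(buf, total, buf.length - total);
                if (n < 0) {
                    break; // end of file reached
                }
                total += n;
            }
            return Arrays.copyOf(buf, total);
        }
    }

    /** Converts one tab-delimited line to a CSV line, quoting fields when needed. */
    public static String tabLineToCsv(String line) {
        return Arrays.stream(line.split("\t", -1))
            .map(XlsFormatCheck::quote)
            .collect(Collectors.joining(","));
    }

    // Wraps a field in double quotes if it contains a comma, quote, or newline.
    private static String quote(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);
        if (isOle2(readHeader(file))) {
            System.out.println("OLE2 signature found: try opening it with POI.");
        } else {
            // Assumption: the file is UTF-8, tab-delimited text.
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                System.out.println(tabLineToCsv(line));
            }
        }
    }
}
```

Run it with the path to the suspicious .xls file as the only argument. Checking the signature first means a mislabeled file is routed to a plain-text parser instead of failing inside POIFSFileSystem.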