Issue reading in a cell from Excel with Apache POI

2019-02-28 22:29发布

问题:

I am trying to use Apache POI to read in old (pre-2007 and XLS) Excel files. My program goes to the end of the rows and iterates back up until it finds something that's not either null or empty. Then it iterates back up a few times and grabs those cells. This program works just fine reading in XLSX and XLS files made in Office 2010.

I get the following error message:

Exception in thread "main" java.lang.NumberFormatException: empty String
    at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
    at java.lang.Double.parseDouble(Unknown Source)

at the line:

num = Double.parseDouble(str);

from the code:

str = cell.toString();

if (str != "" || str != null) {
    System.out.println("Cell is a string");
    num = Double.parseDouble(str);
} else {
    System.out.println("Cell is numeric.");
    num = cell.getNumericCellValue();
}

where the cell is the last cell in the document that's not empty or null. When I try to print the first cell that's not empty or null, it prints nothing, so I think I'm not accessing it correctly.

回答1:

It would be better to evaluate the cell type and then do what you need. I use this code to handle the cell data (check that I even handle blank cells):

switch (cell.getCellType()) {
    case Cell.CELL_TYPE_STRING:
        str = cell.toString().trim();
        break;
    case Cell.CELL_TYPE_NUMERIC:
        if (DateUtil.isCellDateFormatted(cell)) {
            //you should change this to your application date format
            objSimpleDateFormat = new SimpleDateFormat("dd/MM/yyyy");
            str = objSimpleDateFormat.format(cell.getDateCellValue());
        } else {
            num = cell.getNumericCellValue();
            str = String.valueOf(cell.getNumericCellValue());
        }
        break;
    case Cell.CELL_TYPE_BLANK:
        str = "";
        break;
    case Cell.CELL_TYPE_ERROR:
        str = "";
        break;
    case Cell.CELL_TYPE_BOOLEAN:
        str = String.valueOf(cell.getBooleanCellValue());
    break;
}


回答2:

Perhaps the reason why you're reading in blank cells is due to the usage of the right subcomponent of Apache POI to read Excel files. Use HSSF (Horrible SpreadSheet Format) for XLS formats and XSSF (XML SpreadSheet Format) for XLSX formats.


As for the code itself, you may want to refine your boolean expression. The way you have it now, since you're using the or operator (||),

  • the first part of your if statement will execute if str != null, and
  • the else part of your if statement will execute if str == null.

The first part of the if statement will throw a NumberFormatException in calling Double.parseDouble if str cannot be parsed as a number.

Maybe the following code snippet will help you:

if (str == null || str.trim().isEmpty()) {
    // handle null and empty strings
} else if (cell.getType() == Cell.CELL_TYPE_NUMERIC) {
    System.out.println("Cell is numeric.");
    num = cell.getNumericCellValue();
} else {
    // If the cell is not numeric, Double.parseDouble(str) 
    // will most likely throw a NumberFormatException
}

To find out more about Cell, read its Javadoc.



回答3:

It would be nice if we all know what code line number caused the exception.

I suspect your first line of code is the cause. Object cell could be null, and a null address can not be converted to a String type. You can check by code.

Note: It's good that the code works with Office 2010 but I think this type of problem can happen in any Excel version.