Finding the last row in an Excel spreadsheet

Posted 2019-01-26 06:05

Question:

I'm trying to find the index of the last row in an Excel spreadsheet using Apache POI for Java.

I thought this should be possible with getLastRowNum() or getPhysicalNumberOfRows(), but they don't seem to give the right results. For example, I have a one-line spreadsheet and these two functions return a value of 1140. Another spreadsheet with two lines returns 1162.

The other problem is that I cannot just look for the first empty row, since it may be possible to have empty rows between rows of valid data.

So is there a way to find the index of the last row? I suppose I could make it a requirement to not have empty rows between data, but I was hoping for a better solution.

Edit: For the record, using an iterator didn't help. It just iterated over the 1140/1162 supposed rows.

Answer 1:

I get the expected output using poi-3.6-20091214 and a test.xls having two empty rows followed by three occupied rows:

InputStream myxls = new FileInputStream("test.xls");
Workbook book = new HSSFWorkbook(myxls);
Sheet sheet = book.getSheetAt(0);
// getLastRowNum() is 0-based: rows 0-1 are empty, rows 2-4 hold data
System.out.println(sheet.getLastRowNum());

Output: 4



Answer 2:

You can use the following method to get the count of physically defined rows:

HSSFSheet worksheet = workbook.getSheet("Role_Mapping");
int rowsNum = worksheet.getPhysicalNumberOfRows();
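
A count of defined rows and the index of the last row only agree (up to an offset of one) when the sheet has no gaps. The sketch below models that difference without POI: SparseSheetModel and its methods are hypothetical stand-ins, written on the assumption that getPhysicalNumberOfRows() counts the rows that exist in the file and getLastRowNum() returns the 0-based index of the last such row.

```java
import java.util.TreeMap;

// Hypothetical stand-in for a sheet: row index -> cell values. Not POI API.
class SparseSheetModel {
    private final TreeMap<Integer, String[]> rows = new TreeMap<>();

    void putRow(int index, String... cells) {
        rows.put(index, cells);
    }

    // Models getPhysicalNumberOfRows(): how many rows are defined.
    int physicalNumberOfRows() {
        return rows.size();
    }

    // Models getLastRowNum(): 0-based index of the last defined row.
    // Returns -1 for an empty model; what POI returns for an empty
    // sheet depends on the version.
    int lastRowNum() {
        return rows.isEmpty() ? -1 : rows.lastKey();
    }

    public static void main(String[] args) {
        SparseSheetModel sheet = new SparseSheetModel();
        sheet.putRow(0, "header");
        sheet.putRow(1, "a");
        sheet.putRow(5, "b"); // rows 2-4 were never created
        System.out.println(sheet.physicalNumberOfRows()); // prints 3
        System.out.println(sheet.lastRowNum());           // prints 5
    }
}
```

With gaps in the data, neither number on its own says which row is the last one that actually holds values.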


Answer 3:

I know how to solve your problem using VBA, but I'm not sure how to get the equivalent information from the Apache POI interface. In VBA, to get the range of used cells in worksheet "Sheet1", use:

Worksheets("Sheet1").UsedRange

This returns a Range object which has properties that provide further information. For example, to get the number of rows in this Range, use:

Worksheets("Sheet1").UsedRange.Rows.Count

Again, I'm not sure whether this is accessible via the POI API, but if not, perhaps it provides a way of executing arbitrary snippets of VBA?



Answer 4:

The only way to know for sure is to test the rows. Here's the solution I'm using for the same problem:

int lastRowIndex = -1;
if( sheet.getPhysicalNumberOfRows() > 0 )
{
    // getLastRowNum() actually returns an index, not a row number
    lastRowIndex = sheet.getLastRowNum();

    // now, start at end of spreadsheet and work our way backwards until we find a row having data
    for( ; lastRowIndex >= 0; lastRowIndex-- ){
        Row row = sheet.getRow( lastRowIndex );
        if( row != null ){
            break;
        }
    }
}

Note: this doesn't catch rows that exist but are effectively empty, such as rows whose cells contain only an empty string. For that, you need a more complete solution like:

private int determineRowCount()
{
    this.evaluator = workbook.getCreationHelper().createFormulaEvaluator();
    this.formatter = new DataFormatter( true );

    int lastRowIndex = -1;
    if( sheet.getPhysicalNumberOfRows() > 0 )
    {
        // getLastRowNum() actually returns an index, not a row number
        lastRowIndex = sheet.getLastRowNum();

        // now, start at end of spreadsheet and work our way backwards until we find a row having data
        for( ; lastRowIndex >= 0; lastRowIndex-- )
        {
            Row row = sheet.getRow( lastRowIndex );
            if( !isRowEmpty( row ) )
            {
                break;
            }
        }
    }
    return lastRowIndex;
}

/**
 * Determine whether a row is effectively completely empty - i.e. all cells either contain an empty string or nothing.
 */
private boolean isRowEmpty( Row row )
{
    if( row == null ){
        return true;
    }

    // getLastCellNum() already returns the last cell index plus one (or -1 for a row without cells)
    int cellCount = row.getLastCellNum();
    for( int i = 0; i < cellCount; i++ ){
        String cellValue = getCellValue( row, i );
        if( cellValue != null && cellValue.length() > 0 ){
            return false;
        }
    }
    return true;
}

/**
 * Get the effective value of a cell, formatted according to the formatting of the cell.
 * If the cell contains a formula, it is evaluated first, then the result is formatted.
 * 
 * @param row the row
 * @param columnIndex the cell's column index
 * @return the cell's value
 */
private String getCellValue( Row row, int columnIndex )
{
    String cellValue;
    Cell cell = row.getCell( columnIndex );
    if( cell == null ){
        // no data in this cell
        cellValue = null;
    }
    else{
        // (newer POI releases use the CellType.FORMULA enum instead of this int constant)
        if( cell.getCellType() != Cell.CELL_TYPE_FORMULA ){
            // cell has a value, so format it into a string
            cellValue = this.formatter.formatCellValue( cell );
        }
        else {
            // cell has a formula, so evaluate it
            cellValue = this.formatter.formatCellValue( cell, this.evaluator );
        }
    }
    return cellValue;
}


Answer 5:

You can do this by the following code:

SVTableModel model = new SVTableModel(sheet);
lastRowNum = model.getRowCount();

However, I tried this with Apache POI 3.7 and couldn't find SVTableModel in the API. I guess it has been removed since 3.2.



Answer 6:

I had the same problem before. I use this trick to get the correct row value back:

  1. Open the Excel file and go to the relevant sheet.
  2. Select the row after the last row of data. E.g. if you have 12 rows with data, click on row 13, then scroll down to row 2000 and extend the selection to it. Make sure that all of those rows are selected.
  3. Right-click on the selected rows and choose Delete.
  4. Rerun the code and check the returned value.

This is not an issue with the POI library.
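
The manual cleanup above amounts to dropping rows that still exist at the end of the sheet but hold no data. A POI-free sketch of that idea follows. TrailingRowTrimmer and its methods are hypothetical names, and the sheet is modeled as a map from row index to cell strings rather than a real workbook.

```java
import java.util.Arrays;
import java.util.TreeMap;

// Hypothetical model, not POI API: row index -> cell values.
class TrailingRowTrimmer {

    // A row counts as blank when it has no cells or only null/empty strings.
    static boolean isBlank(String[] cells) {
        return cells == null
                || Arrays.stream(cells).allMatch(c -> c == null || c.isEmpty());
    }

    // Removes defined-but-blank rows from the end; blank rows between data stay.
    // Returns the index of the last row with data, or -1 if there is none.
    static int trimTrailingBlankRows(TreeMap<Integer, String[]> rows) {
        while (!rows.isEmpty() && isBlank(rows.lastEntry().getValue())) {
            rows.pollLastEntry();
        }
        return rows.isEmpty() ? -1 : rows.lastKey();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String[]> rows = new TreeMap<>();
        rows.put(0, new String[] {"name", "role"});
        rows.put(1, new String[] {});            // blank row inside the data
        rows.put(2, new String[] {"alice", "admin"});
        rows.put(1139, new String[] {"", null}); // leftover blank row
        System.out.println(trimTrailingBlankRows(rows)); // prints 2
        System.out.println(rows.size());                 // prints 3
    }
}
```

Only the trailing blank row is removed here, so a sheet that legitimately has empty rows between rows of data keeps them.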



Answer 7:

int total = sheet.getPhysicalNumberOfRows() - sheet.getLastRowNum();