I have the following code that I use to get the sheet names of an excel file(.xlsx)
XSSFWorkbook workBookXlsx = new XSSFWorkbook(new FileInputStream(pathToFile));
ArrayList<String> sheetNames = new ArrayList<>();
int numberOfSheets = workBookXlsx.getNumberOfSheets();
for (int i = 0; i < numberOfSheets; i++) {
sheetNames.add(workBookXlsx.getSheetAt(i).getSheetName());
}
workBookXlsx = null;
The issue I have with the above code is that it takes a lot of memory(~700MB) & a long time(5-6s) to create the XSSFWorkbook
for a file of size 9MB. Even setting the workBookXlsx
to null
doesn't release the memory taken by the javaw
(I know gc
may or maynot be called & JVM wont release memory just because I have set a variable to null)
I did go through the documentation of Workbook, XSSFWorkbook & from what I understood, there is no method that will help me get the sheet names with low memory imprint.
The one solution I have found is to manually unzip the .xlsx
file and read the contents of the .\xl\woorkbook.xml
to get the sheet names and the r:id
Is there an API for getting the sheet names in an .xlsx
file without large memory imprint?
To show what @Gagravarr probably meant with his comment:
The XSSFReader contains a method XSSFReader.getSheetsData which "Returns an Iterator which will let you get at all the different Sheets in turn. Each sheet's InputStream is only opened when fetched from the Iterator. It's up to you to close the InputStreams when done with each one.". But as often this is not the whole truth. In truth it returns a XSSFReader.SheetIterator which has a method XSSFReader.SheetIterator.getSheetName to get the sheet names.
Example:
import java.io.InputStream;
import java.io.FileInputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import java.util.Iterator;
public class ExcelXSSFGetSheetNamesXSSFReader {
public static void main(String[] args) throws Exception {
OPCPackage pkg = OPCPackage.open(new FileInputStream("Example.xlsx"));
XSSFReader r = new XSSFReader( pkg );
Iterator<InputStream> sheets = r.getSheetsData();
if (sheets instanceof XSSFReader.SheetIterator) {
XSSFReader.SheetIterator sheetiterator = (XSSFReader.SheetIterator)sheets;
while (sheetiterator.hasNext()) {
InputStream dummy = sheetiterator.next();
System.out.println(sheetiterator.getSheetName());
dummy.close();
}
}
pkg.close();
}
}
Conclusion: Currently you cannot work with apache poi
only by trusting the API documentation. Instead you must always have a look at the source code.