I am developing a web application which reads data from an Excel file (.xlsx). I am using POI for reading the Excel sheet. The problem is that when I try to read the file, the server throws the following error:
The Excel file I am trying to read is almost 80 MB in size. Any solution to this problem?
Actually, the user uploads the file, and after saving it to disk the application tries to read it.
The code snippet I am using for testing is:
File savedFile = new File(file_path);
FileInputStream fis = null;
try {
    fis = new FileInputStream(savedFile);
    // Loads the whole workbook into memory
    XSSFWorkbook xWorkbook = new XSSFWorkbook(fis);
    XSSFSheet xSheet = xWorkbook.getSheetAt(5);
    Iterator rows = xSheet.rowIterator();
    while (rows.hasNext()) {
        XSSFRow row = (XSSFRow) rows.next();
        Iterator cells = row.cellIterator();
        List data = new ArrayList();
        while (cells.hasNext()) {
            XSSFCell cell = (XSSFCell) cells.next();
            System.out.println(cell.getStringCellValue());
            data.add(cell);
        }
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // Close the stream even if reading fails
    if (fis != null) {
        try {
            fis.close();
        } catch (IOException ignored) {
        }
    }
}
One thing that'll make a small difference is how you open the file in the first place. If you have a File, pass that in directly! Reading from an InputStream requires POI to buffer everything into memory, which eats up space. Since you don't need that buffering, don't do it!
If you're running with the latest nightly builds of POI, then it's very easy. Your code becomes:
File file = new File(file_path);
OPCPackage opcPackage = OPCPackage.open(file);
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);
Otherwise, it's very similar:
File file = new File(file_path);
OPCPackage opcPackage = OPCPackage.open(file.getAbsolutePath());
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);
That'll free you a bit of memory, which might be enough. If it isn't, and if you can't increase your Java heap space enough to cope, then you'll have to stop using the XSSF UserModel.
In addition to the current, friendly UserModel that you've been using, POI also supports a lower level way to process files. This lower level way is harder to use, as you don't have the various helpers around that require the whole file in memory. However, it's much much more memory efficient, as you process the file in a streaming way. To get started, see the XSSF and SAX (Event API) How-To section on the POI website. Try that out, and also have a look at the various examples.
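Roughly, the streaming approach looks like the sketch below, condensed from the ideas in that how-to. The file path and the sheet relationship id ("rId6") are placeholders, and the exact SharedStringsTable accessors differ slightly between POI versions:
import java.io.InputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

public class StreamingSheetDump {

    public static void main(String[] args) throws Exception {
        // Open from a path/File, not an InputStream, to keep memory use down
        OPCPackage pkg = OPCPackage.open("/path/to/big-file.xlsx");
        XSSFReader reader = new XSSFReader(pkg);
        SharedStringsTable sst = reader.getSharedStringsTable();

        XMLReader parser = XMLReaderFactory.createXMLReader();
        parser.setContentHandler(new SheetHandler(sst));

        // "rId6" is a placeholder; in real code walk reader.getSheetsData()
        // to find the sheet you actually want
        InputStream sheet = reader.getSheet("rId6");
        parser.parse(new InputSource(sheet));
        sheet.close();
        pkg.close();
    }

    // SAX handler: handles one cell at a time, so the whole sheet is never in memory
    private static class SheetHandler extends DefaultHandler {
        private final SharedStringsTable sst;
        private String lastContents;
        private boolean nextIsString;

        private SheetHandler(SharedStringsTable sst) {
            this.sst = sst;
        }

        @Override
        public void startElement(String uri, String localName, String name, Attributes attributes) {
            if ("c".equals(name)) {                                  // c = cell
                nextIsString = "s".equals(attributes.getValue("t")); // t="s" = shared string
            }
            lastContents = "";
        }

        @Override
        public void endElement(String uri, String localName, String name) {
            if (nextIsString) {
                // The cell value is an index into the shared strings table
                int idx = Integer.parseInt(lastContents);
                lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
                nextIsString = false;
            }
            if ("v".equals(name)) {                      // v = cell value
                System.out.println(lastContents);        // process/insert the value here
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            lastContents += new String(ch, start, length);
        }
    }
}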
You should probably change the settings of your JVM. Try adding -Xms1024m -Xmx1024m to the launcher.
You could try to increase your Java heap size.
I think you have to increase the size of the heap. You can do it by editing the catalina.bat file: add -Xms1024m -Xmx1024m to the CATALINA_OPTS variable.
- Xms = initial Java heap size
- Xmx = maximum Java heap size
EDIT:
from Catalina.bat
rem CATALINA_OPTS (Optional) Java runtime options used when the "start",
rem "run" or "debug" command is executed.
rem Include here and not in JAVA_OPTS all options, that should
rem only be used by Tomcat itself, not by the stop process,
rem the version command etc.
rem Examples are heap size, GC logging, JMX ports etc.
I have solved the problem by changing the implementation. Originally I fetched all the data from the Excel file and stored it in an ArrayList, and only afterwards inserted it into the DB; that was the real problem. Now I am not storing the data at all: as soon as I read a record from the sheet, I insert it into the DB immediately instead of adding it to the ArrayList. I know this one-by-one insertion is not a good approach, but for the time being I am using it. If I find a better one in the future, I will definitely switch to it. Thanks to all.
An improvement to your current approach would be to read around 100 rows from Excel (experiment with this figure to find the optimum value) and do a batch update in the database. This will be much faster.
You can also perform some optimizations in your code: move the list creation out of the outer loop (the loop that reads row data)
List data = new ArrayList();
Read the contents of all the cells in a row into a string buffer (possibly comma-delimited) and then add it to the "data" ArrayList.
You are currently adding objects of type XSSFCell to the ArrayList. There is no point in storing the whole cell object; take out its contents and discard the object.
Later, before inserting the contents into the database, split the delimited cell contents and perform the batch insertion, roughly as in the sketch below.
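A minimal sketch of the batched insert with plain JDBC (the table name excel_data, its three columns, and the fixed row width are assumptions for illustration; adapt them to your schema):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchInserter {

    private static final int BATCH_SIZE = 100; // experiment to find the optimum value

    // rows: each element holds one Excel row's cell contents, already extracted as strings
    public static void insertRows(Connection conn, List<String[]> rows) throws SQLException {
        // "excel_data" and its columns are placeholders for your real schema
        String sql = "INSERT INTO excel_data (col1, col2, col3) VALUES (?, ?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int count = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.setString(3, row[2]);
                ps.addBatch();
                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch(); // send every BATCH_SIZE rows to the DB in one round trip
                }
            }
            ps.executeBatch();         // flush any remaining rows
        }
    }
}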
Hope this helps!
You would be better off storing them in a file and loading that into the database at the end. This will avoid single inserts.
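A rough sketch of that idea (the staging file name is made up, CSV quoting is omitted for brevity, and the final bulk-load command depends on your database):
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.List;

public class CsvSpooler {

    // Appends one Excel row (cells already extracted as strings) to a CSV staging file.
    // Proper quoting/escaping is omitted; use a CSV library for real data.
    public static void appendRow(BufferedWriter out, List<String> cells) throws IOException {
        out.write(String.join(",", cells));
        out.newLine();
    }

    public static void main(String[] args) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter("staging.csv"))) {
            // call appendRow(out, cells) for each row while streaming through the sheet
        }
        // Then load staging.csv into the database in one go with its bulk-load
        // facility, e.g. MySQL's LOAD DATA INFILE or PostgreSQL's COPY.
    }
}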