The situation is as follows; I have a simple program which uses the Apache Poi Library to add one row of data at the end of the an exisiting xlsx file. See below
File file = new File(input);
XSSFWorkbook workbook = new XSSFWorkbook(file);
XSSFSheet sheet = workbook.getSheetAt(0);
XSSFRow row = sheet.createRow(sheet.getLastRowNum() + 1);
After this I will iterate over the row and set the CellValues. But the problem is that on the second line of the code, as shown above, I get an out of memory error. Is there a way to add a row of data to the existing xlsx file without having to read the file fully?
(not enough reputation to add this as a comment) Have you tried using SXSSFWorkbook instead of XSSFWorkbook?
You can try XSSF and SAX (Event API).
If getting the
XSSFWorkbook
fails because of out-of-memory error and the need is to read and write the workbook, then neitherSXSSF
norSAX
parser will help. The one is only for writing. The other is only for reading.Both approaches in follow needs knowledge about the
*.xlsx
file format which is Office Open XML. In general a*.xlsx
file is aZIP
archive containingXML
files and other files in a special directory structure. So one can unzip the*.xlsx
file using aZIP
software to have a look at theXML
files. The file format was first standardized by Ecma. So for further recherches I prefer Ecma Markup Language Reference. For example Row.The
ReadAndWriteTest.xlsx
used in both examples must have at least one worksheet and the first worksheet must have at least one row.One approach could be using the
DOM
methods of XMLBeans. My favorite reference for this is grepcode.Example:
This code writes 10 new Rows in sheet1 of
ReadAndWriteTest.xlsx
without opening the whole workbook. But it must at least opening and parsing the sheet1 and theSharedStringsTable
. If even this fails, then this approach is not usable.Another approach could be using StAX. This API can read and write XML event driven. And it uses streaming.
Example:
This code also writes 10 new Rows in sheet1 of
ReadAndWriteTest.xlsx
without opening the whole workbook. But it must at least opening and parsing theSharedStringsTable
. If even this fails, then this approach is also not usable. But of course even theSharedStringsTable
could be streamed using StAX. But as you see in example with generating the rows and cells, this is much more complicated. So using theSharedStringsTable
makes things easier in this example.