I am using POI in my J2EE web application to generate a workbook. However, i find that POI takes around 3 mins to create a workbook with 25K rows(with around 15 columns each). Is this a POI performance issue , or is it justified to take that much of time? Are there other APIs known for better performance ?
问题:
回答1:
I would be very surprised to see POI take that much time to generate such a file. I just generated a sheet with 30000 rows x 10 cells in about 18s (no formatting, to be fair). The cause might be one of the following:
- POI logging might be turned on, as described here
- you are running from swap memory
- your VM available heap might be very low
回答2:
The performance of writing large files with POI can be heavily reduced if you used the 'streaming' POI API instead of the standard one. Indeed by default POI will keep all your data in memory before writing all in one go at the end. The memory footprint of this can be ridiculously large for big files. Instead using the streaming API you can control how memory is used and data is written to disk progressively.
In order to create a streaming workbook, use something like :
SXSSFWorkbook book = new SXSSFWorkbook();
book.setCompressTempFiles(true);
SXSSFSheet sheet = (SXSSFSheet) book.createSheet();
sheet.setRandomAccessWindowSize(100);// keep 100 rows in memory, exceeding rows will be flushed to disk
// ...
回答3:
If none of the other answers work out, see if Andy Khan's JExcel will be better. I've found it to be far superior to POI for dealing with Excel in Java.
回答4:
We also use POI in our web app and do not have any performance issue with it - although our generated documents are far smaller than yours. I would first check if POI is the real issue here. Try to generate those documents without the J2EE-overhead (Unit-Test) and measure the performance. You could also monitor the load and memory usage on your J2EE server to see if the problems come from some suboptimal system settings.
回答5:
I've compared Apache POI with JExcel library. It seems that JExcel is about up to 4x faster than Apache POI but memory consumption seems to be more or less the same:
@Test
public void createJExcelWorkbook() throws Exception {
WritableWorkbook workbook = Workbook.createWorkbook(new File("jexcel_workbook.xls"));
WritableSheet sheet = workbook.createSheet("sheet", 0);
for ( int i=0; i < 65535; i++) {
for ( int j=0; j < 10; j++) {
Label label = new Label(j, i, "some text " + i + " " + j);
sheet.addCell(label);
}
}
workbook.write();
workbook.close();
}
@Test
public void createPoiWorkbook() throws Exception {
Workbook wb = new HSSFWorkbook();
Sheet sheet = wb.createSheet("sheet");
for ( int i=0; i < 65535; i++) {
Row row = sheet.createRow(i);
for ( int j=0; j < 10; j++) {
Cell cell = row.createCell(j);
cell.setCellValue("some text " + i + " " + j);
}
}
FileOutputStream fileOut = new FileOutputStream("poi_workbook.xls");
wb.write(fileOut);
fileOut.close();
}
I've tested it with JExcel version 2.6.12 and Apache POI version 3.7. You need to download the latest library versions yourself and run the simple tests above to get more accurate numbers.
<dependency org="org.apache.poi" name="poi" rev="3.7"/>
<dependency org="net.sourceforge.jexcelapi" name="jxl" rev="2.6.12"/>
Note: there is a limit in Apache POI of 65535 rows per sheet.