-->

xlConnect R use of JVM memory

2019-01-27 05:46发布

问题:

I'm running into a problem with JVM memory using XLConnect (Mirai Solutions) in R.

Data loads into R just fine using loadWorkbook or readWorksheetFromFile, but larger data (data frames about 3MB) get stuck while being written to the JVM during export with any of the export functions (writeNamedRegion, writeWorksheetToFile, etc.), and R stops responding.

I've reset the java parameters using options(java.parameters = "-Xmx1500m"), and this increased the size of the data frames I was able to export to Excel, but R still slows around 1MB and won't work around 3MB.

I'm on a 64-bit Windows 7 system with 32-bit Office software and 32-bit Java on a machine with 8GB RAM. 3MB doesn't seem very big compared to the ~750 MB free memory in the JVM that is supposedly there at the beginning of export (checked with xlcMemoryReport).

Ideas?

回答1:

Given your reference value of 3MB I'm concluding you are trying to write a data.frame with numeric variables of dimension 10 columns x 40k rows (or comparable; the object.size of such a data.frame results in approx. 3.2MB).

Depending on if you are trying to write xls (BIFF8) or xlsx (OOXML) files, memory requirements can be quite different. Reason being that xlsx documents are actually compressed XML files and Apache POI (which is the underlying Java API that is used by XLConnect) uses xmlbeans to manipulate those - this can be quite memory intense. BIFF8 on the other hand is a binary data format and requires less memory.

You should be able to write a data.frame of before mentioned dimensions to an xlsx document with a max. heap size of 1024m, i.e. the following worked fine for me:

options(java.parameters = "-Xmx1024m") # required BEFORE any JVM is initialized in R
require(XLConnect)
tmp = as.data.frame(matrix(rnorm(4e5), ncol = 10))
writeWorksheetToFile(tmp, file = "test.xlsx", sheet = "test")

... using R 2.15.1 32-bit with RStudio, XLConnect 0.2-0 and JRE 1.6.0_25 (running on 32-bit Windows XP with 4GB of RAM).

For those interested in a more in-depth discussion of memory usage on the Apache POI side there is the following discussion: http://apache-poi.1045710.n5.nabble.com/HSSF-and-XSSF-memory-usage-some-numbers-td4312784.html