Interactively work with list objects that take up

Published 2020-07-07 10:36

Question:

I have recently discovered the wonders of the packages bigmemory, ff and filehash to handle very large matrices.

How can I handle very large (300MB++) lists? I work with these lists all day, every day. I can apply band-aid solutions with save() & load() hacks everywhere, but I would prefer a bigmemory-like solution. Something like a bigmemory big.matrix would be ideal, where I work with it basically identically to a matrix except that it takes up something like 660 bytes in my RAM.


These lists are mostly lists of length >1000 containing lm() objects (or similar regression objects). For example,

Y <- rnorm(1000) ; X <- rnorm(1000)
A <- lapply(1:6000, function(i) lm(Y~X))
B <- lapply(1:6000, function(i) lm(Y~X))
C <- lapply(1:6000, function(i) lm(Y~X))
D <- lapply(1:6000, function(i) lm(Y~X))
E <- lapply(1:6000, function(i) lm(Y~X))
F <- lapply(1:6000, function(i) lm(Y~X))

In my project I will have A,B,C,D,E,F-type lists (and even more than this) that I have to work with interactively.

If these were gigantic matrices there would be a tonne of support. I was wondering whether any package offers similar support for large list objects.

Answer 1:

You can store and access lists on disk using the filehash package. This should work (if rather slowly on my machine...):

Y <- rnorm(1000) ; X <- rnorm(1000)

# set up disk object
library(filehash)
dbCreate("myTestDB")
db <- dbInit("myTestDB")

db$A <- lapply(1:6000, function(i) lm(Y~X))
db$B <- lapply(1:6000, function(i) lm(Y~X))
db$C <- lapply(1:6000, function(i) lm(Y~X))
db$D <- lapply(1:6000, function(i) lm(Y~X))
db$E <- lapply(1:6000, function(i) lm(Y~X))
db$F <- lapply(1:6000, function(i) lm(Y~X))

List items can be accessed using the [ function. For more details, see the filehash vignette: http://cran.r-project.org/web/packages/filehash/vignettes/filehash.pdf
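
As a rough illustration of reading the stored lists back, here is a minimal sketch. It assumes a throwaway database at a tempfile() path, much smaller lists than in the question so it runs quickly, and a hypothetical key scheme ("B_0001", "B_0002", ...) that is not part of the original answer. Note that db$A reads the whole list from disk, so storing each model under its own key may be worth considering if only a few models are needed at a time.

```r
library(filehash)

# Assumption: a temporary database path, so no existing file is overwritten
db_path <- tempfile("myTestDB")
dbCreate(db_path)
db <- dbInit(db_path)

set.seed(1)
Y <- rnorm(100); X <- rnorm(100)

# Store a small list of models under a single key, as in the answer above
db$A <- lapply(1:5, function(i) lm(Y ~ X))

dbList(db)               # keys currently stored in the database
A <- db$A                # reads the whole list back into memory
A2 <- db[["A"]]          # same fetch, using [[ instead of $
first_fit <- db$A[[1]]   # still loads all of A, then picks element 1
coef(first_fit)

# Alternative (hypothetical key scheme): one key per model, so a single
# model can be fetched without loading the rest of the list
fits <- lapply(1:5, function(i) lm(Y ~ X))
for (i in seq_along(fits)) {
  dbInsert(db, sprintf("B_%04d", i), fits[[i]])
}
one <- dbFetch(db, "B_0003")
coef(one)
```

When the database is no longer needed, dbUnlink(db) removes it from disk.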