I am interested in exploring how R can handle data out-of-memory. I've found the bigmemory package and friends (bigtabulate and biganalytics), but was hoping that someone could point me to a worked-out example that uses file backing with these packages. Any other out-of-memory tips would also be appreciated.
I frequently work with large datasets. Even though my code has been optimized, I still launch Amazon EC2 instances from time to time because they give me access to far more resources than I have on my desktop. For example, an instance with 26 ECUs, 8 cores, and 68 GB of RAM costs only about $0.80 to $1.00 per hour (spot instance pricing).
If that seems reasonable, you can launch a public machine image that already has R installed and do the job in no time.
Charlie, just email Mike and Jay; they have a number of examples built around the ASA 'flights' database example from a year or two ago.
Edit: In fact, the Documentation tab has what I had in mind; the scripts are also on the site.
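In case it helps while you track those scripts down, here is a minimal sketch of the file-backing workflow in bigmemory/biganalytics. The file name "flights.csv", the assumption that its columns are all numeric, and the backing-file names are placeholders of my own, not the actual ASA scripts:
```r
# Minimal sketch of file-backed bigmemory usage; assumes a numeric-only CSV.
library(bigmemory)
library(biganalytics)

# Parse the CSV once into a file-backed big.matrix: the data live in
# "flights.bin" on disk and only a small descriptor stays in RAM.
x <- read.big.matrix("flights.csv", header = TRUE, type = "integer",
                     backingfile = "flights.bin",
                     descriptorfile = "flights.desc")

# Later sessions (or other R processes) can re-attach without re-parsing.
y <- attach.big.matrix("flights.desc")

# biganalytics operates on the file-backed object without loading it fully.
colmean(y, na.rm = TRUE)
```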
Take a look at the "CRAN Task View: High-Performance and Parallel Computing with R". There is a section, "Large memory and out-of-memory data", where several solutions are mentioned, for example the ff package.
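As a rough illustration of the ff approach (the file name "big.csv" and the chunk size are placeholders, and a mostly numeric CSV is assumed):
```r
# Sketch of reading a large CSV with ff; columns end up as memory-mapped
# files on disk instead of in-RAM vectors.
library(ff)

dat <- read.csv.ffdf(file = "big.csv", header = TRUE,
                     next.rows = 100000)  # rows parsed per chunk

dim(dat)           # dimensions are available without loading the data
mean(dat[[1]][])   # [[1]] is the first ff column; [] materializes it in RAM
```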