I'm trying to use the bigmemory package in R and I'm stuck at the very beginning. I do:
temp <- matrix(paste("a",1:10), 5, 2)
and get a character matrix. That's OK. But then I try:
x <- as.big.matrix(temp, type="char")
and I get a matrix full of NA and the following message:
Assignment will down cast from double to char
Hint: To remove this warning type: options(bigmemory.typecast.warning=FALSE)
Warning messages:
1: In as.big.matrix(temp, type = "char") : Casting to numeric type
2: In matrix(as.numeric(x), nrow = nrow(x), dimnames = dimnames(x)) :
NAs introduced by coercion
3: In SetElements.bm(x, i, j, value) :
I'm not sure what's going on, but it looks like big.matrix tries to convert all my text into numbers despite type = "char". How can I make it work?
The type name is a bit of a misnomer - big.matrix objects only store numeric data types. The 'char' type is the C++ char data type, which stores small integer values such as ASCII character codes (a single character per element, not a character string). That is why as.big.matrix runs your strings through as.numeric, as the second warning shows, and you end up with NA. To store character strings in a big.matrix, you'll have to re-code the strings as numeric values (or convert to factors, then to numeric values).
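A minimal sketch of that re-coding, using the matrix from the question. The variable names (lev, codes, decoded) are only illustrative, and I'm assuming the lookup table of distinct strings is small enough to keep in ordinary R memory next to the big.matrix. I use type = "integer" rather than "char" so the number of distinct strings is not limited to what fits in a single byte.

```r
library(bigmemory)

temp <- matrix(paste("a", 1:10), 5, 2)

# Lookup table of the distinct strings (kept as a normal R vector).
lev <- unique(as.vector(temp))

# Replace every string by its position in the lookup table.
codes <- matrix(match(temp, lev), nrow = nrow(temp), ncol = ncol(temp))

# The integer codes can be stored in a big.matrix without any coercion.
x <- as.big.matrix(codes, type = "integer")

# To get the strings back, read the codes and index into the lookup table.
decoded <- matrix(lev[as.vector(x[, ])], nrow = nrow(x), ncol = ncol(x))
```

The big.matrix itself only ever holds integers here; you have to keep lev around (for example, save it alongside the backing file if you use one) to be able to decode later.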
If you need to store character data in a very large data set, you may want to look into the 'ff' package. In my experience it has a steep learning curve and the documentation is somewhat lacking, but it does have that functionality.
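As a rough illustration of the ff route: as far as I know, ff does not store free-form strings directly either, but it can hold a factor and keeps the levels for you. The sketch below assumes the ff package is installed and that your strings can reasonably be treated as a factor; strings, f and xf are just example names.

```r
library(ff)

strings <- paste("a", 1:10)

# Convert to a factor first; ff stores the integer codes on disk
# and keeps the levels attached to the ff object.
f <- factor(strings)
xf <- ff(f)

# Reading with [] pulls the data back into memory as a regular factor.
back <- as.character(xf[])
```

This is essentially the same integer-coding idea as with big.matrix, except that ff carries the levels along with the data.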
For further details on dealing with large data sets, you can check out the CRAN Task View here: http://cran.r-project.org/web/views/HighPerformanceComputing.html