I have a problem using data.table: How do I convert column classes? Here is a simple example: With data.frame I don't have a problem converting it, with data.table I just don't know how:
df <- data.frame(ID=c(rep("A", 5), rep("B",5)), Quarter=c(1:5, 1:5), value=rnorm(10))
#One way: http://stackoverflow.com/questions/2851015/r-convert-data-frame-columns-from-factors-to-characters
df <- data.frame(lapply(df, as.character), stringsAsFactors=FALSE)
#Another way
df[, "value"] <- as.numeric(df[, "value"])
library(data.table)
dt <- data.table(ID=c(rep("A", 5), rep("B",5)), Quarter=c(1:5, 1:5), value=rnorm(10))
dt <- data.table(lapply(dt, as.character), stringsAsFactors=FALSE)
#Error in rep("", ncol(xi)) : invalid 'times' argument
#Produces error, does data.table not have the option stringsAsFactors?
dt[, "ID", with=FALSE] <- as.character(dt[, "ID", with=FALSE])
#Produces error: Error in `[<-.data.table`(`*tmp*`, , "ID", with = FALSE, value = "c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)") :
#unused argument(s) (with = FALSE)
Do I miss something obvious here?
Update due to Matthew's post: I used an older version before, but even after updating to 1.6.6 (the version I use now) I still get an error.
Update 2: Let's say I want to convert every column of class "factor" to a "character" column, but don't know in advance which column is of which class. With a data.frame, I can do the following:
classes <- as.character(sapply(df, class))
colClasses <- which(classes=="factor")
df[, colClasses] <- sapply(df[, colClasses], as.character)
Can I do something similar with data.table?
Update 3:
sessionInfo() R version 2.13.1 (2011-07-08) Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.6.6
loaded via a namespace (and not attached):
[1] tools_2.13.1
I provide a more general and safer way to do this stuff,
The function
..
makes sure we get a variable out of the scope of data.table; set_colclass will set the classes of your cols. You can use it like this:Raising Matt Dowle's comment to Geneorama's answer (https://stackoverflow.com/a/20808945/4241780) to make it more obvious (as encouraged).
Also, noted in another of Matt's comment see https://stackoverflow.com/a/33000778/4241780 for more info.
I tried several approaches.
, or otherwise
This is a BAD way to do it! I'm only leaving this answer in case it solves other weird problems. These better methods are the probably partly the result of newer data.table versions... so it's worth while to document this hard way. Plus, this is a nice syntax example for
eval
substitute
syntax.which gives you
For a single column:
Using
lapply
andas.character
:Try this