I'm just starting to learn data.table and working my way through the vignettes, although I'm simultaneously using it in a project. How do I replace some plyr syntax with data.table?
library(data.table)

input <- data.table(ID = c(37, 45, 900), a1 = c(1, 2, 3), a2 = c(43, 320, 390),
                    b1 = c(-0.94, 2.2, -1.223), b2 = c(2.32, 4.54, 7.21), c1 = c(1, 2, 3),
                    c2 = c(-0.94, 2.2, -1.223))
# simple user-defined function that conveys my problem
func <- function(x, num) {
  x <- data.table(x)
  new_b <- x$b1[1]
  # copy the first row, overwriting b1 and b2
  x2 <- within(x[1, ], {
    b1 = new_b
    b2 = 51
  })
  # append num copies of the modified row to the original rows
  imp <- rbindlist(replicate(num, x2, simplify = FALSE))
  return(rbindlist(list(x, imp)))
}
# wrapper function
wrap_func <- function(dat, num = 5, plyr = FALSE) {
  if (plyr == TRUE) {
    return(plyr::ddply(dat, .variables = "ID", .fun = func, num = num))
  } else {
    return(dat[, lapply(.SD, FUN = func, num), by = ID])
  }
}
The plyr version works:
wrap_func(dat = input, num = 5, plyr = TRUE)
What is the equivalent data.table syntax?
wrap_func(dat = input, num = 5, plyr = FALSE) # gives an error
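For what it's worth, here is a minimal sketch of one possible fix. It assumes the error arises because lapply(.SD, ...) hands func one column (a plain vector) at a time, so x$b1 is NULL and the pieces passed to rbindlist() no longer have matching columns. Calling func once per group on the whole .SD avoids that. The name wrap_func_dt is hypothetical and used only for illustration; func and input are the same as in the example.

```r
library(data.table)

input <- data.table(ID = c(37, 45, 900), a1 = c(1, 2, 3), a2 = c(43, 320, 390),
                    b1 = c(-0.94, 2.2, -1.223), b2 = c(2.32, 4.54, 7.21),
                    c1 = c(1, 2, 3), c2 = c(-0.94, 2.2, -1.223))

# same function as in the question
func <- function(x, num) {
  x <- data.table(x)
  new_b <- x$b1[1]
  x2 <- within(x[1, ], {
    b1 = new_b
    b2 = 51
  })
  imp <- rbindlist(replicate(num, x2, simplify = FALSE))
  return(rbindlist(list(x, imp)))
}

# hypothetical wrapper: pass the whole per-group sub-table (.SD) to func,
# instead of applying func to each column separately
wrap_func_dt <- function(dat, num = 5) {
  dat[, func(.SD, num), by = ID]
}

res <- wrap_func_dt(input, num = 5)
print(res)
```

Note that .SD does not contain the grouping column, so func sees every column except ID here, whereas plyr::ddply passes ID along as well; data.table adds ID back to the result through by.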
Thanks in advance!!
Update:
Based on @Frank's suggestion in the comments, I benchmarked this on my real data and code. Here, impute_zero_resp_all is the real equivalent of wrap_func in the example.
I start with a dataset that has ~50k rows and 1800 groups; imputation is done by group resulting in a dataset with ~170k rows and the same 1800 groups:
vec1 <- vec2 <- vector(mode = "numeric", length = 50)
for (i in 1:50) {
  vec1[i] <- system.time(impute_zero_resp_all(dat = test_dat2))[3]  # data.table, elapsed seconds
  vec2[i] <- system.time(impute_zero_resp_all2(dat = test_dat2))[3] # plyr, elapsed seconds
}
summary(vec1); summary(vec2) # data.table first, then plyr
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  22.62   22.76   22.81   22.84   22.84   23.72
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  27.19   27.35   27.40   27.49   27.45   30.07
quantile(vec1, seq(0, 1, .1))
    0%    10%    20%    30%    40%    50%    60%    70%    80%    90%   100%
22.620 22.670 22.728 22.760 22.786 22.810 22.824 22.840 22.870 22.917 23.720
quantile(vec2, seq(0, 1, .1))
    0%    10%    20%    30%    40%    50%    60%    70%    80%    90%   100%
27.190 27.289 27.330 27.357 27.376 27.400 27.424 27.440 27.476 27.522 30.070
sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1