While constructing expressions to put in the j-slot of a [.data.table call, it would often be helpful to be able to examine and play around with the contents of .SD.
This naive attempt doesn't work...
library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
DT[, browser(), by=x]
# Called from: `[.data.table`(DT, , browser(), by = x)
Browse[1]>
Browse[1]> .SD
# NULL data.table
... even though a variable named .SD and several others related to the current data.table subset are all present in the local environment:
Browse[1]> ls(all.names = TRUE)
# [1] ".BY" ".GRP" ".I" ".iSD" ".N" ".SD"
# [7] "Cfastmean" "mean" "print" "x"
Browse[1]> .N
# [1] 3
Browse[1]> .I
# [1] 4 5 6
Using .I, I can view something roughly like .SD, but it would be nice to be able to directly access its value:
Browse[1]> DT[.I]
# x y v
# 1: b 1 4
# 2: b 3 5
# 3: b 6 6
My questions: Why is the expected value of .SD not directly available from within a browser() call (while .I, .N, .GRP and .BY are)? Is there some alternative way to access the value of .SD?
Updated in light of Matthew Dowle's comments:
It turns out that .SD is, internally, the environment within which all j expressions are evaluated, including those which don't explicitly reference .SD at all. Filling it with all of DT's columns for each subset of DT is not cheap, timewise, so [.data.table() won't do so unless it really needs to.
Instead, making great use of R's lazy evaluation of arguments, it previews the unevaluated j expression and adds to .SD only those columns that are referenced therein. If .SD itself is mentioned, it adds all of DT's columns.
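To get a feel for what "previewing the unevaluated j expression" could look like, here is a minimal base-R sketch. It is not data.table's actual internal code: cols_needed is a hypothetical helper made up for illustration, and it simply uses all.vars() to list the symbols mentioned in a quoted expression.

```r
# Hypothetical helper (NOT part of data.table): given an unevaluated
# j-expression and the candidate column names, return the columns that
# would need to be loaded.
cols_needed = function(jexpr, cols) {
  vars = all.vars(jexpr)            # symbols referenced in the expression
  if (".SD" %in% vars) cols         # .SD mentioned: take every column
  else intersect(cols, vars)        # otherwise only the columns referenced
}

cols = c("y", "v")                  # the non-grouping columns of DT

cols_needed(quote(browser()), cols)          # character(0)
cols_needed(quote({.N * y; browser()}), cols) # "y"
cols_needed(quote({.SD; browser()}), cols)    # "y" "v"
```

Because the expression is only inspected and never evaluated, the position of .SD within it makes no difference, which matches the observation that order doesn't matter.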
So, to view .SD, just include some reference to it in the j-expression. Here is one of many expressions that will work:
library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
## This works
DT[, if(nrow(.SD)) browser(), by=x]
# Called from: `[.data.table`(DT, , if (nrow(.SD)) browser(), by = x)
Browse[1]> .SD
# y v
# 1: 1 1
# 2: 3 2
# 3: 6 3
And here are a couple more:
DT[,{.SD; browser()}, by=x]
DT[,{browser(); .SD}, by=x] ## Notice that order doesn't matter
To see for yourself that .SD just loads the columns needed by the j-expression, run each of these in turn (typing .SD once you are in the browser environment, and Q to leave it and return to the normal command line):
DT[, {.N * y ; browser()}, by=x]
DT[, {v^2 ; browser()}, by=x]
DT[, {y*v ; browser()}, by=x]
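If an interactive browser session isn't convenient, a rough non-interactive check is to reference .SD in j and summarise it per group. This is only a sketch; the names res, n_rows and cols are chosen here for illustration.

```r
library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)

# j mentions .SD, so for each group .SD holds all non-grouping columns
res = DT[, .(n_rows = nrow(.SD),
             cols   = paste(names(.SD), collapse=",")), by=x]
print(res)
```

Each group should report 3 rows and the columns y and v, since the grouping column x is not part of .SD by default.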