How do you refer to variables in a data.table if the variable names are stored in a character vector? For instance, this works for a data.frame:
df <- data.frame(col1 = 1:3)
colname <- "col1"
df[colname] <- 4:6
df
# col1
# 1 4
# 2 5
# 3 6
How can I perform this same operation for a data.table, either with or without := notation? The obvious thing of dt[ , list(colname)] doesn't work (nor did I expect it to).
This is not really an answer, but I don't have enough street cred to post comments :/
Anyway, for anyone who might be looking to actually create a new column in a data.table with a name stored in a variable, I've got the following to work. I have no clue as to its performance. Any suggestions for improvement? Is it safe to assume a nameless new column will always be given the name V1?
Notice I can reference it just fine in the sum() but can't seem to get it to assign in the same step. BTW, the reason I need to do this is colname will be based on user input in a Shiny app.
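The snippet this answer refers to is missing here. As a minimal sketch of the same idea, assuming a toy table (the names dt, x, total and doubled are illustrative and not from the original post), a new column can be given a name held in a variable by wrapping that variable in parentheses on the left of :=, which avoids depending on an auto-generated name such as V1:

```r
library(data.table)

# Toy data; 'dt', 'x' and 'colname' are illustrative names.
dt <- data.table(x = 1:3)
colname <- "total"  # in a Shiny app this could come from user input

# Parenthesised LHS: the new column is named by the *value* of colname.
# The length-1 result of sum(x) is recycled to every row.
dt[, (colname) := sum(x)]

# The new column can then be referenced by name inside j via get().
grand_total <- dt[, sum(get(colname))]

# set() is a functional alternative that also takes the name as a string.
set(dt, j = "doubled", value = dt[[colname]] * 2L)
```

Because the name is only ever used as a string here, user input never has to be evaluated as code.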
Two ways to programmatically select variable(s):
with = FALSE:
'dot dot' (..) prefix:
For further description of the 'dot dot' (..) notation, see New Features in 1.10.2 (it is currently not described in help text).
To assign to variable(s), wrap the LHS of := in parentheses:
The latter is known as a column plonk, because you replace the whole column vector by reference. If a subset i was present, it would subassign by reference. The parens around (colname) are a shorthand introduced in version v1.9.4 on CRAN Oct 2014. Here is the news item:
See also Details section in ?`:=`:
And to answer further question in comment, here's one way (as usual there are many ways):
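The code that accompanied this answer is missing here, so the following is only a sketch of the forms it describes, assuming a toy table DT (DT, col1, col2, sel1 and sel2 are illustrative names). The last step is a guess at the follow-up: it assumes the comment asked how to update a column from its own values, and uses get() in j because the answer mentions get() later on.

```r
library(data.table)

# Toy data; all names here are illustrative.
DT <- data.table(col1 = 1:3, col2 = letters[1:3])
colname <- "col1"

# Select by name with with = FALSE (returns a data.table).
sel1 <- DT[, colname, with = FALSE]

# Select with the '..' prefix (requires data.table >= 1.10.2).
sel2 <- DT[, ..colname]

# Assign by reference: the parentheses make := use the value of colname.
# With no i, the whole column vector is replaced ("column plonk").
DT[, (colname) := 4:6]

# With a subset in i, only the matching rows are sub-assigned.
DT[col2 == "a", (colname) := 0L]

# Assumed follow-up: update the column from its own values via get().
DT[, (colname) := cumsum(get(colname))]
```

Without the parentheses, colname := 4:6 would create or overwrite a column literally called colname.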
or, you might find it easier to read, write and debug just to eval a paste, similar to constructing a dynamic SQL statement to send to a server:
If you do that a lot, you can define a helper function EVAL:
Now that data.table 1.8.2 automatically optimizes j for efficiency, it may be preferable to use the eval method. The get() in j prevents some optimizations, for example.
Or, there is set(). A low overhead, functional form of :=, which would be fine here. See ?set
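The original snippets for these alternatives are missing here. A rough sketch of each, assuming a toy table DT and an EVAL helper written for illustration (its exact original definition is not known, so treat this one as hypothetical):

```r
library(data.table)

# Toy data; names are illustrative.
DT <- data.table(col1 = 1:3)
colname <- "col1"

# Build the call as text, then parse and evaluate it,
# much like assembling a dynamic SQL statement.
expr <- paste0("DT[, ", colname, " := cumsum(", colname, ")]")
eval(parse(text = expr))

# Hypothetical helper: paste the pieces together and evaluate the
# result in the caller's environment.
EVAL <- function(...) eval(parse(text = paste0(...)), envir = parent.frame())
EVAL("DT[, ", colname, " := ", colname, " * 2L]")

# set(): functional form of :=, taking the column name as a string.
set(DT, j = colname, value = rev(DT[[colname]]))
```

Since the eval approach runs text as code, it is best kept away from unvalidated input; set() and the parenthesised := only ever treat the name as a string.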
For multiple columns and a function applied on column values.
When updating the values from a function, the RHS must be a list object, so using a loop on .SD with lapply will do the trick.
The example below converts integer columns to numeric columns.
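The example itself is missing here; a minimal sketch of the conversion described, assuming a toy table (DT, a, b, c and int_cols are illustrative names), could look like this:

```r
library(data.table)

# Toy data: two integer columns and one character column.
DT <- data.table(a = 1:3, b = 4:6, c = c("x", "y", "z"))

# Names of the integer columns, found programmatically.
int_cols <- names(DT)[vapply(DT, is.integer, logical(1))]

# lapply() over .SD returns a list with one element per column in
# .SDcols, which is what := expects on the RHS for multiple columns.
DT[, (int_cols) := lapply(.SD, as.numeric), .SDcols = int_cols]
```

Restricting .SD with .SDcols keeps the character column untouched, and the parenthesised LHS again takes the column names from a character vector.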