I have two data frames like this:
ID = c('x1','x2','x5')
df1 <- data.frame(ID)
x1 = c(1,2,3,4,5)
x2 = c(11,12,13,14,15)
x3 = c(21,22,23,24,25)
x4 = c(31,32,33,34,35)
x5 = c(41,42,43,44,45)
df2 <- data.frame(x1,x2,x3,x4,x5)
Desired output:
  x1 x2 x5
1  1 11 41
2  2 12 42
3  3 13 43
4  4 14 44
5  5 15 45
I would like my new dataset to contain only those variables that are identified in df1 as important (i.e., x1, x2, x5), with the values from df2.
In this simple dataset, I know I could do this by just removing x3 and x4 from df2, but ideally I would like to apply it to a larger dataset with more than 100 variables, and hence would like to do it programmatically.
I can't find a dupe, so here goes: simply subset the columns of df2 by the values of as.character(df1$ID).
The reason for as.character is to avoid subsetting by the underlying storage mode of df1$ID (integer) rather than by its levels.
This question is also tagged with data.table, so we could also do this by reference (if we have a data.table), with no need to convert to character.
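The original code for this answer did not survive, so here is a minimal sketch of what the two approaches described above might look like. It assumes df1$ID is a factor, which was the data.frame() default before R 4.0.0; stringsAsFactors = TRUE is set explicitly to reproduce that. The names keep, res, wrong, dt and drop are just illustrative. The data.table part is guarded so the snippet still runs if the package is not installed.

```r
# Rebuild the question's data. stringsAsFactors = TRUE mimics the pre-4.0
# default, under which df1$ID is stored as a factor.
df1 <- data.frame(ID = c("x1", "x2", "x5"), stringsAsFactors = TRUE)
df2 <- data.frame(
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(11, 12, 13, 14, 15),
  x3 = c(21, 22, 23, 24, 25),
  x4 = c(31, 32, 33, 34, 35),
  x5 = c(41, 42, 43, 44, 45)
)

# Base R: convert the factor to its labels, then select columns by name.
keep <- as.character(df1$ID)
res <- df2[keep]
print(res)

# The pitfall: indexing with the factor itself uses its integer codes
# (1, 2, 3), so this silently returns x1, x2, x3 instead of x1, x2, x5.
wrong <- df2[df1$ID]
print(names(wrong))

# data.table: drop the unwanted columns by reference.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::as.data.table(df2)  # a copy, so df2 itself is untouched
  # setdiff() coerces the factor to its labels, so no as.character() is needed
  drop <- setdiff(names(dt), df1$ID)
  # set() removes the columns in place; dt[, (drop) := NULL] is the `[` form
  data.table::set(dt, j = drop, value = NULL)
  print(names(dt))
}
```

If df1$ID is already a character vector (the default from R 4.0.0 onwards), as.character() is harmless, so the same base R line should work either way.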