Keeping only certain rows of a data frame based on

Posted 2019-01-12 06:14

Question:

I have a data frame with an ID column and a few columns for values. I would like to keep only those rows of the data frame whose ID matches one of the values in another set (for instance, a vector called "keep").

For simplicity, here is an example:

df <- data.frame(ID = sample(rep(letters, each=3)), value = rnorm(n=26*3))
keep <- c("a", "d", "r", "x")

How can I create a new data frame consisting only of the rows whose IDs match those in keep? I can do this for just one letter by using the which() function, but with multiple letters I get warning messages and incorrect results. I know I could run a for loop through the data frame and extract the rows that way, but I'm wondering if there is a more elegant and efficient way of going about this. Thanks in advance.

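Answer 1:

Try df[df$ID %in% keep, ] or subset(df, ID %in% keep). %in% is documented on the help page for match (see ?match); the equivalent is.element() is on the help page for sets (see ?sets).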
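For reference, here is a minimal sketch of both forms applied to the example data from the question. The set.seed() call and the names kept_index and kept_subset are my own additions for illustration, not part of the original post.

```r
# Assumption: a fixed seed, only so the example is reproducible.
set.seed(42)

df <- data.frame(ID = sample(rep(letters, each = 3)), value = rnorm(n = 26 * 3))
keep <- c("a", "d", "r", "x")

# df$ID %in% keep is a logical vector with one entry per row:
# TRUE where that row's ID appears anywhere in keep.
kept_index <- df[df$ID %in% keep, ]

# subset() evaluates the condition inside the data frame,
# so ID can be written without the df$ prefix.
kept_subset <- subset(df, ID %in% keep)

# Each letter occurs 3 times, so 4 letters give 12 rows.
nrow(kept_index)
```

Both forms keep the original row names, so the result still shows where each row came from in df.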

Edit: Also, if this were for a single letter, you could write e.g. df[df$ID == "a", ] instead of using which().
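As a rough illustration of why == works for one letter but not for several: == compares element by element and recycles the shorter vector, so with a four-element keep it compares row 1 to "a", row 2 to "d", and so on, rather than testing membership. The sketch below uses a deterministic data frame (no sample()) of my own choosing so the behaviour is easy to follow; the names one, wrong, and right are hypothetical.

```r
# Assumption: an unshuffled variant of the question's data, for illustration only.
df <- data.frame(ID = rep(letters, each = 3), value = seq_len(26 * 3))
keep <- c("a", "d", "r", "x")

# A single value is recycled to every row, so == is fine here.
one <- df[df$ID == "a", ]

# With several values, keep is recycled position by position.
# 78 rows is not a multiple of 4, so R also warns about object lengths.
wrong <- df[which(df$ID == keep), ]

# %in% asks "is this ID anywhere in keep?" for every row.
right <- df[df$ID %in% keep, ]

nrow(one)                   # 3 rows for "a"
nrow(wrong) < nrow(right)   # the == version misses matching rows
```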



Tags: r, subset