Searching for a list of strings in a data frame in R

Posted 2019-08-20 08:27

Question:

I have a list of names and a data.frame with many different columns. How can I retrieve the rows of the data frame whose row.name is one of the names in my list?

For example, suppose the row.names of my data frame include TC09001536.hg.1, TC03002852.hg.1, and TC18000664.hg.1, among many others, and these three names are saved in a list called Top.list. Assuming my data frame is called df, I tried:

test <- df[grep(Top.list, df$cluster_id),]

to look in the cluster_id column and, wherever a value matches one of the names in my list, return the whole row.

Answer 1:

This should work:

test <- df[unlist(lapply(Top.list, function(x) grep(x, df$cluster_id, fixed = TRUE))),]

The lapply(Top.list, function(x) grep(x, df$cluster_id, fixed = TRUE)) part generates a list that holds, for each of your names, a vector of the matching row numbers. unlist then combines these vectors into a single vector, which is used to subset your data frame.
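
To make this concrete, here is a small reproducible sketch. The toy data frame, its value column, and the two extra IDs are made up for illustration; only cluster_id, Top.list, and the three IDs come from the question, and Top.list is assumed to be a character vector (lapply works the same way on a list). It also shows %in% as a possible alternative if the names have to match exactly rather than as substrings.

```r
# Hypothetical toy data: only cluster_id and the three IDs in Top.list
# come from the question; everything else is invented for illustration.
df <- data.frame(
  cluster_id = c("TC09001536.hg.1", "TC01000001.hg.1", "TC03002852.hg.1",
                 "TC18000664.hg.1", "TC05000002.hg.1"),
  value = c(1.2, 0.4, 3.1, 2.7, 0.9),
  stringsAsFactors = FALSE
)
Top.list <- c("TC09001536.hg.1", "TC03002852.hg.1", "TC18000664.hg.1")

# Approach from the answer: one fixed-string grep per name,
# then flatten the per-name row numbers into a single index vector.
idx <- unlist(lapply(Top.list, function(x) grep(x, df$cluster_id, fixed = TRUE)))
test_grep <- df[idx, ]

# Alternative for exact matches: %in% gives one TRUE/FALSE per row.
test_in <- df[df$cluster_id %in% Top.list, ]
```

Note that grep with fixed = TRUE matches substrings, so a name that is contained in a longer ID would also be picked up, and a row matched by more than one name would appear more than once. The grep version returns rows in the order of Top.list, while the %in% version keeps the row order of df.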



Tags: r dataframe grep