Delete rows that exist in another data frame?

Posted 2019-01-05 02:47

I have the two following data frames (example):

df1:

name    profile    type    strand
A       4.5        1       +
B       3.2        1       +
C       5.5        1       +
D       14.0       1       -
E       45.1       1       -
F       32.8       1       -
G       19.9       1       +

df2:

name
A
B
C
G

I would like to delete the rows in df1 whose df1$name value also appears in df2$name, to get the following:

Output:

name    profile    type    strand
D       14.0       1       -
E       45.1       1       -
F       32.8       1       -

If anyone could tell me which piece of code to use, it would be a lot of help. It seemed simple at first, but I've been messing it up since yesterday.

Tags: r dataframe duplicate-removal delete-row corresponding-records

3 answers

对你真心纯属浪费

#2 · 2019-01-05 03:45

You need the %in% operator. So,

df1[!(df1$name %in% df2$name),]

should give you what you want.

df1$name %in% df2$name returns a logical vector that is TRUE for each value of df1$name that also occurs in df2$name.
The ! operator negates that vector, so only the rows whose name is not in df2$name are kept.
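
As a minimal sketch, the example from the question can be rebuilt and filtered like this. The data frames are typed in by hand from the tables in the question, and the names in_df2 and result are only illustrative.

```r
# Rebuild the example data from the question (typed in by hand).
df1 <- data.frame(
  name    = c("A", "B", "C", "D", "E", "F", "G"),
  profile = c(4.5, 3.2, 5.5, 14.0, 45.1, 32.8, 19.9),
  type    = 1,
  strand  = c("+", "+", "+", "-", "-", "-", "+"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(name = c("A", "B", "C", "G"), stringsAsFactors = FALSE)

# TRUE for each row of df1 whose name also appears in df2
in_df2 <- df1$name %in% df2$name

# Negate the logical vector to keep only the rows NOT found in df2
result <- df1[!in_df2, ]
print(result)
```

With this data, only the rows for D, E and F should remain. Note that df1 itself is not modified; assign the result back to df1 if the rows should be dropped permanently.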


老娘就宠你

#3 · 2019-01-05 03:45

This is sometimes called an anti-join:

library(dplyr)
anti_join(df1, df2, by = "name")
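
For completeness, here is a small sketch of the anti-join applied to the data from the question. It assumes the dplyr package is installed; the data frames are re-created by hand, and result is just an illustrative name.

```r
library(dplyr)  # assumes dplyr is installed

# Example data from the question, typed in by hand.
df1 <- data.frame(
  name    = c("A", "B", "C", "D", "E", "F", "G"),
  profile = c(4.5, 3.2, 5.5, 14.0, 45.1, 32.8, 19.9),
  type    = 1,
  strand  = c("+", "+", "+", "-", "-", "-", "+"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(name = c("A", "B", "C", "G"), stringsAsFactors = FALSE)

# Keep the rows of df1 that have no match in df2 on the "name" column
result <- anti_join(df1, df2, by = "name")
print(result)
```

Compared with the %in% approach, by can also take several column names, which is convenient when a row is identified by more than one column.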


啃猪蹄的小仙女

#4 · 2019-01-05 03:48

df1[!(as.character(df1$name) %in% as.character(df2$name)), ]

I had to add as.character() when I ran this, because name is a factor rather than a character vector. Isn't %in% supposed to handle this conversion itself?
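
On the factor question: %in% is built on match(), whose documentation says that factors are converted to character vectors before comparison, so the explicit as.character() calls should not normally change the result. The sketch below checks this on a cut-down version of the question's data, with name deliberately stored as a factor; with_factor and with_char are illustrative names.

```r
# Names from the question, deliberately stored as factors.
df1 <- data.frame(
  name    = factor(c("A", "B", "C", "D", "E", "F", "G")),
  profile = c(4.5, 3.2, 5.5, 14.0, 45.1, 32.8, 19.9)
)
df2 <- data.frame(name = factor(c("A", "B", "C", "G")))

# Filter directly on the factor columns
with_factor <- df1[!(df1$name %in% df2$name), ]

# Filter after converting both columns to character
with_char <- df1[!(as.character(df1$name) %in% as.character(df2$name)), ]

# Compare the two results
print(identical(with_factor, with_char))
```

If the two versions do differ on real data, the cause is more likely something in the values themselves (for example stray whitespace or different capitalisation) than the factor type, although that is only a guess without seeing the data.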
