Removing one table from another in R [closed]

Posted 2020-02-02 04:00

Question:



Closed 4 years ago.

I have a data table in R, called A, with three columns: Col1, Col2, and Col3. Another table, called B, has the same three columns. I want to remove every row of A whose (Col1, Col2) pair is present in B. I have tried, but I am not sure how to do this, and I have been stuck on it for the last few days.

Thanks,

Answer 1:

We can use anti_join from dplyr, which keeps the rows of A that have no match in B on the given columns:

library(dplyr)
anti_join(A, B, by = c('Col1', 'Col2'))
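
To see what this returns, here is a minimal sketch on made-up data (the values below are assumptions for illustration, not the asker's tables). anti_join returns a new data frame and leaves A untouched, so the result has to be assigned if A is meant to be replaced.

```r
library(dplyr)

# Hypothetical sample data; only Col1 and Col2 are used for matching
A <- data.frame(Col1 = 1:4, Col2 = 4:1, Col3 = letters[1:4])
B <- data.frame(Col1 = c(1L, 3L, 5L), Col2 = c(4L, 2L, 1L), Col3 = c("x", "y", "z"))

# Keep the rows of A whose (Col1, Col2) pair does not occur in B
A_new <- anti_join(A, B, by = c("Col1", "Col2"))
A_new
```

With this data, the rows of A with (Col1, Col2) equal to (2, 3) and (4, 1) remain. Col3 plays no part in the match because it is not listed in by.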


Answer 2:

library(data.table)
A = data.table(Col1 = 1:4, Col2 = 4:1, Col3 = letters[1:4])
#   Col1 Col2 Col3
#1:    1    4    a
#2:    2    3    b
#3:    3    2    c
#4:    4    1    d

B = data.table(Col1 = c(1,3,5), Col2 = c(4,2,1))
#   Col1 Col2
#1:    1    4
#2:    3    2
#3:    5    1

A[!B, on = c("Col1", "Col2")]
#   Col1 Col2 Col3
#1:    2    3    b
#2:    4    1    d
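
Note that the anti-join returns a new table rather than changing A. A small sketch of assigning it back and counting how many rows were dropped, using sample data like the above (the variable names n_before and n_removed are only illustrative):

```r
library(data.table)

A <- data.table(Col1 = 1:4, Col2 = 4:1, Col3 = letters[1:4])
B <- data.table(Col1 = c(1L, 3L, 5L), Col2 = c(4L, 2L, 1L))

n_before <- nrow(A)

# A[!B, on = ...] returns a new data.table; assign it back to replace A
A <- A[!B, on = c("Col1", "Col2")]

# Number of rows of A that had a matching (Col1, Col2) pair in B
n_removed <- n_before - nrow(A)
n_removed
```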


Answer 3:

Here's a go, using interaction:

A <- data.frame(Col1=1:3, Col2=2:4, Col3=10:12)
B <- data.frame(Col1=1:2, Col2=2:3, Col3=10:11)
A
#  Col1 Col2 Col3
#1    1    2   10
#2    2    3   11
#3    3    4   12

B
#  Col1 Col2 Col3
#1    1    2   10
#2    2    3   11

byv <- c("Col1","Col2")
A[!(interaction(A[byv]) %in% interaction(B[byv])),]

#  Col1 Col2 Col3
#3    3    4   12

Or create a unique id for each row, and then exclude those that merged:

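# Logical indexing keeps A whole when nothing matches (A[-integer(0),] would return zero rows)
matched <- merge(cbind(A[byv], id=seq_len(nrow(A))), B[byv], by=byv)$id
A[!(seq_len(nrow(A)) %in% matched),]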
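
A variation on the interaction idea, sketched in base R. interaction builds its labels with "." as the default separator, so key values that themselves contain dots could become ambiguous. The sketch below builds the key with paste and a separator that is assumed not to occur in the data. make_key is a hypothetical helper introduced here, not a base R function.

```r
A <- data.frame(Col1 = 1:3, Col2 = 2:4, Col3 = 10:12)
B <- data.frame(Col1 = 1:2, Col2 = 2:3, Col3 = 10:11)
byv <- c("Col1", "Col2")

# make_key (hypothetical helper): one string per row, built from the key columns.
# The separator "\r" is an assumption; pick one that cannot appear in the data.
make_key <- function(df, cols, sep = "\r") {
  do.call(paste, c(unname(as.list(df[cols])), sep = sep))
}

# TRUE for rows of A whose key does not appear in B
keep <- !(make_key(A, byv) %in% make_key(B, byv))

# Logical indexing returns all of A when nothing matches
result <- A[keep, ]
result
```

With this data only the row with Col1 = 3 and Col2 = 4 is kept.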