Matching a dataset against data from a CSV file in R

Posted 2019-08-25 23:32

Suppose I have this data:

mydat=structure(list(id = 1:6, x2 = c(12L, 12L, 12L, 12L, 12L, 12L), 
    x3 = c(12L, 12L, 12L, 12L, 12L, 12L)), .Names = c("id", "x2", 
"x3"), class = "data.frame", row.names = c(NA, -6L))

I also have a CSV file, which I read with:

test=read.csv(path,sep=";", dec=",")

It has this structure:

test=structure(list(id = 1:5, x2 = c(12L, 12L, 12L, 12L, 12L), x3 = c(12L, 
12L, 12L, 12L, 12L)), .Names = c("id", "x2", "x3"), class = "data.frame", row.names = c(NA, 
-5L))

How can I match these two datasets so that the observations in mydat whose id also appears in test are removed?

I.e., the output must be:

id  x2  x3
6   12  12

because ids 1, 2, 3, 4 and 5 in mydat also appear in the test dataset.

2 answers
放我归山
#2 · 2019-08-25 23:56

Using base R, keep the rows of mydat whose id is in the set difference of the two id columns:

> mydat[mydat$id %in% setdiff(mydat$id, test$id), ]
  id x2 x3
6  6 12 12
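
The same filter can also be written as a plain logical index on id, which does not rely on id coinciding with the row number. A minimal sketch; the data frames `a` and `b` are hypothetical and made up here only to show ids that differ from row positions:

```r
# Hypothetical data: the ids are not the same as the row positions.
a <- data.frame(id = c(10L, 20L, 30L), x2 = 12L, x3 = 12L)
b <- data.frame(id = c(10L, 20L), x2 = 12L, x3 = 12L)

# Keep the rows of a whose id does not occur in b.
res <- a[!(a$id %in% b$id), ]
print(res)
```

Indexing with the id values themselves, as in `a[setdiff(a$id, b$id), ]`, would ask for row 30 here, which does not exist, so the logical form is the safer one in general.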
#3 · 2019-08-26 00:02

You can use anti_join from dplyr:

 dplyr::anti_join(mydat,test)
Joining, by = c("id", "x2", "x3")
  id x2 x3
1  6 12 12
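
Without a `by` argument, anti_join matches on every column the two data frames share, while the question asks for matching on id only. A small sketch of the difference, assuming dplyr is installed; the value 99 in `test` is a hypothetical change introduced here so that the two variants give different results:

```r
library(dplyr)

mydat <- data.frame(id = 1:6, x2 = 12L, x3 = 12L)
# Hypothetical change for illustration: x3 differs for id 5 in test.
test <- data.frame(id = 1:5, x2 = 12L, x3 = c(12L, 12L, 12L, 12L, 99L))

# Match on id only: rows of mydat whose id is absent from test.
by_id <- anti_join(mydat, test, by = "id")

# Match on all shared columns: id 5 is kept as well, because its x3 differs.
by_all <- anti_join(mydat, test, by = c("id", "x2", "x3"))

print(by_id)
print(by_all)
```

With the data from the question both calls return the same single row, so the choice only matters when the other columns can differ between the two datasets.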

In base R, you can collapse each row into a single string and compare the strings:

mydat[!do.call(paste,mydat)%in%do.call(paste,test),]
  id x2 x3
6  6 12 12
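
One caveat with the paste approach: the default separator is a space, so rows of character columns that themselves contain spaces can collapse to the same string. A hedged sketch with hypothetical data frames `p` and `q`, using a separator assumed not to occur in the data:

```r
# Hypothetical data: character values that contain spaces.
p <- data.frame(a = c("x y", "u"), b = c("z", "v"), stringsAsFactors = FALSE)
q <- data.frame(a = "x", b = "y z", stringsAsFactors = FALSE)

# Default separator " ": both first rows collapse to "x y z",
# so row 1 of p is dropped although it differs from the row in q.
naive <- p[!do.call(paste, p) %in% do.call(paste, q), ]

# A separator that is assumed absent from the data keeps the columns apart.
safe <- p[!do.call(paste, c(p, sep = "\r")) %in% do.call(paste, c(q, sep = "\r")), ]

print(naive)
print(safe)
```

For purely numeric columns, as in the question, the default separator does not cause this problem.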