how find or match one data frame as a subset(full)

2019-08-21 02:19发布

问题:

I have two data frames df1 and df2 given below. df1 is

  c1       c2   c3  c4
   B  2.34000  1.00  I
   A 14.43000  2.10  J
   D  3.45515  1.00  K
   B  2.50000  2.09   
   A  2.44000  1.10  K
   K  5.00000  1.09  L

df2 is:

  c1    c2   c3
   B  2.34  1.00
   A 14.43  2.10
   D  3.43  1.00
   B  2.50  2.09
   E  5.00  1.09
   A  2.44  1.10

the requirement here is like this: there is matching(or comparison) between these two data frames. if df2 completely found (that means the content of df2 matched with any subset of df1 irrespective of the order) in df1(either exactly matched with df2 or subset of df1 matched with df2) then output is true. If not matched then return false.

I tried following methods:

1. left_join(df2,df1)
2. merge(df2,df1)
3. inner_join(df2,df1)
4. dd1[dd1$c1 %in% dd$c1,]

all the above methods give that data which is common in between both but not give results as per requirements.

Please suggest me some solution for the same.

回答1:

You can use match and interaction like:

df1 <- read.table(text="c1       c2   c3  c4
   B  2.34000  1.00  I
   A 14.43000  2.10  J
   D  3.45515  1.00  K
   B  2.50000  2.09  NA
   A  2.44000  1.10  K
   K  5.00000  1.09  L", header=T)

df2 <- read.table(text="c1    c2   c3
   B  2.34  1.00
   A 14.43  2.10
   D  3.43  1.00
   B  2.50  2.09
   E  5.00  1.09
   A  2.44  1.10", header=T)

!any(is.na(match(interaction(df2), interaction(df1[names(df2)]))))
#[1] FALSE

#And packed in a function
"%completelyFoundIn%" <- function(x, y) {!any(is.na(match(interaction(x), interaction(y[names(x)]))))}

df2 %completelyFoundIn% df1
#[1] FALSE

df2[c(1,2,4,6),] %completelyFoundIn% df1
#[1] TRUE

df2[-5,c(1,3)] %completelyFoundIn% df1
#[1] TRUE