Unlike the questions I've found, I want the unique rows across two columns, ignoring the order of the values within each row.
I have a df:
df <- cbind(c("a","b","c","b"), c("b","d","e","a"))
> df
     [,1] [,2]
[1,] "a"  "b"
[2,] "b"  "d"
[3,] "c"  "e"
[4,] "b"  "a"
In this case, row 1 and row 4 are "duplicates" in the sense that a-b is the same as b-a.
I know how to find the unique combinations of columns 1 and 2, but with that approach every row here would count as unique.
If all of the elements are strings (or can be coerced to strings), one trick is to convert the matrix to a data.frame and use some of dplyr's tricks on it. The resulting $key column should now tell you the repeats.
You could use igraph to create an undirected graph and then convert back to a data.frame.
There are lots of ways to do this; here is one:
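The code from this answer is not preserved here, so the following is only a minimal base R sketch of one such way, assuming the idea is to sort the values within each row and then look for repeated rows. The names sorted_rows, dup_mask and unique_rows are made up for illustration.

```r
# Example matrix from the question
df <- cbind(c("a", "b", "c", "b"), c("b", "d", "e", "a"))

# Sort the values within each row so that "b","a" and "a","b" look identical.
# apply() returns its per-row results as columns, hence the t().
sorted_rows <- t(apply(df, 1, sort))

# Logical mask: TRUE for rows that repeat an earlier unordered pair
dup_mask <- duplicated(sorted_rows)

# Unique rows, keeping the original (unsorted) values of the first occurrence
unique_rows <- df[!dup_mask, , drop = FALSE]
```

Because the mask is computed on the sorted copy but applied to the original matrix, the surviving rows keep the column order they had in df.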
One expression gives the unique rows; the other gives a logical mask marking the repeats.
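The $key column mentioned in the data.frame answer above can be illustrated in a similar way. This is a rough base R stand-in, not the answer's original dplyr code: it assumes the key is built by pasting the sorted values of each row together, and that the separator "|" never occurs in the data. The names dat and is_repeat are hypothetical.

```r
df <- cbind(c("a", "b", "c", "b"), c("b", "d", "e", "a"))

# Convert the matrix to a data.frame (columns are named V1 and V2 by default)
dat <- as.data.frame(df, stringsAsFactors = FALSE)

# Build an order-independent key per row, e.g. "a|b" for both a-b and b-a.
# Assumption: "|" does not appear in any of the values.
dat$key <- apply(dat, 1, function(r) paste(sort(r), collapse = "|"))

# Rows whose key has already been seen are the repeats
dat$is_repeat <- duplicated(dat$key)
```

Filtering with dat[!dat$is_repeat, ] would then keep one row per unordered pair.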
If it's just two columns, you can also use pmin and pmax to put the smaller value of each row first and the larger second, so that reordered pairs become identical. A similar approach works with "dplyr".
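Neither code block from this answer is preserved, so here is a hedged sketch of both variants. The names lo, hi, keep and res are made up for illustration, and the dplyr part only runs if that package happens to be installed.

```r
df <- cbind(c("a", "b", "c", "b"), c("b", "d", "e", "a"))

# Base R: put the smaller value of each pair first and the larger second
lo <- pmin(df[, 1], df[, 2])
hi <- pmax(df[, 1], df[, 2])

# Keep rows whose (lo, hi) pair has not appeared before
keep <- !duplicated(data.frame(lo, hi))
df[keep, , drop = FALSE]

# dplyr variant of the same idea, guarded in case dplyr is not installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  dat <- as.data.frame(df, stringsAsFactors = FALSE)  # columns V1, V2
  dat <- dplyr::mutate(dat, lo = pmin(V1, V2), hi = pmax(V1, V2))
  # .keep_all = TRUE keeps the original V1 and V2 alongside lo and hi
  res <- dplyr::distinct(dat, lo, hi, .keep_all = TRUE)
}
```

pmin and pmax are vectorised, so this avoids the row-by-row apply() call, but it only generalises neatly to exactly two columns.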