How can I get the complement of vector y in vector

Posted 2019-04-11 00:19

Question:

That's x \ y using mathematical notation. Suppose

x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,1,1,1,3) 
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1)

How can I get a vector with ALL the values in x that are not in y? That is, the result should be:

2,1,1,3

There is a similar question here. However, none of the answers returns the result that I want.

Answer 1:

Here is a solution using pmatch (this gives the "complement" you require):

x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,1,1,1,3)
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1)
res <- x[is.na(pmatch(x,y))]

From the pmatch documentation:

"If duplicates.ok is FALSE, values of table once matched are excluded from the search for subsequent matches."

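One caveat with pmatch is that it does partial string matching on the values coerced to character, so, for example, a 1 in x can be matched by a 10 in y. As a hedged alternative, here is a minimal sketch that pairs the k-th occurrence of each value in x with the k-th occurrence of the same value in y, and keeps whatever is left over in the order of x. The name multiset_diff is made up for illustration; it is not a base R function.

```r
# Hypothetical helper (not part of base R): multiset difference x \ y,
# preserving the order of x and using exact matching only.
multiset_diff <- function(x, y) {
  # Number the occurrences of each distinct value: 1st zero, 2nd zero, ...
  occ_x <- ave(seq_along(x), x, FUN = seq_along)
  occ_y <- ave(seq_along(y), y, FUN = seq_along)
  # Build "value occurrence" keys; the k-th copy of a value in x is
  # cancelled only if y also has a k-th copy of that value.
  key_x <- paste(x, occ_x)
  key_y <- paste(y, occ_y)
  x[!(key_x %in% key_y)]
}

x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,1,1,1,3)
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1)
res <- multiset_diff(x, y)
print(res)  # 2 1 1 3
```

Like the pmatch version, this cancels the earliest copies of a value first, so the surviving 1s are the last two in x.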


Answer 2:

How about this:

R> x[x!=y]
[1] 2 1 1 1 3
Warning message:
In x != y : longer object length is not a multiple of shorter object length
R>

This is a difficult problem, I think, as you are mixing values and positions. The easier solution relies on one of the 'set' functions in R:

R> setdiff(x,y)
[1] 2 3

but that treats x and y as sets: it uses only values, not positions, and it drops duplicates.

The problem with the answer I gave you is its implicit use of recycling, and the warning that triggered: because your x is longer than your y, the first few values of y get reused. Recycling is considered "clean" only when the length of the longer vector is an integer multiple of the length of the shorter one. That is not the case here, so I am not sure we can solve your problem all that cleanly.
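To make the recycling visible, here is a small sketch that spells it out with rep(). The variable names remainder, y_recycled and mismatch are only for illustration.

```r
x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,1,1,1,3)
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1)

# A non-zero remainder means y does not fit into x a whole number of times,
# which is the situation R warns about.
remainder <- length(x) %% length(y)

# Explicit version of what x != y does implicitly: y is repeated from its
# start until it is as long as x.
y_recycled <- rep(y, length.out = length(x))

# Same element-wise comparison as x != y, but with equal lengths there is
# no warning. The trailing elements of x are compared against leading zeros of y.
mismatch <- x[x != y_recycled]
print(mismatch)  # 2 1 1 1 3
```

This reproduces the output above and shows why it contains one 1 too many: the last few elements of x are compared against the reused leading zeros of y, not against the 1s.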



Answer 3:

If I understand the problem, you can use table to compute the difference in the number of elements in each set and then create a vector based on the difference of those counts (note that this won't necessarily give you the order you gave in your question).

> diffs <- table(x) - table(factor(y, levels=levels(factor(x))))
> rep(as.numeric(names(diffs)), ifelse(diffs < 0, 0, diffs))
[1] 1 1 2 3
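The same counting idea can be wrapped in a function. This is only a sketch: count_diff is a made-up name, and it uses pmax() in place of ifelse() to clamp negative counts to zero. It repeats the original values instead of converting the table names back with as.numeric(), and the result still comes out sorted by value, not in the order of x.

```r
# Hypothetical helper (not part of base R): multiset difference via counts.
count_diff <- function(x, y) {
  lv <- sort(unique(x))                  # distinct values of x, sorted
  cx <- table(factor(x, levels = lv))    # how often each value occurs in x
  cy <- table(factor(y, levels = lv))    # same for y; values not in x are ignored
  # Surplus of x over y per value, never below zero.
  surplus <- pmax(as.vector(cx - cy), 0L)
  rep(lv, times = surplus)
}

x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,1,1,1,3)
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1)
print(count_diff(x, y))  # 1 1 2 3
```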


Tags: r vector