How to create a binary vector with 1 if elements a

Posted 2019-01-20 09:54

Question:

I would like to create a so-called matching vector consisting of binary values. All entries should be zero unless the elements belong to the same group.

Here's an example:

dataset=c("a","b","c","d","x","y","z")
var1=c("a","b","y","z")
var2=c("c","d","x")

Thus, I have a dataset with all the variables in the first line. Now I create two groups: var1 and var2.

The matching vector for the element "a" is supposed to look like:

matching_a=c(1,1,0,0,0,1,1)

The numbers correspond to my dataset. If the variables in my dataset are in the same group, there should be a 1 in my matching vector, and a 0 otherwise.

However, my actual data set is too big to do this manually. Does anyone understand what I want to do?

Answer 1:

Use the ifelse function together with the %in% operator:

matching_a <-  ifelse(dataset %in% var1, 1, 0)

matching_a
# [1] 1 1 0 0 0 1 1
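
The line above hard-codes var1 as the group of "a". As a minimal sketch of how the lookup could be automated for any element, the snippet below first finds the group that contains the element and then builds the vector. The helper name matching_for and the list name groups are hypothetical and do not appear in the original answer; the sketch also assumes each element belongs to at most one group.

```r
dataset <- c("a", "b", "c", "d", "x", "y", "z")
var1 <- c("a", "b", "y", "z")
var2 <- c("c", "d", "x")
groups <- list(var1, var2)

# Hypothetical helper: returns the 0/1 matching vector for one element.
matching_for <- function(element, groups, dataset) {
  # Indices of the groups that contain the element
  hit <- which(vapply(groups, function(g) element %in% g, logical(1)))
  # Element is in no group: return all zeros
  if (length(hit) == 0) return(rep(0, length(dataset)))
  # Mark every dataset entry that is in the first matching group
  ifelse(dataset %in% groups[[hit[1]]], 1, 0)
}

matching_for("a", groups, dataset)
# [1] 1 1 0 0 0 1 1
matching_for("x", groups, dataset)
# [1] 0 0 1 1 1 0 0
```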


Answer 2:

> output1 = 1 * dataset %in% var1
> output2 = 1 * dataset %in% var2
> output1
[1] 1 1 0 0 0 1 1
> output2
[1] 0 0 1 1 1 0 0

Also, if you have many more matches to make than var1 and var2, it'll be useful to extend this to something like:

> vars = list(var1, var2)
> 1 * sapply(vars, function(x) dataset %in% x)
     [,1] [,2]
[1,]    1    0
[2,]    1    0
[3,]    0    1
[4,]    0    1
[5,]    0    1
[6,]    1    0
[7,]    1    0
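
With many groups, the unlabeled matrix above can be hard to read. A small sketch, assuming you are free to name the list elements (the names var1 and var2 in the list and the object name m are my choice, not part of the original answer): sapply carries the list names over as column names, and the rows can be labeled with the dataset entries.

```r
dataset <- c("a", "b", "c", "d", "x", "y", "z")
var1 <- c("a", "b", "y", "z")
var2 <- c("c", "d", "x")

# Named list: sapply() uses the names as column names of the result
vars <- list(var1 = var1, var2 = var2)
m <- 1 * sapply(vars, function(x) dataset %in% x)

# Label each row with the corresponding dataset entry
rownames(m) <- dataset
m

# Look up a single cell by name: is "x" in var2?
m["x", "var2"]
# [1] 1
```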


Answer 3:

I see that John Colby has already taken the path I was going to suggest, but thought I would make it more explicit.

The dyadic function %in% returns a logical vector, and multiplying it by 1 coerces it to "numeric" mode. This could also be done with:

matching_a <- as.numeric(dataset %in% var1) # Or

matching_a <- 0 + (dataset %in% var1)

You should also look at ?match on which the %in% function is based.
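
To make the relationship concrete, here is a short sketch of how match could be used directly. The variable name idx is only illustrative. match returns the position of each dataset entry in var1, or NA when there is none, so a non-NA position means "in the group".

```r
dataset <- c("a", "b", "c", "d", "x", "y", "z")
var1 <- c("a", "b", "y", "z")

# Position of each dataset entry within var1 (NA if absent)
idx <- match(dataset, var1)
idx
# [1]  1  2 NA NA NA  3  4

# A non-NA position means the entry belongs to var1
matching_a <- as.numeric(!is.na(idx))
matching_a
# [1] 1 1 0 0 0 1 1

# %in% gives the same answer as match() with nomatch = 0
identical(dataset %in% var1, match(dataset, var1, nomatch = 0) > 0)
# [1] TRUE
```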



Answer 4:

I used a slight variation of John's approach above (and Max's solution) to generate a list of 'binary vectors' (for multiple matches) as follows:

# lapply() is base R, so plyr does not need to be loaded for the code below

dataset<-c("a","b","c","d","x","y","z")
var1<-c("a","b","y","z")
var2<-c("c","d","x")
vars <- list(var1, var2)

binaryLst <- lapply(vars ,function(x){ifelse(dataset %in% x, 1, 0)})

output:

> binaryLst
[[1]]
[1] 1 1 0 0 0 1 1

[[2]]
[1] 0 0 1 1 1 0 0