R: find vector in list of vectors

Posted 2019-02-24 09:33

Question:

I'm working with R and my goal is to check whether a given vector is in a list of unique vectors.

The list looks like

final_states <- list(c("x" = 5, "y" = 1),
                 c("x" = 5, "y" = 2),
                 c("x" = 5, "y" = 3),
                 c("x" = 5, "y" = 4),
                 c("x" = 5, "y" = 5),
                 c("x" = 3, "y" = 5))

Now I want to check whether a given state is in the list. For example:

state <- c("x" = 5, "y" = 3)

As you can see, the vector state is an element of the list final_states. My idea was to check it with the %in% operator:

state %in% final_states

But I get this result:

[1] FALSE FALSE

Can anyone tell me what is wrong?

Greets, lupi

Answer 1:

If you just want to determine if the vector is in the list, try

Position(function(x) identical(x, state), final_states, nomatch = 0) > 0
# [1] TRUE

Position() basically works like match(), but on a list: it returns the index of the first element for which the function returns TRUE. If you set nomatch = 0 and check whether the result is > 0, you get a single logical value telling you whether state is in final_states.
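
As a rough sketch of how this could be wrapped up for reuse, the helper below returns a single TRUE/FALSE. The name contains_vector is made up for this example and is not part of base R. The last line shows that, with the default nomatch, Position() returns the index of the first match instead.

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

# Hypothetical helper: TRUE if `vec` is identical to some element of `lst`
contains_vector <- function(vec, lst) {
  Position(function(x) identical(x, vec), lst, nomatch = 0) > 0
}

in_list <- contains_vector(state, final_states)            # TRUE
not_in  <- contains_vector(c(x = 1, y = 1), final_states)  # FALSE

# With the default nomatch, the index of the first match is returned
first_index <- Position(function(x) identical(x, state), final_states)  # 3
```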



Answer 2:

"final_states" is a "list", so you could wrap "state" in a list and then do

final_states %in% list(state)
#[1] FALSE FALSE  TRUE FALSE FALSE FALSE
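
The direction of the comparison matters because x %in% table returns one logical value per element of x. With the bare vector on the left, R asks about the numbers 5 and 3 separately, which is where the two FALSE values in the question come from. A small sketch of the difference, using the data from the question:

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

# One result per number in `state`, so the length is 2
per_number <- state %in% final_states

# One result per list element, so the length is 6
per_element <- final_states %in% list(state)

# One result for the vector as a whole, so the length is 1
whole <- list(state) %in% final_states

lengths_seen <- c(length(per_number), length(per_element), length(whole))
```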

or use mapply to check whether all the elements in "state" are present in each of the list elements of "final_states" (assuming that the lengths are the same for the vector and the list elements)

 f1 <- function(x,y) all(x==y)
 mapply(f1, final_states, list(state))
 #[1] FALSE FALSE  TRUE FALSE FALSE FALSE
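
Note that all(x == y) and identical() are not interchangeable. == compares values only, so it ignores names and the integer/double distinction, and it recycles the shorter vector, whereas identical() requires the two objects to be exactly the same. A small illustration follows; the vectors other than state, and the name f1_strict, are made up for this example.

```r
state <- c(x = 5, y = 3)

unnamed <- c(5, 3)           # same values, no names
as_int  <- c(x = 5L, y = 3L) # same values and names, integer type
longer  <- c(5, 3, 5, 3)     # `state` repeated twice

eq_unnamed <- all(unnamed == state)       # TRUE: names are ignored
id_unnamed <- identical(unnamed, state)   # FALSE: names differ

eq_int <- all(as_int == state)            # TRUE: 5L == 5
id_int <- identical(as_int, state)        # FALSE: integer vs double

eq_longer <- all(longer == state)         # TRUE: `state` is recycled

# A variant of f1 that rejects vectors of different length
f1_strict <- function(x, y) length(x) == length(y) && all(x == y)
strict_longer <- f1_strict(longer, state) # FALSE
```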

Or rbind the list elements to a matrix and then check whether "state" and the "rows" of "m1" are the same.

m1 <- do.call(rbind, final_states)
!rowSums(m1!=state[col(m1)])
#[1] FALSE FALSE  TRUE FALSE FALSE FALSE
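
The one-liner above is compact, so here is the same computation broken into steps. This is only an unpacked sketch of that expression; the intermediate names are made up.

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

# 6 x 2 matrix, one row per list element
m1 <- do.call(rbind, final_states)

# col(m1) holds the column index of every cell, so this picks state[1]
# for each cell in column 1 and state[2] for each cell in column 2
expanded <- state[col(m1)]

# Number of cells per row that differ from `state`
mismatches <- rowSums(m1 != expanded)

# A row matches when it has no differing cells
is_match <- mismatches == 0
match_row <- which(is_match)   # 3
```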

Or

 m1[,1]==state[1] & m1[,2]==state[2]
 #[1] FALSE FALSE  TRUE FALSE FALSE FALSE

Update

If you need to get a single TRUE/FALSE

  any(mapply(f1, final_states, list(state)))
  #[1] TRUE

Or

  any(final_states %in% list(state))
  #[1] TRUE

Or

 list(state) %in% final_states
 #[1] TRUE

Or use the "faster" fmatch from fastmatch

 library(fastmatch)
 fmatch(list(state), final_states) >0
 #[1] TRUE
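
One caveat with the > 0 test: when there is no match, match() returns NA by default, so the comparison yields NA rather than FALSE. The sketch below shows this with base match(). fmatch() is meant as a replacement for match(), so the same presumably applies there, but that is an assumption and is not checked here. The vector missing_state is made up for the example.

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)
missing_state <- c(x = 9, y = 9)   # not in the list

# Default nomatch is NA, and NA > 0 is NA
naive <- match(list(missing_state), final_states) > 0

# With nomatch = 0 the comparison gives a clean FALSE
safe <- match(list(missing_state), final_states, nomatch = 0) > 0

# A state that is present still gives TRUE
found <- match(list(state), final_states, nomatch = 0) > 0
```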

Benchmarks

@Richard Sciven's base R function is much faster than the other solutions, except the one using fmatch.

 set.seed(295)
 final_states <- replicate(1e6, sample(1:20, 20, replace=TRUE), 
          simplify=FALSE)
 state <- final_states[[151]]


 richard <- function() {Position(function(x) identical(x, state),
              final_states, nomatch = 0) > 0}
 Bonded <- function(){any( sapply(final_states, identical, state) )}
 akrun2 <- function() {fmatch(list(state), final_states) >0}
 akrun1 <- function() {f1 <- function(x,y) all(x==y)
            any(mapply(f1, final_states, list(state)))}

 library(microbenchmark)
 microbenchmark(richard(), Bonded(), akrun1(), akrun2(), 
        unit='relative', times=20L)
 #Unit: relative
 #      expr          min           lq        mean      median          uq         max neval cld
 # richard()     35.22635     29.47587    17.49164    15.66833    14.58235    14.62328    20   a
 #  Bonded() 109440.56885 101382.92450 55252.86141 47734.96467 44289.80309 46299.43325    20   b
 #  akrun1() 167001.23864 138812.85016 75664.91378 61417.59871 62667.94867 63890.68133    20   c
 #  akrun2()      1.00000      1.00000     1.00000     1.00000     1.00000     1.00000    20   a
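
One likely reason Position() does so well in this benchmark is that it stops at the first match, and the target here sits at position 151 out of a million, while the sapply() and mapply() versions always visit every element. A minimal sketch that counts predicate calls on a small made-up list; the counter environment is just instrumentation for this example.

```r
xs <- as.list(1:1000)

# Environment used as a mutable call counter
counter <- new.env()
counter$n <- 0L

# Position() returns as soon as the predicate is TRUE
hit <- Position(function(x) {
  counter$n <- counter$n + 1L
  x == 5L
}, xs)
calls_position <- counter$n   # 5

# sapply() evaluates the predicate for every element
counter$n <- 0L
all_hits <- sapply(xs, function(x) {
  counter$n <- counter$n + 1L
  x == 5L
})
calls_sapply <- counter$n     # 1000
```

If the target were near the end of the list, or absent, Position() would lose this advantage.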


Answer 3:

Whenever I see a list object I first think of lapply. It seems to deliver the expected result with identical as the test and 'state' as the second argument:

> lapply(final_states, identical, state)
[[1]]
[1] FALSE

[[2]]
[1] FALSE

[[3]]
[1] TRUE

[[4]]
[1] FALSE

[[5]]
[1] FALSE

[[6]]
[1] FALSE

You get a possibly useful intermediate result with:

lapply(final_states, match, state)

... but it comes back as a list of position vectors, where only c(1,2) indicates a matching element.

If you want the result to come back as a vector, say because you want to use any, then use sapply instead of lapply.

> any( sapply(final_states[-3], identical, state) )
[1] FALSE
> any( sapply(final_states, identical, state) )
[1] TRUE
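
If the result feeds into any(), vapply() is a slightly stricter alternative to sapply(), since it declares that every call must return a single logical value. It also returns an empty logical vector for an empty list, where sapply() returns an empty list. A minimal sketch with the data from the question:

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

# `state` is passed on to identical() as its second argument
found <- any(vapply(final_states, identical, logical(1), state))

# With an empty list, vapply() returns logical(0), so any() is FALSE
found_empty <- any(vapply(list(), identical, logical(1), state))
```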


Tags: r list vector find