I'm working with R, and my goal is to check whether a given vector is in a list of unique vectors.
The list looks like this:
final_states <- list(c("x" = 5, "y" = 1),
c("x" = 5, "y" = 2),
c("x" = 5, "y" = 3),
c("x" = 5, "y" = 4),
c("x" = 5, "y" = 5),
c("x" = 3, "y" = 5))
Now I want to check whether a given state is in the list. For example:
state <- c("x" = 5, "y" = 3)
As you can see, the vector state is an element of the list final_states. My idea was to check it with the %in% operator:
state %in% final_states
But I get this result:
[1] FALSE FALSE
Can anyone tell me what is wrong?
Greets,
lupi
If you just want to determine if the vector is in the list, try
Position(function(x) identical(x, state), final_states, nomatch = 0) > 0
# [1] TRUE
Position() basically works like match(), but on a list. If you set nomatch = 0 and check for Position > 0, you'll get a logical result telling you whether state is in final_states.
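If you need this check in several places, it can be wrapped in a small helper. This is only a sketch: the name vec_in_list is made up for illustration and is not part of base R. Note that identical() also compares names and storage type, so an unnamed vector with the same values does not count as a match.

```r
# Hypothetical helper (name chosen for illustration only):
# TRUE if `vec` is identical to at least one element of `lst`.
vec_in_list <- function(vec, lst) {
  Position(function(x) identical(x, vec), lst, nomatch = 0L) > 0L
}

final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))

vec_in_list(c(x = 5, y = 3), final_states)  # TRUE
vec_in_list(c(x = 1, y = 1), final_states)  # FALSE
# Same values but no names: identical() treats this as a different object
vec_in_list(c(5, 3), final_states)          # FALSE
```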
"final_states" is a list, so you could convert "state" to a list and then do
final_states %in% list(state)
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
Or use mapply to check whether all the elements of "state" equal the corresponding elements of each list element of "final_states" (assuming the vector and the list elements have the same length)
f1 <- function(x,y) all(x==y)
mapply(f1, final_states, list(state))
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
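One thing to keep in mind with the all(x == y) test: it compares values only, so it behaves differently from identical() in a few cases. The sketch below illustrates this; f2 is a hypothetical length-checked variant added here for illustration, not something from the answer above.

```r
state <- c(x = 5, y = 3)
f1 <- function(x, y) all(x == y)

# Names are ignored by ==, but not by identical()
f1(state, c(5, 3))         # TRUE
identical(state, c(5, 3))  # FALSE

# == recycles the shorter argument, so different lengths can still "match"
f1(c(5, 5), 5)             # TRUE

# Hypothetical variant that rejects vectors of different length
f2 <- function(x, y) length(x) == length(y) && all(x == y)
f2(c(5, 5), 5)             # FALSE
f2(state, c(5, 3))         # TRUE
```

Which behaviour you want depends on whether names and lengths are meaningful in your data.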
Or rbind the list elements into a matrix and then check whether "state" matches each row of "m1".
m1 <- do.call(rbind, final_states)
!rowSums(m1!=state[col(m1)])
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
Or
m1[,1]==state[1] & m1[,2]==state[2]
#[1] FALSE FALSE TRUE FALSE FALSE FALSE
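In case the state[col(m1)] step looks cryptic, here is the same idea broken into steps, as a sketch using the data from the question. col(m1) holds the column index of every cell, so state[col(m1)] repeats state[1] down the first column and state[2] down the second. The transposed form at the end is just an alternative I am assuming you might find easier to read.

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

m1 <- do.call(rbind, final_states)  # 6 x 2 matrix, one state per row

# TRUE wherever a cell differs from the corresponding element of `state`
mismatch <- m1 != state[col(m1)]

# A row matches when it has zero mismatching cells
row_match <- rowSums(mismatch) == 0
row_match

# Alternative: after transposing, `state` recycles down each column
col_match <- colSums(t(m1) != state) == 0
```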
Update
If you need to get a single TRUE/FALSE
any(mapply(f1, final_states, list(state)))
#[1] TRUE
Or
any(final_states %in% list(state))
#[1] TRUE
Or
list(state) %in% final_states
#[1] TRUE
Or use the "faster" fmatch from fastmatch
library(fastmatch)
# nomatch = 0L gives FALSE instead of NA when there is no match
fmatch(list(state), final_states, nomatch = 0L) > 0
#[1] TRUE
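As for why the original attempt returned two values: %in% is defined as match(x, table, nomatch = 0L) > 0L, and it returns one result per element of x. state has two elements (5 and 3), so each number is looked up in the list separately, and neither is itself an element of final_states. Wrapping state in list() makes it a single element. A short sketch with the data from the question:

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

length(state)        # 2: two separate lookups, one per number
length(list(state))  # 1: a single lookup of the whole vector

res_vec  <- state %in% final_states        # FALSE FALSE
res_list <- list(state) %in% final_states  # TRUE
```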
Benchmarks
@Richard Sciven's base R function is very fast compared to other solutions except the one with fmatch.
set.seed(295)
final_states <- replicate(1e6, sample(1:20, 20, replace=TRUE),
simplify=FALSE)
state <- final_states[[151]]
richard <- function() {Position(function(x) identical(x, state),
final_states, nomatch = 0) > 0}
Bonded <- function(){any( sapply(final_states, identical, state) )}
akrun2 <- function() {fmatch(list(state), final_states) >0}
akrun1 <- function() {f1 <- function(x,y) all(x==y)
any(mapply(f1, final_states, list(state)))}
library(microbenchmark)
microbenchmark(richard(), Bonded(), akrun1(), akrun2(),
unit='relative', times=20L)
#Unit: relative
#      expr          min           lq        mean      median          uq         max neval cld
# richard()     35.22635     29.47587    17.49164    15.66833    14.58235    14.62328    20 a
#  Bonded() 109440.56885 101382.92450 55252.86141 47734.96467 44289.80309 46299.43325    20  b
#  akrun1() 167001.23864 138812.85016 75664.91378 61417.59871 62667.94867 63890.68133    20   c
#  akrun2()      1.00000      1.00000     1.00000     1.00000     1.00000     1.00000    20 a
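One likely reason Position() does so well against the sapply/mapply versions in this benchmark is that it stops at the first match, and state sits at position 151 of the list, while sapply and mapply visit every element. The sketch below is not a timing, just a count of how often the test function gets called on the small list from the question; the names counted, calls_position and calls_sapply are made up for illustration.

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

# Test function that counts its own calls
calls <- 0L
counted <- function(x) {
  calls <<- calls + 1L
  identical(x, state)
}

# Position() returns as soon as the test is TRUE
Position(counted, final_states, nomatch = 0L) > 0
calls_position <- calls  # 3: stopped at the third element

# sapply() always evaluates every element
calls <- 0L
any(sapply(final_states, counted))
calls_sapply <- calls    # 6: all elements visited
```

If the matching element sits near the end of the list, or is absent, the early exit helps much less.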
Whenever I see a list object I first think of lapply. It seems to deliver the expected result with identical as the test and 'state' as the second argument:
> lapply(final_states, identical, state)
[[1]]
[1] FALSE
[[2]]
[1] FALSE
[[3]]
[1] TRUE
[[4]]
[1] FALSE
[[5]]
[1] FALSE
[[6]]
[1] FALSE
You get a possibly useful intermediate result with:
lapply(final_states, match, state)
... but it comes back as a list of position vectors, in which c(1,2) marks the matching element.
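To make that concrete, here is a sketch of what match returns for the data in the question and how the c(1,2) pattern could be picked out. Comparing against seq_along(state) is my own assumption about how one might use the intermediate result; it breaks when state contains repeated values (match always reports the first position), which is one more reason to prefer identical.

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

res <- lapply(final_states, match, state)
res[[1]]  # 1 NA: 5 is found at position 1 of state, 1 is not found
res[[3]]  # 1 2 : both values found, in order
res[[6]]  # 2 1 : both values found, but swapped

# Keep only elements whose positions are exactly 1, 2, ..., length(state)
hit <- sapply(res, identical, seq_along(state))
which(hit)  # 3
```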
If you want the result to come back as a vector, say because you want to use any, then use sapply instead of lapply.
> any( sapply(final_states[-3], identical, state) )
[1] FALSE
> any( sapply(final_states, identical, state) )
[1] TRUE
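A variant worth considering is vapply, which works like sapply but lets you declare that each result must be a single logical value, so an unexpected return type raises an error instead of silently changing the shape of the result. A minimal sketch with the data from the question:

```r
final_states <- list(c(x = 5, y = 1), c(x = 5, y = 2), c(x = 5, y = 3),
                     c(x = 5, y = 4), c(x = 5, y = 5), c(x = 3, y = 5))
state <- c(x = 5, y = 3)

# logical(1) declares the expected result type for each element;
# `state` is passed on as the second argument of identical()
hits <- vapply(final_states, identical, logical(1), state)
hits       # FALSE FALSE TRUE FALSE FALSE FALSE
any(hits)  # TRUE
```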