I want a function whose input is a vector of 1s, 2s, and 3s which sends 1 to .2, 2 to .4 and 3 to .5. (The output should be a vector of equal length.) How do I accomplish this?
For example, if
myVector<-c(1,2,3,2,3,3,1)
Then the function
mapVector(myVector)
should return a vector like (.2,.4,.5,.4,.5,.5,.2)
A couple of options, all using:
myVector<-c(1,2,3,2,3,3,1)
Factor
newvals <- c(.2,.4,.5)
newvals[as.factor(myVector)]
#[1] 0.2 0.4 0.5 0.4 0.5 0.5 0.2
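One caveat with the factor approach: indexing by a factor uses its integer codes, and as.factor() builds its levels only from the values actually present. The sketch below (the vector partial is made up for illustration) shows how a missing value shifts the codes, and how fixing the levels explicitly avoids that.

```r
newvals <- c(.2, .4, .5)
partial <- c(1, 3, 3)   # hypothetical input with no 2s present

# as.factor() derives the levels "1" and "3" from the data, so the
# integer codes are 1, 2, 2 and the value 3 is wrongly mapped to .4
wrong <- newvals[as.factor(partial)]

# Stating the levels explicitly keeps code k tied to value k
right <- newvals[factor(partial, levels = 1:3)]

print(wrong)
#[1] 0.2 0.4 0.4
print(right)
#[1] 0.2 0.5 0.5
```

If the input might not contain every value, the named-vector or lookup-table options are the safer choice.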
Named vector
newvals <- c(`1`=.2,`2`=.4,`3`=.5)
newvals
#  1   2   3
#0.2 0.4 0.5
newvals[as.character(myVector)]
#  1   2   3   2   3   3   1
#0.2 0.4 0.5 0.4 0.5 0.5 0.2
Lookup table
mapdf <- data.frame(old=c(1,2,3),new=c(.2,.4,.5))
mapdf$new[match(myVector,mapdf$old)]
#[1] 0.2 0.4 0.5 0.4 0.5 0.5 0.2
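To get the mapVector function the question asks for, the lookup can be wrapped up as below. This is only a minimal sketch: the argument names old and new and their defaults are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper: the name mapVector comes from the question;
# the 'old' and 'new' arguments are assumptions for illustration.
mapVector <- function(x, old = c(1, 2, 3), new = c(.2, .4, .5)) {
  stopifnot(length(old) == length(new))
  # match() gives the position of each element of x in 'old'
  # (NA when a value is not in the table)
  new[match(x, old)]
}

myVector <- c(1, 2, 3, 2, 3, 3, 1)
result <- mapVector(myVector)
print(result)
#[1] 0.2 0.4 0.5 0.4 0.5 0.5 0.2
```

Values that are not in the table come back as NA rather than raising an error, which may or may not be what you want.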
Benchmarks on a vector of one million values, to quantify @Joe's comment below and address @Ananda's comment as well.
library(microbenchmark)
myVector <- c(1,2,3,2,3,3,1)
# setup for the benchmarking
test <- sample(myVector,1e6,replace=TRUE)
newvals <- c(.2,.4,.5)
newvalsvec <- c(`1`=.2,`2`=.4,`3`=.5)
mapdf <- data.frame(old=c(1,2,3),new=c(.2,.4,.5))
microbenchmark(
  factor = newvals[as.factor(test)],
  namedvector = newvalsvec[as.character(test)],
  lookup = mapdf$new[match(test,mapdf$old)],
  newvals[test],   # direct positional indexing
  times=10L
)
#Unit: milliseconds
#          expr        min         lq     median         uq        max
#        factor 1863.40146 1876.04197 1890.99147 1913.13046 2014.23609
#   namedvector 1809.26883 1812.76272 1837.18852 1851.42954 1858.44996
#        lookup   38.48697   38.83405   39.90146   69.65140   71.75051
# newvals[test]   34.07380   34.55885   50.61287   65.69495   66.08699
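The last row of the benchmark, newvals[test], is plain positional indexing. As a small sketch on the original vector, it works here only because the old values happen to be exactly the positions 1, 2, 3 of the replacement vector; for arbitrary old values, match() is the general form.

```r
myVector <- c(1, 2, 3, 2, 3, 3, 1)
newvals <- c(.2, .4, .5)

# Each element of myVector is used directly as an index into newvals
direct <- newvals[myVector]
print(direct)
#[1] 0.2 0.4 0.5 0.4 0.5 0.5 0.2
```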
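Another option is the hash package, which stores the mapping as key-value pairs:
install.packages("hash")   # only needed once
library(hash)
h <- hash(1:3, c(.2,.4,.5))   # keys are stored as the strings "1", "2", "3"
myVector <- c(1,2,3,2,3,3,1)
# look up each element by its character key
sapply(myVector, function(x) h[[as.character(x)]])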