Use of switch() in R to replace vector values

Posted 2019-02-04 12:22

Question:

This should be pretty simple but even after checking all documentation and on-line examples I don't get it.

I'd like to use switch() to replace the values of a character vector.

A fake, extremely simple, reproducible example:

test<-c("He is", "She has", "He has", "She is")

Let's say I want to assign "1" to sentences including the verb "to be" and "2" to sentences including the verb "to have". The following DOES NOT work:

test<-switch(test,
                "He is"=1,
                "She is"=1,
                "He has"=2,
                "She has"=2)

Error message:

Error in switch(test, `He is` = 1, `She is` = 1, `He has` = 2, `She has` = 2) : 
  EXPR must be a length 1 vector

I think EXPR is indeed a length 1 vector, so what's wrong?

I thought maybe R expected characters as replacements, but neither wrapping switch() in as.integer() nor the following works:

test<-switch(test,
                "He is"="1",
                "She is"="1",
                "He has"="2",
                "She has"="2")

Maybe it doesn't vectorize, and I should make a loop? Is that it? Would be disappointing, considering the strength of R is vectorization. Thanks in advance!

Answer 1:

The vectorised form of if is ifelse:

test <- ifelse(test == "He is", 1,
        ifelse(test == "She is", 1,
        ifelse(test == "He has", 2,
        2)))

or

test <- ifelse(test %in% c("He is", "She is"), 1, 2)

switch is basically a way of writing nested if-else tests. You should think of if and switch as control flow statements, not as data transformation operators. You use them to control the execution of an algorithm, e.g. to test for convergence or to choose which execution path to take. In most circumstances you wouldn't use them to manipulate data directly.
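To make the distinction concrete, here is a minimal sketch. The helper name summarise_by is hypothetical and not part of the question; it only illustrates switch() choosing one execution path from a single string, while the recoding itself is left to a vectorised function. Note that length(test) is 4, which is why switch(test, ...) complains about EXPR.

```r
# Hypothetical helper: switch() picks ONE branch from ONE string (control flow).
summarise_by <- function(x, method = "mean") {
  switch(method,
         mean   = mean(x),
         median = median(x),
         stop("unknown method: ", method))  # unnamed last argument = default
}

med <- summarise_by(c(1, 2, 10), "median")

# The vector from the question has four elements, not one,
# so it cannot be passed to switch() as EXPR.
test <- c("He is", "She has", "He has", "She is")
n <- length(test)

# Elementwise recoding is a job for a vectorised function instead.
codes <- ifelse(test %in% c("He is", "She is"), 1, 2)
```
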



Answer 2:

Here is one way to vectorize a function such as switch, using Vectorize():

# Data vector:
test<-c("He is", "She has", "He has", "She is")

# Vectorized SWITCH:
foo <- Vectorize(function(a) {
  switch(as.character(a),
         "He is" = 1,
         "She is" = 1,
         "He has" = 2,
         2)
}, "a")

# Result:
foo(test)

  He is She has  He has  She is 
      1       2       2       1

I hope this helps.



Answer 3:

You could try

test_out <- sapply(seq_along(test), function(x) switch(test[x],
             "He is"=1,
             "She is"=1,
             "He has"=2,
             "She has"=2))

Or, more compactly (this version also names the result after the input strings):

test_out <- sapply(test, switch,
             "He is"=1,
             "She is"=1,
             "He has"=2,
             "She has"=2)
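One caveat with both calls: switch() returns NULL when nothing matches and no default is given, so a single unexpected string makes sapply() return a list instead of a numeric vector. A possible variant, offered as a sketch rather than the answer's own method, adds a default and uses vapply() to pin the result type. The name recode_one and the extra value "They are" are made up for illustration.

```r
# "They are" is an invented value with no matching branch.
test <- c("He is", "She has", "They are")

# Hypothetical helper: recode a single string, with NA as the default.
recode_one <- function(s) {
  switch(s,
         "He is"  = ,          # empty branch falls through to the next value
         "She is" = 1,
         "He has" = ,
         "She has" = 2,
         NA_real_)             # unnamed last argument = default
}

# vapply() guarantees one numeric value per element, or fails loudly.
codes <- vapply(test, recode_one, numeric(1), USE.NAMES = FALSE)
```
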


Answer 4:

Vectorize() is a wrapper around mapply(), so it calls the function once per element, whereas ifelse() already operates on whole vectors. In terms of performance, Vectorize() is therefore likely to be slower. It is easy to vectorize an R function with the apply family, but performance often becomes an issue on large inputs. It is better to use functions that are already designed to work on whole vectors.
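A rough way to check this on your own machine is sketched below. The vector size (1e5) and the variable names are arbitrary choices, and timings depend on hardware and R version, so no numbers are claimed here; the sketch only verifies that the three approaches agree before comparing elapsed time.

```r
set.seed(1)
keys <- c("He is", "She has", "He has", "She is")
big  <- sample(keys, 1e5, replace = TRUE)   # arbitrary test size

# Named-vector lookup table.
map <- c("He is" = 1, "She is" = 1, "He has" = 2, "She has" = 2)

# switch() wrapped with Vectorize(); names dropped so results are comparable.
vswitch <- Vectorize(function(a) switch(a, "He is" = , "She is" = 1, 2),
                     USE.NAMES = FALSE)

t_lookup <- system.time(r1 <- unname(map[big]))
t_ifelse <- system.time(r2 <- ifelse(big %in% c("He is", "She is"), 1, 2))
t_switch <- system.time(r3 <- vswitch(big))

# All three approaches should produce the same recoding.
stopifnot(identical(r1, r2), identical(r1, r3))

# Elapsed seconds for each approach (values vary by machine).
c(lookup    = t_lookup[["elapsed"]],
  ifelse    = t_ifelse[["elapsed"]],
  Vectorize = t_switch[["elapsed"]])
```
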



Answer 5:

I found this approach the most readable:

# input
test <-c("He is", "She has", "He has", "She is", "Unknown", "She is")

# mapping
map <- c(
  "He is" = 1, 
  "She has" = 2, 
  "He has" = 2, 
  "She is" = 1)

answer <- map[test]

# output
answer
He is She has  He has  She is    <NA>  She is 
    1       2       2       1      NA       1 

If test is numeric, convert it to character first (e.g. map[as.character(test)]); otherwise the values are treated as positions in map rather than as names.
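A small sketch of the numeric case, with made-up codes and labels: indexing by number selects positions, so the lookup only works once the keys are converted to character. unname() is optional and just drops the names from the result.

```r
# Invented numeric input and mapping, for illustration only.
codes <- c(10, 20, 10, 30)
map   <- c("10" = "low", "20" = "high")

# Numeric indexing looks up POSITIONS 10, 20, ... and finds nothing.
by_position <- unname(map[codes])

# Character indexing looks up NAMES, which is what the mapping intends.
labels <- unname(map[as.character(codes)])

# Unmapped values come back as NA; replace them if a default is wanted.
labels_default <- ifelse(is.na(labels), "other", labels)
```
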



Answer 6:

While I usually prefer base R approaches, there is a package with a vectorized switch function.

library(broman)

switchv(c("horse", "fish", "cat", "bug"),
horse="fast",
cat="cute",
"what?")

Added, following a comment, to use the OP's data:

library(broman)

test<-c("He is", "She has", "He has", "She is")


test<-switchv(test,
                "He is"="1",
                "She is"="1",
                "He has"="2",
                "She has"="2")

test


Answer 7:

Here is a solution with recode() from the car package:

# Data vector:
x <- c("He is", "She has", "He has", "She is")

library("car")
recode(x, "'He is'=1; 'She is'=1; 'He has'=2; 'She has'=2") # or
recode(x, "c('He is', 'She is')=1; c('He has', 'She has')=2")