Changing binary variables to Yes/No

Posted 2019-07-31 23:57

Question:

I have a data frame that I'd like to analyze. The problem is that instead of Yes/No, the columns contain 1s and 0s (1 meaning Yes, 0 meaning No). How do I modify the data frame so that the 1s and 0s become Yes and No, so I can use logistic regression? I'm sure there is a simple fix that I'm not thinking of.

Thanks!

Answer 1:

Use factor(); see ?factor for details.

For example:

> set.seed(1)
> dummyVariable <- sample(c(0,1), 10, TRUE)  # bunch of 0 and 1
> newVariable <- factor(dummyVariable, levels=c(0,1), labels=c("No", "Yes"))
> newVariable  # this is now a factor, ready for regression analysis
 [1] No  No  Yes Yes No  Yes Yes Yes Yes No 
Levels: No Yes
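
The same call can be applied to a column inside a data frame. A minimal sketch, assuming a data frame named df with a 0/1 column smoker (both names are hypothetical and not from the question):

```r
# Hypothetical data frame; the column names are made up for illustration.
df <- data.frame(
  smoker = c(0, 1, 1, 0, 1, 0),
  age    = c(23, 45, 31, 52, 38, 27)
)

# Overwrite the 0/1 column with a labelled factor.
# levels = the codes found in the data, labels = the text to show instead.
df$smoker <- factor(df$smoker, levels = c(0, 1), labels = c("No", "Yes"))

str(df$smoker)    # confirms the column is now a factor with 2 levels
table(df$smoker)  # counts per label
```

Because "No" is the first level, a binomial glm() with this column as the response would treat "No" as the failure category and model the probability of "Yes".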


Answer 2:

You can also use your values as indices into the c('no','yes') vector, adding 1 because your values start at 0 while R indexing starts at 1.

This will be easy to generalize in case of more than two values, which wouldn't work so well with ifelse:

c('no','yes')[df$col+1]

or

factor(c('no','yes')[df$col+1],c('no','yes'))
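
To illustrate the generalization to more than two values, here is a small sketch. The codes 0/1/2 and the labels "low", "medium", "high" are assumptions made up for the example:

```r
# Hypothetical 0-based codes with three possible values.
codes  <- c(0, 2, 1, 1, 0, 2)
labels <- c("low", "medium", "high")

# R vectors are 1-indexed, so shift the 0-based codes by 1 before indexing.
as_text <- labels[codes + 1]

# Passing `labels` as the levels keeps the intended order
# instead of the alphabetical default.
as_factor <- factor(as_text, levels = labels)

as_factor
```

Supplying the levels explicitly matters here: without them, factor() would sort the labels alphabetically ("high", "low", "medium").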


Answer 3:

Another way to get a factor out of this:

factor(ifelse(dummyVariable, 'Yes', 'No'))
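
Note that ifelse() coerces a numeric test to logical, so any non-zero value counts as "Yes". A slightly more defensive sketch, assuming the codes are exactly 0 and 1 and may contain missing values (the vector x is hypothetical):

```r
# Hypothetical 0/1 vector with a missing value.
x <- c(0, 1, 1, NA, 0)

# Compare with 1 explicitly instead of relying on numeric-to-logical coercion,
# and fix the level order so "No" is always the first (reference) level.
y <- factor(ifelse(x == 1, "Yes", "No"), levels = c("No", "Yes"))

y  # NA in the input stays NA in the result
```

Setting levels explicitly makes no difference for "No"/"Yes", which already sort in that order, but it guards against surprises if the labels are changed later.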


Answer 4:

Try using gsub. Note that the result is a character vector rather than a factor.

dummyVariable<-gsub(0,"No",dummyVariable)
dummyVariable<-gsub(1,"Yes",dummyVariable)
dummyVariable
# [1] "No"  "No"  "Yes" "Yes" "No"  "Yes" "Yes" "Yes" "Yes" "No"
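
One caveat with unanchored patterns: they also rewrite digits inside longer numbers. The sketch below, using hypothetical values, shows the difference and then wraps the text result in factor() for modelling. It assumes the real column holds only 0 and 1:

```r
# Hypothetical values including a two-digit number, to show the pitfall.
x <- c(0, 1, 10)
loose  <- gsub(0, "No", x)     # "No" "1" "1No": the 0 inside 10 is replaced too
strict <- sub("^0$", "No", x)  # "No" "1" "10":  only an exact "0" matches

# For a strictly binary vector, anchored replacements followed by factor().
dummyVariable <- c(0, 0, 1, 1, 0)
txt <- sub("^1$", "Yes", sub("^0$", "No", dummyVariable))
result <- factor(txt, levels = c("No", "Yes"))

result
```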