I have the following data frame which is created below:
temp <- as.data.frame(with(uadm, table(prlo_state_code)))
I am looking to create 11 dummy variables. One for each of the top 10 and an 'other'. The top 10 can easily be found with:
#top10
temp <- temp[order(temp$Freq, decreasing = TRUE), ]
head(temp, n=10)
I know R is great, so I am assuming there is an easy way to auto-create (and name) the dummy variables from the top 10 and collapse the rest into a final dummy called 'other'.
Thanks in advance for any help or insight.
You rarely need dummy variables -- R silently creates them for you.
If you just want to put all the classes that are not in the top 10 together, you can simply use ifelse and %in%. If you absolutely need dummy variables, you can create them with model.matrix.
R's regression functions will build the necessary columns in the model matrix when a factor-classed variable is entered in a formula. It's all automatic. The default contrast for an unordered factor is between the first factor level and each of the other levels, so-called "treatment contrasts". Other choices are possible.
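To make the suggestions above concrete, here is a minimal sketch. The question's uadm data are not shown, so state below is a made-up stand-in for uadm$prlo_state_code (built from R's built-in state.abb), and y is a hypothetical response used only to show that lm creates the columns by itself. The names top10, grp and dummies are likewise just illustrative.

```r
# Hypothetical stand-in for uadm$prlo_state_code: 13 codes with counts 13, 12, ..., 1
state <- rep(state.abb[1:13], times = 13:1)

# Names of the 10 most frequent codes
top10 <- names(sort(table(state), decreasing = TRUE))[1:10]

# Collapse everything outside the top 10 into "other".
# as.character() guards against the column being a factor, in which case
# ifelse() would otherwise return integer codes instead of labels.
grp <- factor(ifelse(state %in% top10, as.character(state), "other"),
              levels = c(top10, "other"))

# Explicit dummies, only if you really need them: "- 1" drops the intercept,
# so every level (including "other") gets its own 0/1 column.
dummies <- model.matrix(~ grp - 1)
colnames(dummies) <- levels(grp)

# Usually unnecessary: a formula expands the factor automatically.
y <- seq_along(grp)          # hypothetical response, for illustration only
fit <- lm(y ~ grp)           # intercept + 10 treatment-contrast columns
```

Here dummies has 11 columns, one per top-10 code plus "other", while the lm fit uses the first level as the baseline and so estimates an intercept and 10 contrasts. If a different baseline is wanted, relevel(grp, ref = "other") is one way to change it.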