I have a data.frame
consisting of numeric and factor variables as seen below.
testFrame <- data.frame(First=sample(1:10, 20, replace=T),
Second=sample(1:20, 20, replace=T), Third=sample(1:10, 20, replace=T),
Fourth=rep(c("Alice","Bob","Charlie","David"), 5),
Fifth=rep(c("Edward","Frank","Georgia","Hank","Isaac"),4))
I want to build out a matrix
that assigns dummy variables to the factor and leaves the numeric variables alone.
model.matrix(~ First + Second + Third + Fourth + Fifth, data=testFrame)
As expected when running lm
this leaves out one level of each factor as the reference level. However, I want to build out a matrix
with a dummy/indicator variable for every level of all the factors. I am building this matrix for glmnet
so I am not worried about multicollinearity.
Is there a way to have model.matrix
create the dummy for every level of the factor?
dummyVars
fromcaret
could also be used. http://caret.r-forge.r-project.org/preprocess.htmlUsing the R package 'CatEncoders'