I want to reduce the number of variables when I train my model. I have a total of 784 features that I want to reduce to, let's say, 500. I can build a long string of the selected features with paste, collapsing them with +.
For example, let's say this is my vector:
val <- "pixel40+pixel46+pixel48+pixel65+pixel66+pixel67"
Then I would like to pass it to the train function like so:
Rf_model <- train(label~val, data =training, method="rf", ntree=200, na.action=na.omit)
but I get the error
model.frame.default(form = label ~ val, data = training, na.action = na.omit)
Thanks!
Luis
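In label ~ val, R treats val as a variable name; it does not expand the string stored in it. Build the formula from the string instead: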
val <- "pixel40+pixel46+pixel48+pixel65+pixel66+pixel67"
#use paste to paste the label to val
#and then use as.formula to convert to formula
form <- as.formula(paste('label ~', val))
#> form
#label ~ pixel40 + pixel46 + pixel48 + pixel65 + pixel66 + pixel67
Rf_model <- train(form, data =training, method="rf", ntree=200, na.action=na.omit)
Building the formula from a string is fine here because the case is straightforward, but for more complex formulas it can be error-prone. In such cases you can explore stats::update or the Formula package.
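If the selected features are available as a character vector rather than one pre-pasted string, base R's reformulate() can build the same kind of formula without any string parsing. This is only a minimal sketch: the vector name selected is hypothetical and not part of the original question.

```r
# Hypothetical vector of selected feature names (illustrative only)
selected <- c("pixel40", "pixel46", "pixel48", "pixel65", "pixel66", "pixel67")

# reformulate() takes the term labels and the name of the response
form <- reformulate(selected, response = "label")
print(form)

# Sanity check: the formula uses exactly the response plus the selected features
stopifnot(identical(all.vars(form), c("label", selected)))
```

The resulting form can be passed to train() in the same way as the formula built with as.formula().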
Alternatively, you could use update (although I prefer the as.formula approach above):
#> update(label ~ 1, paste('~', val) )
#label ~ pixel40 + pixel46 + pixel48 + pixel65 + pixel66 + pixel67
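With several hundred selected features, another option that avoids a very long formula is to subset the data to the response plus the chosen columns and use label ~ . instead. The sketch below uses a toy data frame and base R's model.frame() to show which columns the formula picks up; the data values, the extra column pixel99, and the names selected and training_sub are all made up for illustration.

```r
# Toy stand-in for the training data (hypothetical values)
training <- data.frame(
  label   = factor(c(0, 1, 0, 1)),
  pixel40 = c(0, 3, 5, 1),
  pixel46 = c(2, 0, 1, 4),
  pixel99 = c(9, 9, 9, 9)  # a feature we do not want in the model
)
selected <- c("pixel40", "pixel46")

# Keep only the response and the selected features
training_sub <- training[, c("label", selected)]

# "." expands to every column in the data other than the response
mf <- model.frame(label ~ ., data = training_sub)
print(names(mf))
```

Under the same assumptions, the model call would then look like train(label ~ ., data = training_sub, method = "rf", ntree = 200, na.action = na.omit).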