How to pass a character vector in the train functi

Posted 2020-03-31 03:16

Question:

I want to reduce the number of variables when I train my model. I have a total of 784 features that I want to reduce to, let's say, 500. I can build one long string from the selected features using paste with collapse = "+". For example, let's say this is my string:

val <- "pixel40+pixel46+pixel48+pixel65+pixel66+pixel67"

Then I would like to pass it to the train function like so:

Rf_model <- train(label~val, data =training, method="rf", ntree=200, na.action=na.omit)

But I get an error from this call:

model.frame.default(form = label ~ val, data = training, na.action = na.omit)

Thanks! Luis

Answer 1:

You can do it like this:

val <- "pixel40+pixel46+pixel48+pixel65+pixel66+pixel67"

# Use paste() to prepend the response ("label ~") to val,
# then as.formula() to convert the resulting string to a formula
form <- as.formula(paste('label ~', val))
#> form
#label ~ pixel40 + pixel46 + pixel48 + pixel65 + pixel66 + pixel67 

Rf_model <- train(form, data = training, method = "rf", ntree = 200, na.action = na.omit)
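
To see why the original call fails while the pasted version works, here is a minimal base-R sketch. It assumes a shortened, hypothetical version of val and does not need caret or the real training data. In label ~ val, R treats val as the literal name of a predictor rather than expanding the string it holds; all.vars() makes that visible.

```r
# Shortened, hypothetical version of the string from the question.
val <- "pixel40+pixel46+pixel48"

# Writing `val` inside a formula does NOT expand the string:
# the formula simply refers to a variable called "val".
bad <- label ~ val
all.vars(bad)
# "label" "val"

# Pasting first and converting afterwards yields the intended terms.
form <- as.formula(paste("label ~", val))
all.vars(form)
# "label" "pixel40" "pixel46" "pixel48"
```

The second formula is the one to hand to train, exactly as in the call above.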

Building the formula from a string is fine here because the formula is simple, but for more complex formulas this approach can be error-prone. In those cases you can explore stats::update or the Formula package.


Alternatively, you could use update (although I prefer the previous approach):

#> update(label ~ 1,  paste('~', val) )
#label ~ pixel40 + pixel46 + pixel48 + pixel65 + pixel66 + pixel67
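
If the selected features are already held as a character vector of column names rather than one "+"-separated string, stats::reformulate is another base-R option that skips the manual paste step. This is a sketch, not part of the answer above; the names features, form2 and form3 are made up for illustration.

```r
# Hypothetical selection of feature names, kept as a character vector.
features <- c("pixel40", "pixel46", "pixel48")

# reformulate() builds `response ~ term1 + term2 + ...` from strings.
form2 <- reformulate(features, response = "label")
form2
# label ~ pixel40 + pixel46 + pixel48

# If you only have the "+"-separated string, split it into names first.
val <- "pixel40+pixel46+pixel48"
form3 <- reformulate(strsplit(val, "+", fixed = TRUE)[[1]], response = "label")
```

Either result can be passed to train in place of form.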