Dummy variables and preProcess

Published 2020-02-28 07:27

Posted by 一夜七次

Question:

I have a data frame with some dummy variables that I want to use as a training set for glmnet.

Since I'm using glmnet, I want to center and scale the features using the preProcess option in the caret train function. However, I don't want this transformation to be applied to the dummy variables as well.

Is there a way to prevent the transformation of these variables?

Answer 1:

There's not (currently) a way to do this besides writing a custom model to do so (see the example with PLS and RF near the end).
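
As a rough illustration of the manual alternative (not the custom-model route described above, and not something caret does for you), the continuous columns can be centered and scaled by hand before training. The sketch below uses only base R and the built-in iris data; the rule "a column is a dummy if it contains only 0s and 1s" is an assumption made for this example.

```r
# Toy design matrix: two continuous predictors plus the dummy columns
# that model.matrix() creates for the Species factor (intercept dropped).
x <- model.matrix(~ Sepal.Length + Sepal.Width + Species, data = iris)[, -1]

# Assumption for this sketch: a column is a dummy if it holds only 0s and 1s.
# List the dummy columns by name instead if that rule does not fit your data.
is_dummy <- apply(x, 2, function(col) all(col %in% c(0, 1)))

# Center and scale only the non-dummy columns; dummies are left untouched.
x_scaled <- x
x_scaled[, !is_dummy] <- scale(x[, !is_dummy])
```

The result could then be passed to train without the preProcess argument. Two caveats: the centering and scaling here are computed once on the full training set rather than inside each resample, and glmnet itself standardizes predictors by default (standardize = TRUE), so that argument would need to be set to FALSE for the dummies to stay on their original scale.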

I'm working on a method to specify which variables get which pre-processing method. However, with dummy variables, this is tough since you might need to specify the names of a lot of predictors whose columns are not in the current data set. The idea is to be able to use wildcards (e.g. Species* to capture Speciesversicolor and Speciesvirginica), but the code isn't quite there yet.
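
To make the wildcard idea concrete, here is a small base R sketch of how a pattern such as Species* could be matched against the expanded column names. This is only an illustration of the matching step using glob2rx and grep; it is not caret code, and it says nothing about how caret would eventually implement the feature.

```r
# The dummy column names only exist once the factor has been expanded,
# which is why they cannot be read off the original data frame.
cols <- colnames(model.matrix(~ ., data = iris))

# glob2rx() converts the wildcard into an equivalent regular expression,
# and grep() returns the column names that match it.
dummy_cols <- grep(glob2rx("Species*"), cols, value = TRUE)
print(dummy_cols)
```

For iris this selects Speciesversicolor and Speciesvirginica, the two columns named in the answer.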

Max