I am having a hard time troubleshooting the error message below. I am trying to fit a random forest model on a Titanic
data set. Is there a way to get around this error? Is there code to check the factor levels of the predictors?
Error in predict.randomForest(my_rf_model, test1) : Type of predictors in new data
do not match that of the training data.
This is probably occurring because one of the predictor variables in test1 is a factor that contains a value not present in the training data, or is stored as a different type than in the training data (for example, character instead of factor). For example, if titanic has a column called group that can take the values A or B, but test1$group contains a value of C, then you would get that error.
For example:
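library(randomForest)
data(iris)
# Add a two-level factor predictor to the training data
iris$group <- factor(sample(c("A", "B"), nrow(iris), replace = TRUE))
rf <- randomForest(Species ~ ., data = iris)
newdat <- iris
# Overwrite the column with "C", a value the model never saw during training
newdat$group <- "C"
predict(rf, newdata = newdat)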
Error in predict.randomForest(rf, newdata = newdat) : Type of
predictors in new data do not match that of the training data.
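To check for this kind of mismatch and work around it, one option is to compare the predictor columns of the two data frames and then re-encode the factor columns of the new data with the training levels. The following is a minimal sketch under the same iris setup as above; the names train, test, predictors and mismatched are illustrative and not part of the original code, and it assumes every value in the new data also occurs in the training data (unseen values would become NA).

```r
library(randomForest)

set.seed(42)
train <- iris
train$group <- factor(sample(c("A", "B"), nrow(train), replace = TRUE))
rf <- randomForest(Species ~ ., data = train)

# New data where group is a plain character column instead of a factor
test <- iris
test$group <- "A"

# Compare the class of every predictor in the training and new data
predictors <- setdiff(names(train), "Species")
train_classes <- sapply(train[predictors], function(col) class(col)[1])
test_classes  <- sapply(test[predictors],  function(col) class(col)[1])
mismatched <- predictors[train_classes != test_classes]
print(mismatched)  # columns whose type differs between the two data frames

# For factor predictors, list any values in the new data missing from training
for (col in predictors) {
  if (is.factor(train[[col]])) {
    unseen <- setdiff(unique(as.character(test[[col]])), levels(train[[col]]))
    if (length(unseen) > 0) {
      cat(col, "has values not seen in training:", unseen, "\n")
    }
    # Re-encode with the training levels so the factor layout matches the model
    test[[col]] <- factor(test[[col]], levels = levels(train[[col]]))
  }
}

pred <- predict(rf, newdata = test)
```

If the check reports values that were not seen in training, re-encoding alone will not help: those rows need to be dropped, recoded to a known level, or the model refit on data that includes them.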