I have 30 factor levels of a predictor in my training data. My test data also has 30 factor levels for the same predictor, but some of the levels are different. randomForest does not predict unless the levels are exactly the same. It fails with this error: Error in predict.randomForest(model,test) New factor levels not present in the training data
To make the levels match, give the test column the same levels as the training column (here test and train refer to columns in the testing and training datasets).
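The code that went with this answer did not survive, so the following is only a minimal sketch of one way to do it in base R, not necessarily what the answer originally showed. The names train_col and test_col and their values are made up for illustration. Note that any test value whose level was never seen in training becomes NA, so those rows would still need handling before prediction.

```r
# Hypothetical toy columns; names and values are made up for illustration.
train_col <- factor(c("a", "b", "c", "a"))
test_col  <- factor(c("a", "c", "d"))   # "d" never appears in training

# Re-declare the test column using exactly the training levels.
test_col <- factor(test_col, levels = levels(train_col))

levels(test_col)   # now the same set of levels as train_col
is.na(test_col)    # the unseen value "d" has been turned into NA
```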
One workaround I've found is to first convert the factor variables in your train and test sets into characters.
Then add a column to each with a flag for test/train (the isTest column).
Then rbind them.
Then convert back to a factor.
This will ensure that both the test and train sets have the same levels. Then you can split back off:
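The steps above, from the character conversion through the split, can be sketched as follows. This is an illustration under assumptions: the data frames and the column names id, grp and y are invented, and only the isTest flag name comes from the answer itself. Both sets are assumed to have the same columns so that rbind works.

```r
# Hypothetical toy data; column names (id, grp, y) are made up for illustration.
train <- data.frame(id = 1:4, grp = factor(c("a", "b", "c", "a")),
                    y = c(1.2, 0.7, 3.1, 1.0))
test  <- data.frame(id = 5:7, grp = factor(c("a", "c", "d")),
                    y = c(0.9, 2.8, 1.5))

# 1. Convert the factor column to character in both sets.
train$grp <- as.character(train$grp)
test$grp  <- as.character(test$grp)

# 2. Flag which set each row came from.
train$isTest <- FALSE
test$isTest  <- TRUE

# 3. Stack the two sets.
full <- rbind(train, test)

# 4. Convert back to a factor; the levels now cover both sets.
full$grp <- factor(full$grp)

# 5. Split back into train and test using the flag.
train <- full[!full$isTest, ]
test  <- full[full$isTest, ]
```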
After that, you can drop (set to NULL) the isTest column from each. Then you'll have sets with identical levels that you can train and test on. There might be a more elegant solution, but this has worked for me in the past, and you can write it into a little function if you need to repeat it often.

A simple solution to this would be to rbind your test data with your training data, do the prediction, and then subset the predictions for the test rows. This is a tested method.
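If the workaround is needed repeatedly, the "little function" mentioned above might look something like this. It is a hypothetical helper, not part of randomForest or any other package. The name harmonize_levels is invented here, and it assumes both data frames have the same columns and that the factor columns of interest are stored as factors in the training set.

```r
# Hypothetical helper (not from any package): returns train and test
# with identical factor levels, using the rbind workaround.
harmonize_levels <- function(train, test) {
  # Columns stored as factors in the training set.
  fac <- names(train)[vapply(train, is.factor, logical(1))]

  # Factor -> character, so the combined column can get fresh levels.
  for (nm in fac) {
    train[[nm]] <- as.character(train[[nm]])
    test[[nm]]  <- as.character(test[[nm]])
  }

  # Flag the origin of each row, then stack the two sets.
  train$isTest <- FALSE
  test$isTest  <- TRUE
  full <- rbind(train, test)

  # Character -> factor on the combined data: levels cover both sets.
  for (nm in fac) {
    full[[nm]] <- factor(full[[nm]])
  }

  # Split back and drop the helper flag column.
  keep <- setdiff(names(full), "isTest")
  list(train = full[!full$isTest, keep, drop = FALSE],
       test  = full[full$isTest, keep, drop = FALSE])
}

# Usage with made-up data:
train <- data.frame(grp = factor(c("a", "b", "c")), y = c(1, 2, 3))
test  <- data.frame(grp = factor(c("a", "d")), y = c(4, 5))
sets  <- harmonize_levels(train, test)
identical(levels(sets$train$grp), levels(sets$test$grp))
```

The model would then be trained on sets$train and predictions made on sets$test.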