I am very new to machine learning and am attempting the forest cover prediction competition on Kaggle, but I am getting hung up pretty early on. I get the following error when I run the code below:
Error in train.default(x, y, weights = w, ...) :
  final tuning parameters could not be determined
In addition: There were 50 or more warnings (use warnings() to see the first 50)
# Load the libraries
library(ggplot2); library(caret); library(AppliedPredictiveModeling)
library(pROC)
library(Amelia)
set.seed(1234)
# Load the forest cover dataset from the CSV file
rawdata <- read.csv("train.csv", stringsAsFactors = FALSE)
# This data won't be used in model evaluation; it will only be used for the submission
test <- read.csv("test.csv", stringsAsFactors = FALSE)
########################
### DATA PREPARATION ###
########################
# Create a training and test set for building and evaluating the model
samples <- createDataPartition(rawdata$Cover_Type, p = 0.5, list = FALSE)
data.train <- rawdata[samples, ]
data.test <- rawdata[-samples, ]
model1 <- train(as.factor(Cover_Type) ~ Elevation + Aspect + Slope + Horizontal_Distance_To_Hydrology,
                data = data.train,
                method = "rf", prox = TRUE)
The following should work:
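Here is a minimal sketch; the candidate mtry values are illustrative, not tuned:

# Supply an explicit grid of candidate mtry values so train() does not
# have to derive the final tuning parameters itself
rfGrid <- expand.grid(mtry = c(2, 3, 4))
model1 <- train(as.factor(Cover_Type) ~ Elevation + Aspect + Slope + Horizontal_Distance_To_Hydrology,
                data = data.train,
                method = "rf",
                tuneGrid = rfGrid)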
It's always better to specify the tuneGrid parameter, which is a data frame of candidate tuning values; look at ?randomForest and ?train for more information. rf has only one tuning parameter, mtry, which controls the number of features randomly sampled as candidates at each split. You can also run modelLookup to get the list of tuning parameters for any model.
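For instance (the label text below is from a recent caret version and may vary):

modelLookup("rf")
#   model parameter                         label forReg forClass probModel
# 1    rf      mtry #Randomly Selected Predictors   TRUE     TRUE      TRUE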
I too am doing Kaggle competitions and have been using the 'caret' package to help with choosing the 'best' model parameters. After getting many of these errors, I looked into the code behind the scenes and discovered a call to a function called 'class2ind', which does not exist (at least anywhere I know of). I finally found another function called 'class.ind' in the 'nnet' package, so I decided to create a local function called 'class2ind' and pop in the code from 'class.ind'. And lo and behold, it worked!
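In case it helps anyone else, a sketch of that workaround, with the function body copied from nnet::class.ind (this assumes caret's internal call needs nothing beyond the class-indicator matrix that class.ind returns):

# Local stand-in for the missing 'class2ind': builds a 0/1 indicator
# matrix with one row per observation and one column per class level
class2ind <- function(cl) {
  n <- length(cl)
  cl <- as.factor(cl)
  x <- matrix(0, n, length(levels(cl)))
  x[(1L:n) + n * (unclass(cl) - 1L)] <- 1
  dimnames(x) <- list(names(cl), levels(cl))
  x
}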