Let's say I'm using the Sonar data and I'd like to do hold-out validation in R. I partitioned the data using createFolds from the caret package: folds <- createFolds(mydata$Class, k=5).
I would then like to use exactly the fold mydata[i] as test data and train a classifier using mydata[-i] as training data.
My first thought was to use the train function, but I couldn't find any support for hold-out validation. Am I missing something here?
Also, I'd like to be able to pass exactly the pre-defined folds as a parameter, instead of letting the function partition the data. Does anyone have any thoughts?
Thanks in advance
I think that maybe you want to use 1/5th of the data as a test set and train using the other 4/5ths?
If that is the case, you should use createDataPartition first and let train do the rest. For example:
> library(caret)
> library(mlbench)
> data(Sonar)
>
> set.seed(1)
> in_train <- createDataPartition(Sonar$Class, p = 4/5, list = FALSE)
>
> training <- Sonar[ in_train,]
> testing <- Sonar[-in_train,]
>
> nrow(Sonar)
[1] 208
> nrow(training)
[1] 167
> nrow(testing)
[1] 41
>
> lda_fit <- train(Class ~ ., data = training, method = "lda")
> lda_fit
Linear Discriminant Analysis
167 samples
60 predictors
2 classes: 'M', 'R'
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 167, 167, 167, 167, 167, 167, ...
Resampling results
  Accuracy  Kappa  Accuracy SD  Kappa SD
  0.71      0.416  0.0532       0.108
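The resampling figures above are computed only on the training rows. A minimal sketch of the remaining hold-out step, scoring the fitted model on the testing rows, might look like the following. It repeats the same split and model so that it runs on its own; test_pred and cm are names introduced here for illustration and are not part of the original transcript.

```r
library(caret)
library(mlbench)
data(Sonar)

# Same stratified 4/5 vs 1/5 split as in the transcript
set.seed(1)
in_train <- createDataPartition(Sonar$Class, p = 4/5, list = FALSE)
training <- Sonar[ in_train, ]
testing  <- Sonar[-in_train, ]

# Fit on the training rows only
lda_fit <- train(Class ~ ., data = training, method = "lda")

# Predict the held-out rows and compare against the true classes
# (test_pred and cm are illustrative names)
test_pred <- predict(lda_fit, newdata = testing)
cm <- confusionMatrix(test_pred, testing$Class)

cm$table                 # confusion matrix on the hold-out set
cm$overall["Accuracy"]   # hold-out accuracy
```

The hold-out accuracy will usually differ somewhat from the bootstrap estimate printed by train, since the two are computed on different rows.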
Max
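On the second part of the question (reusing pre-defined folds rather than letting train partition the data), one possible approach is the index argument of trainControl, which takes a list of training-row indices, one element per resample. The sketch below assumes the folds are built with createFolds(..., returnTrain = TRUE) so that each element holds the training rows; the rows left out of each element are then used as that fold's test set. The names ctrl and lda_cv are illustrative.

```r
library(caret)
library(mlbench)
data(Sonar)

set.seed(1)
# returnTrain = TRUE: each list element holds the TRAINING row indices
# for one fold (the complement is that fold's held-out rows)
folds <- createFolds(Sonar$Class, k = 5, returnTrain = TRUE)

# Hand the pre-defined folds to train via trainControl(index = ...)
ctrl <- trainControl(method = "cv", number = length(folds), index = folds)

lda_cv <- train(Class ~ ., data = Sonar, method = "lda", trControl = ctrl)

# One row of held-out performance per pre-defined fold
lda_cv$resample
```

To evaluate a single fold by hand instead, the rows can be selected as Sonar[folds[[i]], ] for training and Sonar[-folds[[i]], ] for testing (note the comma, which indexes rows rather than columns).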