parRF from the caret R package does not work for me with more than one core, which is quite ironic, given that the "par" in parRF stands for parallel. I'm on a Windows machine, in case that is relevant. I checked that I'm using the latest and greatest versions of caret and doParallel.
I made a minimal example and give the results below. Any ideas?
Source code
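library(caret)
library(doParallel)
trCtrl <- trainControl(
    method = "repeatedcv"
    , number = 2
    , repeats = 5
    , allowParallel = TRUE
)
registerDoParallel(2)  # registerDoParallel(1) works, 2 fails (see output below)
train(form = Species~., data=iris, trControl = trCtrl, method="parRF")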
> library(caret)
> library(doParallel)
> trCtrl <- trainControl(
+ method = "repeatedcv"
+ , number = 2
+ , repeats = 5
+ , allowParallel = TRUE
+ )
> registerDoParallel(1)
> train(form = Species~., data=iris, trControl = trCtrl, method="parRF")
Parallel Random Forest
150 samples
4 predictors
3 classes: 'setosa', 'versicolor', 'virginica'
... some more model output, works fine!
> closeAllConnections()
> registerDoParallel(2)
> train(form = Species~., data=iris, trControl = trCtrl, method="parRF")
Error in train.default(x, y, weights = w, ...) :
final tuning parameters could not be determined
In addition: Warning messages:
1: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
There were missing values in resampled performance measures.
2: In train.default(x, y, weights = w, ...) :
missing values found in aggregated results
> closeAllConnections()
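To narrow this down, one option is to take caret out of the picture and run randomForest on the workers directly. The following is only a diagnostic sketch, not a fix: it assumes that registerDoParallel(2) on Windows starts two fresh R worker sessions, and that an error raised on a worker inside train() ends up as a missing resampling result rather than being shown. The names cl, n_workers and fits are made up for this illustration.

```r
library(doParallel)   # also attaches foreach, iterators and parallel

cl <- makeCluster(2)              # explicit two-worker cluster
registerDoParallel(cl)
n_workers <- getDoParWorkers()    # number of workers foreach will use

# Fit two small forests directly on the workers, bypassing caret::train().
# .packages loads randomForest in each worker session; with foreach's
# default error handling, an error on a worker stops the loop and is
# reported in the main session.
fits <- foreach(ntree = c(50, 50), .packages = "randomForest") %dopar% {
  randomForest(Species ~ ., data = iris, ntree = ntree)
}

stopCluster(cl)    # shut the workers down explicitly
registerDoSEQ()    # return foreach to sequential execution
```

If this runs cleanly with two workers, the workers themselves are fine and the problem is more likely in how train() sets up the parRF fit. Using stopCluster() on an explicitly created cluster is also a more targeted cleanup than closeAllConnections().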
Session Info
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] doParallel_1.0.8 iterators_1.0.7 foreach_1.4.2 e1071_1.6-3 randomForest_4.6-7 caret_6.0-30 ggplot2_1.0.0
[8] lattice_0.20-29
loaded via a namespace (and not attached):
[1] BradleyTerry2_1.0-4 brglm_0.5-9 car_2.0-20 class_7.3-10 codetools_0.2-8 colorspace_1.2-4
[7] compiler_3.1.0 digest_0.6.4 gnm_1.0-7 grid_3.1.0 gtable_0.1.2 gtools_3.4.1
[13] lme4_1.1-6 MASS_7.3-31 Matrix_1.1-3 minqa_1.2.3 munsell_0.4.2 nlme_3.1-117
[19] nnet_7.3-8 plyr_1.8.1 proto_0.3-10 qvcalc_0.8-8 Rcpp_0.11.2 RcppEigen_0.
[25] relimp_1.0-3 reshape2_1.4 scales_0.2.4 splines_3.1.0 stringr_0.6.2 tcltk_3.1.0
[31] tools_3.1.0
- With R 3.1.1 (same package versions), I get the same result.
- With R 3.0.2 and older versions of caret and doParallel, it worked (see Session Info 2).
Session Info 2:
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] e1071_1.6-1 class_7.3-9 randomForest_4.6-7 doParallel_1.0.6 iterators_1.0.6
[6] caret_5.17-7 reshape2_1.2.2 plyr_1.8 lattice_0.20-24 foreach_1.4.1
[11] cluster_1.14.4
loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.0.2 grid_3.0.2 stringr_0.6.2 tools_3.0.2