Running R's caret and nnet packages on multiple cores

Posted 2020-02-26 09:19

Question:

Can we train a neural network model in parallel on multiple cores using the foreach, nnet and caret packages?

I have only seen a parallel implementation for random forests. Is the same possible for neural networks?

I am especially interested in caret's train function, which can do a grid search for the optimal number of hidden units (size) and weight decay. This takes a long time to run on a single core.

Any help is appreciated.

Answer 1:

Are you looking to run the algorithm itself or your resampling in parallel? If it's the latter, all you have to do is register the number of cores you would like to use via registerDoMC(), and caret will run the resamples in parallel. Example:

> library(caret)
> library(doMC)
> 
> registerDoMC(4)
> tc <- trainControl(method="boot",number=25)
> train(Species~.,data=iris,method="nnet",trControl=tc)
# weights:  43
initial  value 596.751921 
iter  10 value 61.068365
iter  20 value 16.320051
iter  30 value 9.581306
iter  40 value 8.639828
iter  50 value 8.492001
iter  60 value 8.364661
iter  70 value 8.264618
iter  80 value 8.082598
iter  90 value 5.911050
iter 100 value 1.179339
final  value 1.179339 
stopped after 100 iterations
450 samples
  4 predictors
  3 classes: 'setosa', 'versicolor', 'virginica' 

No pre-processing
Resampling: Bootstrap (25 reps) 

Summary of sample sizes: 450, 450, 450, 450, 450, 450, ... 

Resampling results across tuning parameters:

  size  decay  Accuracy  Kappa  Accuracy SD  Kappa SD
  1     0      0.755     0.64   0.251        0.366   
  1     1e-04  0.834     0.758  0.275        0.401   
  1     0.1    0.964     0.946  0.0142       0.0214  
  3     0      0.961     0.941  0.0902       0.135   
  3     1e-04  0.972     0.958  0.0714       0.104   
  3     0.1    0.977     0.966  0.0108       0.0163  
  5     0      0.973     0.96   0.0579       0.0888  
  5     1e-04  0.987     0.98   0.00856      0.0129  
  5     0.1    0.978     0.966  0.0112       0.0168  

Accuracy was used to select the optimal model using  the largest value.
The final values used for the model were size = 5 and decay = 1e-04.

[Screenshot of 4 cores running; image not available]
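
To make "resampling in parallel" a bit more concrete, here is a rough sketch that does the same kind of thing by hand with foreach and nnet, without caret. It is only an illustration, not caret's actual internals: the doParallel backend, the worker count of 2, the 10 bootstrap replicates, and the fixed size = 3 / decay = 0.1 are all assumptions chosen for the example. Each replicate fits one network on a bootstrap sample and scores it on the rows left out of that sample.

```r
library(foreach)
library(doParallel)  # assumption: doParallel backend; registerDoMC() from doMC plays the same role
library(nnet)

registerDoParallel(cores = 2)  # hypothetical worker count

n_boot <- 10  # illustrative number of bootstrap replicates

# Each iteration is independent, so foreach can hand them to different workers
acc <- foreach(i = seq_len(n_boot), .combine = c, .packages = "nnet") %dopar% {
  idx <- sample(nrow(iris), replace = TRUE)   # bootstrap sample of row indices
  fit <- nnet(Species ~ ., data = iris[idx, ],
              size = 3, decay = 0.1, trace = FALSE)
  oob <- iris[-unique(idx), ]                 # rows not drawn into this sample
  mean(predict(fit, oob, type = "class") == oob$Species)
}

mean(acc)  # average out-of-bag accuracy across replicates

stopImplicitCluster()
```

Note that the parallelism here is across resamples; each individual nnet fit still runs on a single core.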



Answer 2:

doMC doesn't support R 3.2. You can use doParallel instead:

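library(caret)
library(doParallel)

# Start one worker per detected core and register it as the foreach backend
cl <- makeCluster(detectCores())
registerDoParallel(cl)

tc <- trainControl(method="boot",number=25)
train(Species~.,data=iris,method="nnet",trControl=tc)

# Shut the workers down when finished
stopCluster(cl)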
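
Since the question is about grid-searching size and decay, here is a hedged sketch of the same setup with an explicit tuning grid instead of caret's default one. The grid values, the 2 workers, and the 10 bootstrap replicates are made up for illustration; tuneGrid, allowParallel, and the trace argument passed through to nnet are the only additions to the code above.

```r
library(caret)
library(doParallel)

cl <- makeCluster(2)        # hypothetical worker count; adjust to your machine
registerDoParallel(cl)

# Illustrative grid: every combination of size and decay is evaluated
grid <- expand.grid(size  = c(1, 3, 5, 7),
                    decay = c(0, 0.001, 0.01, 0.1))

# allowParallel = TRUE is the default; it is spelled out here for clarity
tc <- trainControl(method = "boot", number = 10, allowParallel = TRUE)

fit <- train(Species ~ ., data = iris, method = "nnet",
             trControl = tc, tuneGrid = grid,
             trace = FALSE)  # trace is passed on to nnet() to silence iteration logs

stopCluster(cl)
registerDoSEQ()              # fall back to sequential execution afterwards

fit$bestTune                 # the size/decay pair with the best resampled accuracy
```

The per-combination resampling results are in fit$results, one row per grid point.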