Errors with createGrid for rf (randomForest) when

Posted 2019-08-10 20:21

Question:

When I try to create a grid of tuning parameters for training with caret, I get various errors:

> my_grid <- createGrid("rf")
Error in if (p <= len) { : argument is of length zero

> my_grid <- createGrid("rf", 4)
Error in if (p <= len) { : argument is of length zero

> my_grid <- createGrid("rf", len=4)                                        
Error in if (p <= len) { : argument is of length zero

The documentation for createGrid says:

This function creates a data frame that contains a grid of
     complexity parameters specific methods.
Usage:
       createGrid(method, len = 3, data = NULL)
Arguments:
  method: a string specifying which classification model to use. See
          'train' for a full list.
     len: an integer specifying the number of points on the grid for
          each tuning parameter.
    data: the training data (only needed in the case where the 'method'
          is 'cforest', 'earth', 'bagEarth', 'fda', 'bagFDA', 'rpart',
          'svmRadial', 'pam', 'lars2', 'rf' or 'pls'). The outcome
          should be in a column called '.outcome'.

and gives the following examples which work correctly:

 createGrid("rda", 4)
 createGrid("lm")
 createGrid("nnet")

 ## data needed for SVM with RBF:
 ## Not run:

 tmp <- iris
 names(tmp)[5] <- ".outcome"
 head(tmp)
 createGrid("svmRadial", data = tmp, len = 4)
 ## End(Not run)

Given this, what am I doing wrong?

Follow up question:

What is the relationship between the len argument of createGrid and the tuneLength argument of train? Can len and tuneLength be used together?

Other helpful threads:

In case it helps, here is a thread describing how to use createGrid with train in caret: caret::train: specify model-generation-parameters

Answer 1:

The code you pulled from the examples works fine for me (note that it also fixes the problem that existed when you posted on Rhelp):

tmp <- iris
names(tmp)[5] <- ".outcome"
head(tmp)
createGrid("svmRadial", data = tmp, len = 4)
#-------
     .sigma   .C
1 0.7500934 0.25
2 0.7500934 0.50
3 0.7500934 1.00
4 0.7500934 2.00

Edit:

>  createGrid("rf", data = tmp, len = 4)
randomForest 4.6-7
Type rfNews() to see new features/changes/bug fixes.

Attaching package: ‘randomForest’

The following object(s) are masked from ‘package:Hmisc’:

    combine

note: only 3 unique complexity parameters in default grid. Truncating the grid to 3 .

  .mtry
1     2
2     3
3     4
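
The only difference from the calls in the question is the data argument, which the documentation lists as required for 'rf'. Here is a minimal base-R sketch of why leaving it out plausibly produces "argument is of length zero". It assumes the 'rf' branch derives the number of predictors p from the columns of data, which is consistent with the `if (p <= len)` in the error message. The function sketch_rf_grid is a hypothetical stand-in, not caret's actual source.

```r
# Hypothetical stand-in for the 'rf' branch of createGrid (not caret's code).
sketch_rf_grid <- function(len = 3, data = NULL) {
  # Number of predictors: every column except '.outcome'.
  # With data = NULL, dim(NULL)[2] is NULL and NULL - 1 is numeric(0).
  p <- dim(data)[2] - 1
  if (p <= len) {
    # No more predictors than requested grid points: try each mtry from 2 to p.
    mtry <- seq(2, p)
  } else {
    # Otherwise spread at most len distinct values between 2 and p.
    mtry <- unique(floor(seq(2, p, length.out = len)))
  }
  data.frame(.mtry = mtry)
}

tmp <- iris
names(tmp)[5] <- ".outcome"

# Without data the condition has length zero, so if() stops with an error.
res <- try(sketch_rf_grid(len = 4), silent = TRUE)
inherits(res, "try-error")

# With data, p is 4, so the candidate values are 2, 3 and 4.
grid <- sketch_rf_grid(len = 4, data = tmp)
grid
```

Under the same assumption, this also accounts for the truncation note above: iris has only four predictors, so asking for len = 4 still leaves just three candidate values for .mtry.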

I say again: What problem remains?