While trying the rfe example from the "caret" package (taken from here), I keep getting this error:
Error in rfe.default(d[1:2901, ], c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, :
there should be the same number of samples in x and y
This question has been asked but its solution doesn't apply in this case.
Here's the code:
set.seed(7)
# load the library
library(mlbench)
library(caret)
# load the data
d <- read.table("d.dat")
# define the control using a random forest selection function
control <- rfeControl(functions=rfFuncs, method="cv", number=10)
# run the RFE algorithm
results <- rfe(d[1:2901, ], c(1,1,1,1, 1, 1,2,2,2, 3 ,3,3,4, 4, 4), sizes=c(1:2901), rfeControl=control)
# summarize the results
print(results)
The dataset is a data frame of 2901 rows (features) and 15 columns. The vector c(1,1,1,1,1,1,2,2,2,3,3,3,4,4,4) is the predictor for the features.
What parameter am I setting wrong?
We don't know your data, but this works with simulated data:
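A minimal sketch of that kind of check, assuming made-up data with 40 samples in rows and 30 features in columns. The names x_sim, y_sim and results_sim, the 5-fold CV and the sizes values are illustrative choices, not taken from your setup, and the randomForest package is assumed to be installed because rfFuncs relies on it:

```r
library(caret)
library(randomForest)  # rfFuncs fits random forests under the hood

set.seed(7)

# Simulated predictors: 40 samples (rows) x 30 features (columns)
n_samples  <- 40
n_features <- 30
x_sim <- as.data.frame(matrix(rnorm(n_samples * n_features),
                              nrow = n_samples, ncol = n_features))

# Simulated outcome: one class label per sample (a factor, so this is classification)
y_sim <- factor(rep(c("a", "b"), each = n_samples / 2))

# Same kind of control object as in the question, with fewer folds to keep it quick
control <- rfeControl(functions = rfFuncs, method = "cv", number = 5)

# x has one row per sample and y has one entry per sample, so rfe accepts them
results_sim <- rfe(x_sim, y_sim, sizes = c(5, 10, 20), rfeControl = control)
print(results_sim)
```

The point is only that nrow(x_sim) equals length(y_sim); the features here are pure noise, so the selected subset itself is meaningless.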
There is a convention that rows are observations and columns are features. The way you passed the x argument to rfe means you have 2901 observations, which does not match the 15 outcomes. Use the transpose function t() on your data (if it has 15 columns, of course).
The y = c(1,1,1...) vector shouldn't be called a predictor. It is the dependent variable, or outcome. The first argument is a data.frame of predictor variables.
Your problem is that the number of rows of x is not the same as the length of the vector y.
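To see the mismatch and the effect of transposing without running rfe at all, here is a small base-R sketch. d_sim is a hypothetical stand-in for your d (random numbers with the same 2901 x 15 shape), and wrapping the labels in factor() assumes they are class labels rather than numeric values:

```r
set.seed(7)

# Hypothetical stand-in for d: 2901 features in rows, 15 samples in columns
d_sim <- as.data.frame(matrix(rnorm(2901 * 15), nrow = 2901, ncol = 15))

# One outcome per sample, 15 in total
y <- factor(c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))

# 2901 rows versus 15 outcomes: this is the mismatch the error reports
nrow(d_sim) == length(y)   # FALSE

# Transpose so that each row is a sample and each column is a feature
x <- as.data.frame(t(d_sim))
dim(x)                     # 15 2901

# A cheap guard to run before calling rfe(x, y, ...)
stopifnot(nrow(x) == length(y))
```

After the transpose, x and y can be passed to rfe in place of d[1:2901, ] and the bare numeric vector.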