I am using RStudio with Kaggle's forest cover data, and I keep getting an error when trying to use the knn3 function in caret. Here is my code:
library(caret)
train <- read.csv("C:/data/forest_cover/train.csv", header=T)
trainingRows <- createDataPartition(train$Cover_Type, p=0.8, list=F)
head(trainingRows)
train_train <- train[trainingRows,]
train_test <- train[-trainingRows,]
knnfit <- knn3(train_train[,-56], train_train$Cover_Type)
This last line gives me this in the console:
Error in knn3.matrix(x, y = y, k = k, ...) : y must be a factor
As the error message states, y must be a factor (here, y is the name of the second parameter to the function). In R, a factor variable is used to represent categorical data. You can turn y into a factor with factor(y), but it will just have the levels 1:7 for your data. If you want to give more meaningful values to your factor, try passing a labels argument to factor(). That will make your function happier and give you more useful labels in the results.
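For illustration, here is a minimal sketch of that conversion. It uses a small made-up data frame (toy, with two hypothetical predictors) in place of the real train.csv so that it runs on its own, and the seven label names are an assumption taken from the Kaggle data description, so check them against your copy of the data. The knn3 call is guarded so the snippet still runs if caret is not installed.

```r
# Toy stand-in for the Kaggle training data (made-up values, two predictors only).
set.seed(1)
toy <- data.frame(
  Elevation  = runif(70, 1800, 3800),
  Slope      = runif(70, 0, 50),
  Cover_Type = rep(1:7, each = 10)   # integer class codes, as in train.csv
)

# Label names are an assumption based on the Kaggle data description.
cover_labels <- c("Spruce/Fir", "Lodgepole Pine", "Ponderosa Pine",
                  "Cottonwood/Willow", "Aspen", "Douglas-fir", "Krummholz")

# Convert the integer codes 1..7 into a labelled factor.
toy$Cover_Type <- factor(toy$Cover_Type, levels = 1:7, labels = cover_labels)

is.factor(toy$Cover_Type)
levels(toy$Cover_Type)

# With a factor response, knn3 no longer raises "y must be a factor".
if (requireNamespace("caret", quietly = TRUE)) {
  x <- toy[, names(toy) != "Cover_Type"]
  knnfit <- caret::knn3(x, toy$Cover_Type, k = 5)
}
```

With the real data, you would probably apply the same factor() call to train$Cover_Type right after read.csv and before createDataPartition, so that train_train and train_test both carry the same levels.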