I have a data frame that contains date variables (the test data and code are available here).
However, when I run rfe with functions = caretFuncs, I get the following error message:
Error in { : task 1 failed - "missing value where TRUE/FALSE needed"
In addition: Warning messages:
1: In FUN(newX[, i], ...) : NAs introduced by coercion
2: In FUN(newX[, i], ...) : NAs introduced by coercion
3: In FUN(newX[, i], ...) : NAs introduced by coercion
4: In FUN(newX[, i], ...) : NAs introduced by coercion
5: In FUN(newX[, i], ...) : NAs introduced by coercion
6: In FUN(newX[, i], ...) : NAs introduced by coercion
7: In FUN(newX[, i], ...) : NAs introduced by coercion
8: In FUN(newX[, i], ...) : NAs introduced by coercion
9: In FUN(newX[, i], ...) : NAs introduced by coercion
10: In FUN(newX[, i], ...) : NAs introduced by coercion
What can I do?
Code to reproduce the error:
library(mlbench)
library(caret)
library(maps)
library(rgdal)
library(raster)
library(sp)
library(spdep)
library(GWmodel)
library(e1071)
library(plyr)
library(kernlab)
library(zoo)
mydata <- read.csv("Realestatedata_all_delete_date.csv", header=TRUE)
mydata$estate_TransDate <- as.Date(paste(mydata$estate_TransDate,1,sep="-"),format="%Y-%m-%d")
mydata$estate_HouseDate <- as.Date(mydata$estate_HouseDate,format="%Y-%m-%d")
rfectrl <- rfeControl(functions=caretFuncs, method="cv",number=10,verbose=TRUE,returnResamp = "final")
results <- rfe(mydata[,1:48],mydata[,49],sizes = c(1:48),rfeControl=rfectrl,method = "svmRadial")
print(results)
predictors(results)
plot(results, type=c("g", "o"))
You have NAs (missing values) in mydata, in several of the input variables that you feed to the model.
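One way to see which columns are affected is to count the missing values per column. The sketch below uses a small made-up data frame (toy, with hypothetical column names) as a stand-in for mydata, since the original check is not shown here:

```r
# Toy stand-in for mydata; the column names are hypothetical.
toy <- data.frame(
  area  = c(30.5, NA, 42.0, 55.1),
  rooms = c(2, 3, NA, NA),
  price = c(100, 150, 210, 260)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))

# Show only the columns that actually contain NAs
print(na_counts[na_counts > 0])
```

On the real data, the same idea would presumably be applied to the predictor columns, e.g. colSums(is.na(mydata[, 1:48])).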
In addition, it looks like your date variables (transaction date and house date) are converted to NAs inside rfe(..). The SVM regressor does not seem to be able to deal with NAs as is. I would convert the dates to something like 'years since a given reference'.
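A minimal sketch of such a conversion follows. The reference date and the helper name years_since are assumptions made for illustration, and a year is approximated as 365.25 days:

```r
# Hypothetical reference date; any fixed date would do.
ref_date <- as.Date("1970-01-01")

# Convert a Date vector into approximate years elapsed since ref_date.
# Subtracting two Dates gives a difftime, which is read out in days here.
years_since <- function(d, ref = ref_date) {
  as.numeric(d - ref, units = "days") / 365.25
}

# Example input; a missing date stays NA after the conversion.
trans_date <- as.Date(c("2012-05-01", "2013-11-01", NA))
trans_years <- years_since(trans_date)
print(trans_years)
```

Applied to the question's data, this would look something like mydata$estate_TransDate <- years_since(mydata$estate_TransDate), and likewise for estate_HouseDate. Dates that were already NA remain NA, so they still need to be handled separately.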
I would also remove those entries with an NA in any of the columns you use as input to the regressor, and then run rfe on the cleaned data.
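The filtering step can be sketched with complete.cases(), shown here on a made-up data frame with hypothetical column names. The run_rfe() wrapper is also hypothetical: it only repeats the rfe call from the question on whatever data frame it is given, assuming the predictors are in columns 1 to 48, the outcome is in column 49, and the date columns have already been made numeric. It is defined but not run here.

```r
# Toy stand-in for mydata; the column names are hypothetical.
toy <- data.frame(
  area  = c(30.5, NA, 42.0, 55.1),
  rooms = c(2, 3, NA, 4),
  price = c(100, 150, 210, 260)
)
input_cols <- c("area", "rooms")

# Keep only rows without an NA in any of the input columns
keep <- complete.cases(toy[, input_cols])
toy_clean <- toy[keep, ]
print(toy_clean)

# Hypothetical wrapper around the rfe call from the question
# (requires the caret and kernlab packages when actually called).
run_rfe <- function(df) {
  ctrl <- caret::rfeControl(functions = caret::caretFuncs, method = "cv",
                            number = 10, verbose = TRUE,
                            returnResamp = "final")
  caret::rfe(df[, 1:48], df[, 49], sizes = 1:48,
             rfeControl = ctrl, method = "svmRadial")
}
```

With the real data, the filter would be built from mydata[, 1:48] and the result passed on, for example results <- run_rfe(mydata_clean), where mydata_clean is a hypothetical name for the filtered data frame.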
In my case, this would have taken a long time to complete, so I tested it only on one percent of the data.
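Drawing such a one percent subsample could look roughly like this. The seed and the toy data frame are assumptions for illustration only:

```r
set.seed(42)  # hypothetical seed, only to make the draw reproducible

# Toy stand-in for the cleaned data
toy <- data.frame(x = rnorm(1000), y = rnorm(1000))

# Draw about one percent of the rows without replacement (at least one row)
n_sub <- max(1, round(nrow(toy) / 100))
idx <- sample(nrow(toy), n_sub)
toy_sub <- toy[idx, ]

print(nrow(toy_sub))
```

Results from such a small subsample are only a rough check that the pipeline runs; the selected features may differ on the full data.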
The question remains what to do if you are given data to predict the price where some of the input variables are NA.