I have a data frame trainData
which contains 198 rows and looks like
Matchup Win HomeID AwayID A_TWPCT A_WST6 A_SEED B_TWPCT B_WST6 B_SEED
1 2010_1115_1457 1 1115 1457 0.531 5 16 0.567 4 16
2 2010_1124_1358 1 1124 1358 0.774 5 3 0.75 5 14
...
The testData
is similar.
In order to use SVM, I have to change the response variable Win
to a factor
. I tried the below:
trainDataSVM <- data.frame(Win=as.factor(trainData$Win), A_WST6=trainData$A_WST6, A_SEED=trainData$A_SEED, B_WST6=trainData$B_WST6, B_SEED= trainData$B_SEED,
Matchup=trainData$Matchup, HomeID=trainData$HomeID, AwayID=trainData$AwayID)
I then want to a SVM and predict the probabilities, so I tried the below
svmfit =svm (Win ~ A_WST6 + A_SEED + B_WST6 + B_SEED , data = trainDataSVM , kernel ="linear", cost =10,scale =FALSE )
#use CV with a range of cost values
set.seed (1)
tune.out = tune(svm, Win ~ A_WST6 + A_SEED + B_WST6 + B_SEED, data=trainDataSVM , kernel ="linear",ranges =list (cost=c(0.001 , 0.01 , 0.1, 1 ,5 ,10 ,100) ))
bestmod =tune.out$best.model
testDataSVM <- data.frame(Win=as.factor(testData$Win), A_WST6=testData$A_WST6, A_SEED=testData$A_SEED, B_WST6=testData$B_WST6, B_SEED= testData$B_SEED,
Matchup=testData$Matchup, HomeID=testData$HomeID, AwayID=testData$AwayID)
predictions_SVM <- predict(bestmod, testDataSVM, type = "response")
However, when I try to print out predictions_SVM
, I get the message
factor(0)
Levels: 0 1
instead of a column of probability values. What is going on?
I haven't used this much myself, but I know that the SVM algorithm itself does not produce class probabilities, only the response function (distance from hyperplane). If you look at the documentation for svm function, the argument "probability" - "logical indicating whether the model should allow for probability predictions" - is FALSE by default and you did not set it equal to TRUE. Documentation for predict.svm says similarly, argument "probability" is a "Logical indicating whether class probabilities should be computed and returned. Only possible if the model was fitted with the probability option enabled." Hope that's helpful.