factor(0) when using predict for SVM in R

Posted 2019-09-16 01:24

Question:

I have a data frame trainData which contains 198 rows and looks like

            Matchup Win HomeID AwayID A_TWPCT A_WST6 A_SEED B_TWPCT B_WST6 B_SEED
1  2010_1115_1457   1   1115   1457   0.531      5     16   0.567      4     16
2  2010_1124_1358   1   1124   1358   0.774      5      3    0.75      5     14
...

The testData is similar.

In order to use SVM, I have to change the response variable Win to a factor. I tried the below:

trainDataSVM <- data.frame(Win=as.factor(trainData$Win), A_WST6=trainData$A_WST6, A_SEED=trainData$A_SEED, B_WST6=trainData$B_WST6, B_SEED= trainData$B_SEED,
                      Matchup=trainData$Matchup, HomeID=trainData$HomeID, AwayID=trainData$AwayID)

I then want to fit an SVM and predict the probabilities, so I tried the below:

library(e1071)  # provides svm() and tune()

svmfit <- svm(Win ~ A_WST6 + A_SEED + B_WST6 + B_SEED, data = trainDataSVM,
              kernel = "linear", cost = 10, scale = FALSE)
# use CV with a range of cost values
set.seed(1)
tune.out <- tune(svm, Win ~ A_WST6 + A_SEED + B_WST6 + B_SEED, data = trainDataSVM,
                 kernel = "linear", ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100)))
bestmod <- tune.out$best.model

testDataSVM <- data.frame(Win=as.factor(testData$Win), A_WST6=testData$A_WST6, A_SEED=testData$A_SEED, B_WST6=testData$B_WST6, B_SEED= testData$B_SEED,
                       Matchup=testData$Matchup, HomeID=testData$HomeID, AwayID=testData$AwayID)

predictions_SVM <- predict(bestmod, testDataSVM, type = "response")

However, when I print predictions_SVM, I get

factor(0)
Levels: 0 1

instead of a column of probability values. What is going on?

Answer 1:

I haven't used this much myself, but I know that the SVM algorithm itself does not produce class probabilities, only the decision value (the signed distance from the hyperplane). If you look at the documentation for the svm function, the argument "probability" ("logical indicating whether the model should allow for probability predictions") is FALSE by default, and you did not set it to TRUE. The documentation for predict.svm says something similar: its argument "probability" is a "Logical indicating whether class probabilities should be computed and returned. Only possible if the model was fitted with the probability option enabled." Note also that predict.svm has no "type" argument, so type = "response" has no effect there. Hope that's helpful.
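
As a rough illustration of the point about the "probability" argument, here is a minimal sketch using e1071. The data are synthetic stand-ins invented for this example (the helper make_data and all values are assumptions, not the asker's data); only the column names are borrowed from the question. It shows probability = TRUE being passed both to svm() and to predict(), and the probabilities being read from the "probabilities" attribute of the result.

```r
library(e1071)  # provides svm() and its predict() method

# Synthetic stand-in for trainData/testData (made-up values, illustration only).
set.seed(1)
make_data <- function(n) {
  d <- data.frame(
    A_WST6 = sample(0:6, n, replace = TRUE),
    A_SEED = sample(1:16, n, replace = TRUE),
    B_WST6 = sample(0:6, n, replace = TRUE),
    B_SEED = sample(1:16, n, replace = TRUE)
  )
  # Hypothetical rule: the better (lower) seed tends to win, plus noise.
  score <- d$B_SEED - d$A_SEED + rnorm(n, sd = 4)
  d$Win <- factor(as.integer(score > 0), levels = c(0, 1))
  d
}
trainDataSVM <- make_data(198)
testDataSVM  <- make_data(50)

# probability = TRUE has to be set when the model is fitted ...
svmfit <- svm(Win ~ A_WST6 + A_SEED + B_WST6 + B_SEED,
              data = trainDataSVM, kernel = "linear", cost = 10,
              probability = TRUE)

# ... and again when predicting.
pred <- predict(svmfit, testDataSVM, probability = TRUE)

# predict() still returns a factor of class labels; the class
# probabilities are attached to it as a matrix-valued attribute.
probs <- attr(pred, "probabilities")
head(probs)
```

The columns of probs are named after the factor levels ("0" and "1"), so probs[, "1"] would give the estimated probability of a win for each test row. When using tune(), extra arguments are passed on to the training function, so adding probability = TRUE to the tune() call should carry over to the best model as well; that part is not checked in this sketch.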