I generated a random data set like this:
set.seed(1234)
df <- data.frame(replicate(10, runif(100, 0, 1)))
df$Class <- sample(c(-1, 1), 100, replace = TRUE)
df$Class <- as.factor(df$Class)
It has binary classes, 100 samples and 10 features.
I tried using svm in R (e1071 package):
library(e1071)
set.seed(1234)
model <- svm(Class ~ ., data = df, kernel = "radial", cost = 1.0,
             tolerance = 0.001, epsilon = 1.0E-12, scale = TRUE, cross = 10)
res <- predict(model, df[,-11])
table(pred=res, true=df[,11])
summary(res)
summary(df$Class)
This gave me the following confusion table:
    true
pred -1  1
  -1 49 13
  1   6 32
In the original data, the class counts are:
-1  1
55 45
whereas the model's predictions are distributed as:
-1  1
62 38
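For anyone trying to match settings across the two tools, the values that e1071 actually used can be read back from the fitted object. The following is only a minimal sketch that repeats the data and call from above; it assumes the `gamma`, `cost`, `tot.nSV`, `accuracies` and `tot.accuracy` components of the object returned by `svm()`, and `train_acc` is a name introduced here for illustration. Since `gamma` is not passed in the call, the printed value is whatever default e1071 chose (documented as 1 / data dimension).

```r
library(e1071)

# Same data and model call as in the question
set.seed(1234)
df <- data.frame(replicate(10, runif(100, 0, 1)))
df$Class <- as.factor(sample(c(-1, 1), 100, replace = TRUE))

set.seed(1234)
model <- svm(Class ~ ., data = df, kernel = "radial", cost = 1.0,
             tolerance = 0.001, epsilon = 1.0E-12, scale = TRUE, cross = 10)

# Parameters the fit actually used (gamma was not set explicitly above)
model$gamma
model$cost
model$tot.nSV

# Accuracy on the training data itself, in percent (train_acc is illustrative)
train_acc <- 100 * mean(predict(model, df[, -11]) == df$Class)
train_acc

# 10-fold cross-validation accuracy reported by svm() because cross = 10
model$accuracies
model$tot.accuracy
```

The table above is computed on the same 100 rows the model was trained on, so the cross-validated figure is probably the fairer number to set against whatever WEKA reports.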
However, when I exported this data to an .arff file and ran WEKA's SMO on it, trying to set the same parameters as described in this question:
weka.classifiers.functions.SMO -C 1.0 -L 0.001 -P 1.0E-12 -N 0 -V 10 -W 1234 -K "weka.classifiers.functions.supportVector.RBFKernel -G 0.1 -C 250007"
all 100 of WEKA's predictions are class -1, i.e. no sample is predicted to be class 1.
The two results seem extremely different.
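For reproducibility, the export step can be sketched as below. This is an assumption about one way to do it, using `write.arff()` and `read.arff()` from the `foreign` package; the file name `svm_random.arff` and the variables `arff_path` and `check` are hypothetical and only serve the example.

```r
library(foreign)

# Same data as in the question
set.seed(1234)
df <- data.frame(replicate(10, runif(100, 0, 1)))
df$Class <- as.factor(sample(c(-1, 1), 100, replace = TRUE))

# Hypothetical output location; any writable path would do
arff_path <- file.path(tempdir(), "svm_random.arff")

# The factor column is written as a nominal attribute, the rest as numeric
write.arff(df, file = arff_path)

# Read the file back to confirm that WEKA will see the same rows and columns
check <- read.arff(arff_path)
dim(check)
table(check$Class)
```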
I was wondering whether there are other parameters that differ between these two methods, or whether they are simply different implementations. If it's the latter, could you please explain how exactly they work? I know the gist of how SVMs work; I just cannot imagine why they perform so differently, and I am hesitant to decide which one to use.
Thank you very much.
Similar question here.