I am using scikit-learn to run SVC on my data:

from sklearn import svm
svc = svm.SVC(kernel='linear', C=C).fit(X, y)

How can I get the distance of each data point in X from the decision boundary?
As it happens, I am doing homework 1 of a course named Machine Learning Techniques, and it includes a problem about a point's distance to the hyperplane, even for the RBF kernel.
First, we know that SVM finds an "optimal" w for the hyperplane wx + b = 0. And the fact is that

w = \sum_{i} \alpha_i y_i \phi(x_i)

where those x_i are the so-called support vectors and those \alpha_i are their coefficients. Note that there is a \phi() around x_i: it is the transform that maps x into some high-dimensional space (for RBF, an infinite-dimensional one). And we know the kernel trick

K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)

so we can compute

\|w\|^2 = w \cdot w = \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)

without ever evaluating \phi explicitly. So the distance you want should be

dist(x) = \frac{|\sum_{i} \alpha_i y_i K(x_i, x) + b|}{\|w\|}

where \|w\| is the norm calculated above.
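A minimal sketch of this computation in scikit-learn, assuming a toy dataset from make_classification in place of the asker's unknown X and y. It relies on the fact that svc.dual_coef_ already stores \alpha_i y_i for each support vector, so the double sum for \|w\|^2 is just a quadratic form over the kernel matrix of the support vectors:

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel

# Hypothetical toy data standing in for the asker's X, y
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

gamma = 0.5
svc = svm.SVC(kernel='rbf', gamma=gamma, C=1.0).fit(X, y)

# dual_coef_ holds alpha_i * y_i for each support vector, shape (1, n_SV)
alpha = svc.dual_coef_
sv = svc.support_vectors_

# ||w||^2 = sum_ij (alpha_i y_i)(alpha_j y_j) K(x_i, x_j)  -- kernel trick,
# no explicit phi() needed even though the RBF feature space is infinite-dim
K = rbf_kernel(sv, sv, gamma=gamma)
w_norm = np.sqrt(alpha @ K @ alpha.T).item()

# decision_function returns sum_i alpha_i y_i K(x_i, x) + b;
# dividing by ||w|| gives the signed distance in feature space
dist = svc.decision_function(X) / w_norm
```

The same recipe works for any kernel; for kernel='linear' it reduces to decision_function(X) / ||coef_||.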
For a linear kernel, the decision function is f(x) = w * x + b (the boundary itself is w * x + b = 0), and the signed distance from a point x to the boundary is f(x) / ||w||.
For non-linear kernels, there is no direct way to get an absolute distance in the original input space. But you can still use the result of decision_function as a relative distance for ranking points.
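A short sketch of the linear case, assuming a hypothetical make_blobs toy set in place of the asker's data:

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_blobs

# Hypothetical toy data standing in for the asker's X, y
X, y = make_blobs(n_samples=60, centers=2, random_state=0)
svc = svm.SVC(kernel='linear', C=1.0).fit(X, y)

# decision_function returns w * x + b; dividing by ||w|| turns it
# into a signed geometric distance to the hyperplane
dist = svc.decision_function(X) / np.linalg.norm(svc.coef_)
```

For a non-linear kernel, svc.decision_function(X) alone still orders points by how far they sit from the boundary, just without absolute units.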