可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I have a very simple 1D classification problem: a list of values [0, 0.5, 2] and their associated classes [0, 1, 2]. I would like to get the classification boundaries between those classes.

Adapting the iris example (for visualization purposes), getting rid of the non-linear models:

X = np.array([[x, 1] for x in [0, 0.5, 2]]) 
Y = np.array([1, 0, 2])

C = 1.0  # SVM regularization parameter
svc = svm.SVC(kernel='linear', C=C).fit(X, Y)
lin_svc = svm.LinearSVC(C=C).fit(X, Y)

Gives the following result:

LinearSVC is returning junk (why?), but the SVC with linear kernel is working okay. So I would like to get the boundaries values, that you can graphically guess: ~0.25 and ~1.25.

That's where I'm lost: svc.coef_ returns

array([[ 0.5       ,  0.        ],
       [-1.33333333,  0.        ],
       [-1.        ,  0.        ]])

while svc.intercept_ returns array([-0.125 , 1.66666667, 1. ]). This is not explicit.

I must be missing something silly, how to obtain those values? They seem obvious to compute, that would be ridiculous to iterate over the x-axis to find the boundary...

回答1:

I had the same question and eventually found the solution in the sklearn documentation.

Given the weights W=svc.coef_[0] and the intercept I=svc.intercept_ , the decision boundary is the line

y = a*x - b

with

a = -W[0]/W[1]
b = I[0]/W[1]

回答2:

Exact boundary calculated from coef_ and intercept_

I think this is a great question and haven't been able to find a general answer to it anywhere in the documentation. This site really needs Latex, but anyway, I'll try to do my best without...

In general, a hyperplane is defined by its unit normal and an offset from the origin. So we hope to find some decision function of the form: x dot n + d > 0 (where the > may of course be replaced with >=).

In the case of the SVM Margins Example, we can manipulate the equation they start with to clarify its conceptual significance. First, let's establish the notational convenience of writing coef to represent coef_[0] and intercept to represent intercept_[0], since these arrays only have 1 value. Then some simple substitution yields the equation:

y + coef[0]*x/coef[1] + intercept/coef[1] = 0

Multiplying through by coef[1], we obtain

coef[1]*y + coef[0]*x + intercept = 0

And so we see that the coefficients and intercept function roughly as their names would imply. Applying one quick generalization of notation should make the answer clear - we will replace x and y with a single vector x.

coef[0]*x[0] + coef[1]*x[1] + intercept = 0

In general, the coef_ and intercept_ members of the svm classifier will have dimension matching the data set it was trained on, so we can extrapolate this equation to data of arbitrary dimension. And to avoid leading anyone astray, here is the final generalized decision boundary using the original variable names from the svm:

coef_[0][0]*x[0] + coef_[0][1]*x[1] + coef_[0][2]*x[2] + ... + coef_[0][n-1]*x[n-1] + intercept_[0] = 0

where the dimension of the data is n.

Or more tersely:

sum(coef_[0][i]*x[i]) + intercept_[0] = 0

where i sums over the range of the dimension of the input data.

回答3:

Get decision line from SVM, demo 1

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs
# we create 40 separable points
X, y = make_blobs(n_samples=40, centers=2, random_state=6)
# fit the model, don't regularize for illustration purposes
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
# plot the decision function
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
           linestyles=['--', '-', '--'])
# plot support vectors
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100,
           linewidth=1, facecolors='none')
plt.show()

Prints:

Approximate the separating n-1 dimensional hyperplane of an SVM, Demo 2

import numpy as np
import mlpy
from sklearn import svm
from sklearn.svm import SVC
import matplotlib.pyplot as plt
np.random.seed(0)
mean1, cov1, n1 = [1, 5], [[1,1],[1,2]], 200  # 200 samples of class 1
x1 = np.random.multivariate_normal(mean1, cov1, n1)
y1 = np.ones(n1, dtype=np.int)

mean2, cov2, n2 = [2.5, 2.5], [[1,0],[0,1]], 300 # 300 samples of class -1
x2 = np.random.multivariate_normal(mean2, cov2, n2)
y2 = 0 * np.ones(n2, dtype=np.int)
X = np.concatenate((x1, x2), axis=0) # concatenate the 1 and -1 samples
y = np.concatenate((y1, y2))
clf = svm.SVC()
#fit the hyperplane between the clouds of data, should be fast as hell
clf.fit(X, y)
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, 
    decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

production_point = [1., 2.5]

answer = clf.predict([production_point])
print("Answer: " + str(answer))
plt.plot(x1[:,0], x1[:,1], 'ob', x2[:,0], x2[:,1], 'or', markersize = 5)
colormap = ['r', 'b']
color = colormap[answer[0]]
plt.plot(production_point[0], production_point[1], 'o' + str(color), markersize=20)

#I want to draw the decision lines
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf.decision_function(xy).reshape(XX.shape)
ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
           linestyles=['--', '-', '--'])
plt.show()

Prints:

These hyperplanes are all straight as an arrow, they're just straight in higher dimensions and can't be comprehended by mere mortals confined to 3 dimensional space. These hyperplanes are cast into higher dimensions with the creative kernel functions, than flattened back into the visible dimension for your viewing pleasure. Here is a video trying to impart some intuition of what is going on in demo 2: https://www.youtube.com/watch?v=3liCbRZPrZA