Can I get a list of wrong predictions in SVM score

2019-05-25 19:32发布

问题:

We can use svm.SVC.score() to evaluate the accuracy of the SVM model. I want to get the predicted class and the actual class in case of wrong predictions. How can I achieve this in scikit-learn?

回答1:

The simplest approach is just to iterate over your predictions (and correct classifications) and do whatever you want with the output (in the following example I will just print it to stdout).

Lets assume that your data is in inputs, labels, and your trained SVM is in clf, then you can just do

predictions = clf.predict(inputs)
for input, prediction, label in zip(inputs, predictions, labels):
  if prediction != label:
    print(input, 'has been classified as ', prediction, 'and should be ', label) 


回答2:

It depends on what form you want the incorrect predictions to be in. For most use cases a confusion matrix should be sufficient.

A confusion matrix is a plot of the actual class vs the predicted class, such that the diagonal of the graph is all of the correct predictions, and the remaining cells are the incorrect predictions.

You can see a better example of a confusion matrix on sklearn's Confusion Matrix example.

If you just want a list of all of the misclassified values with their predicted and actual classes, you can do something like the following.

Just select all of the rows of data where the actual and predicted classes are not equal.

import numpy as np
import pandas as pd

X = np.array([0.1, 0.34, 0.2, 0.98])
y = np.array(["A", "B", "A", "C"])

y_pred = np.array(["A", "C", "B", "C"])

df = pd.DataFrame(X, columns=["X"])
df["actual"] = y
df["predicted"] = y_pred

incorrect = df[df["actual"] != df["predicted"]]

In this case incorrect would contain the following entries.

      X actual predicted
1  0.34      B         C
2  0.20      A         B


回答3:

You can directly make a confusion matrix using sklearn. It gives a (2*2) matrix.

from sklearn import metrics

my_matrix = metrics.confusion_matrix(Y_test, Y_predicted)

Y_test: Array of your test class

Y_predicted: Array of predictions by your model

The cells of the confusion matrix will give you: True positive values, False Positive values, False Negative values and True Negative values.

Please have a look at this.