How to calculate AUC for One Class SVM in python?

Posted 2019-05-20 21:56

Question:

I am having difficulty plotting the ROC curve and computing the AUC for OneClassSVM in Python. I am using sklearn, which generates a confusion matrix like [[tp, fp],[fn,tn]] with fn=tn=0.

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# y_test holds the true labels, y_nb_predicted the model's outputs for the test set
fpr, tpr, thresholds = roc_curve(y_test, y_nb_predicted)
roc_auc = auc(fpr, tpr)  # this generates ValueError [1]
print("Area under the ROC curve : %f" % roc_auc)
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)

I want to resolve error [1] and plot the ROC curve (with its AUC) for OneClassSVM.

[1] ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Answer 1:

Please see my answer on a similar question. The gist is:

  • OneClassSVM fundamentally doesn't support converting a decision into a probability score, so you cannot pass the necessary scores into functions that require varying a score threshold, such as for ROC or Precision-Recall curves and scores.

  • You can approximate this type of score by computing the max value of your OneClassSVM's decision function across your input data, call it MAX, and then scoring a given observation y as y_score = MAX - decision_function(y). Because decision_function is larger for inliers, this flipped score is non-negative and grows as an observation looks more like an outlier.

  • Pass these scores as y_score to functions such as average_precision_score, which accept non-thresholded scores instead of probabilities.

  • Finally, keep in mind that ROC is less meaningful for OneClassSVM specifically, because OneClassSVM is intended for situations with an expected and huge class imbalance (outliers vs. non-outliers), and ROC does not up-weight the relative success on the small number of outliers.
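
The points above can be sketched roughly as follows. This is only an illustration under stated assumptions: the synthetic data, the variable names (X_train, X_test, y_test) and the OneClassSVM parameters are made up for the example and are not from the original question, and outliers are assumed to be labelled as the positive class (1) so that a larger y_score corresponds to the positive label.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_curve, auc, average_precision_score

rng = np.random.RandomState(0)

# Synthetic data (an assumption for illustration): inliers cluster near the
# origin, outliers are spread uniformly over a much wider box.
X_train = 0.3 * rng.randn(200, 2)
X_inliers = 0.3 * rng.randn(100, 2)
X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))
X_test = np.vstack([X_inliers, X_outliers])

# Assumed labelling: outliers are the positive class (1), inliers are 0.
y_test = np.concatenate([np.zeros(len(X_inliers)),
                         np.ones(len(X_outliers))]).astype(int)

# Hypothetical parameter choices; tune them for real data.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# decision_function is larger for inliers, so flip it:
# y_score = MAX - decision_function(x), larger means more outlier-like.
decision = np.ravel(clf.decision_function(X_test))
y_score = decision.max() - decision

# Both metrics accept non-thresholded scores rather than probabilities.
fpr, tpr, thresholds = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)
avg_precision = average_precision_score(y_test, y_score)

print("Area under the ROC curve: %f" % roc_auc)
print("Average precision: %f" % avg_precision)
```

The resulting fpr and tpr arrays can be passed to plt.plot exactly as in the question. Note that roc_curve needs both classes present in y_test; if the test labels contain only one class, the false-positive or true-positive rate is undefined.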