I am trying to plot a ROC curve to evaluate the accuracy of a prediction model I developed in Python using logistic regression packages. I have computed the true positive rate as well as the false positive rate; however, I am unable to figure out how to plot these correctly using matplotlib and calculate the AUC value. How could I do that?

标签： python matplotlib plot statistics roc

8条回答

趁早两清

2楼-- · 2019-03-08 02:25

The previous answers assume that you indeed calculated TP/Sens yourself. It's a bad idea to do this manually, it's easy to make mistakes with the calculations, rather use a library function for all of this.

the plot_roc function in scikit_lean does exactly what you need: http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html

The essential part of the code is:

  for i in range(n_classes):
      fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
      roc_auc[i] = auc(fpr[i], tpr[i])

0人赞添加讨论(0) 举报

放我归山

3楼-- · 2019-03-08 02:26

Here are two ways you may try, assuming your model is an sklearn predictor:

import sklearn.metrics as metrics
# calculate the fpr and tpr for all thresholds of the classification
probs = model.predict_proba(X_test)
preds = probs[:,1]
fpr, tpr, threshold = metrics.roc_curve(y_test, preds)
roc_auc = metrics.auc(fpr, tpr)

# method I: plt
import matplotlib.pyplot as plt
plt.title('Receiver Operating Characteristic')
plt.plot(fpr, tpr, 'b', label = 'AUC = %0.2f' % roc_auc)
plt.legend(loc = 'lower right')
plt.plot([0, 1], [0, 1],'r--')
plt.xlim([0, 1])
plt.ylim([0, 1])
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate')
plt.show()

# method II: ggplot
from ggplot import *
df = pd.DataFrame(dict(fpr = fpr, tpr = tpr))
ggplot(df, aes(x = 'fpr', y = 'tpr')) + geom_line() + geom_abline(linetype = 'dashed')

or try

ggplot(df, aes(x = 'fpr', ymin = 0, ymax = 'tpr')) + geom_line(aes(y = 'tpr')) + geom_area(alpha = 0.2) + ggtitle("ROC Curve w/ AUC = %s" % str(roc_auc))

0人赞添加讨论(0) 举报

霸刀☆藐视天下

4楼-- · 2019-03-08 02:26

It is not at all clear what the problem is here, but if you have an array true_positive_rate and an array false_positive_rate, then plotting the ROC curve and getting the AUC is as simple as:

import matplotlib.pyplot as plt
import numpy as np

x = # false_positive_rate
y = # true_positive_rate 

# This is the ROC curve
plt.plot(x,y)
plt.show() 

# This is the AUC
auc = np.trapz(y,x)

0人赞添加讨论(0) 举报

来，给爷笑一个

5楼-- · 2019-03-08 02:27

from sklearn import metrics
import numpy as np
import matplotlib.pyplot as plt

y_true = # true labels
y_probas = # predicted results
fpr, tpr, thresholds = metrics.roc_curve(y_true, y_probas, pos_label=0)

# Print ROC curve
plt.plot(fpr,tpr)
plt.show() 

# Print AUC
auc = np.trapz(tpr,fpr)
print('AUC:', auc)

0人赞添加讨论(0) 举报

Animai°情兽

6楼-- · 2019-03-08 02:33

AUC curve For Binary Classification using matplotlib

from sklearn import svm, datasets
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
import matplotlib.pyplot as plt

Load Breast Cancer Dataset

breast_cancer = load_breast_cancer()

X = breast_cancer.data
y = breast_cancer.target

Split the Dataset

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.33, random_state=44)

Model

clf = LogisticRegression(penalty='l2', C=0.1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

Accuracy

print("Accuracy", metrics.accuracy_score(y_test, y_pred))

AUC Curve

y_pred_proba = clf.predict_proba(X_test)[::,1]
fpr, tpr, _ = metrics.roc_curve(y_test,  y_pred_proba)
auc = metrics.roc_auc_score(y_test, y_pred_proba)
plt.plot(fpr,tpr,label="data 1, auc="+str(auc))
plt.legend(loc=4)
plt.show()

0人赞添加讨论(0) 举报

该账号已被封号

7楼-- · 2019-03-08 02:35

Here is a python code :

import matplotlib.pyplot as plt
import numpy as np

score = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.54, 0.53, 0.52, 0.51, 0.505, 0.4, 0.39, 0.38, 0.37, 0.36, 0.35, 0.34, 0.33, 0.30, 0.1])
y = np.array([1,1,0, 1, 1, 1, 0, 0, 1, 0, 1,0, 1, 0, 0, 0, 1 , 0, 1, 0])

roc_x = []
roc_y = []
min_score = min(score)
max_score = max(score)
thr = np.linspace(min_score, max_score, 30)
FP=0
TP=0
N = sum(y)
P = len(y) - N

for (i, T) in enumerate(thr):
    for i in range(0, len(score)):
        if (score[i] > T):
            if (y[i]==1):
                TP = TP + 1
            if (y[i]==0):
                FP = FP + 1
    roc_x.append(FP/float(N))
    roc_y.append(TP/float(P))
    FP=0
    TP=0

plt.scatter(roc_x, roc_y)
plt.show()

More reference

0人赞添加讨论(0) 举报

1 2 下一页

How to plot ROC curve in Python

AUC curve For Binary Classification using matplotlib

Load Breast Cancer Dataset

Split the Dataset

Model

Accuracy

AUC Curve

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间