I have a binary classification problem where I want to calculate the roc_auc of the results. For this purpose, I did it in two different ways using sklearn. My code is as follows.
Code 1:
import numpy as np
from sklearn.metrics import make_scorer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_validate

# needs_proba=True makes the scorer use predict_proba
myscore = make_scorer(roc_auc_score, needs_proba=True)
my_value = cross_validate(clf, X, y, cv=10, scoring=myscore)
print(np.mean(my_value['test_score']))
I get the output as 0.60.
Code 2:
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_curve, auc

k_fold = 10  # same number of folds as in Code 1
y_score = cross_val_predict(clf, X, y, cv=k_fold, method="predict_proba")

fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(2):
    # one ROC curve per probability column; pos_label defaults to 1
    fpr[i], tpr[i], _ = roc_curve(y, y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
print(roc_auc)
I get the output as {0: 0.41, 1: 0.59}.
I am confused since I get two different scores from the two code snippets. Please let me know why this difference happens and what the correct way of doing this is.
I am happy to provide more details if needed.
It seems that you used a part of my code from another answer, so I thought I would also answer this question.
For a binary classification case, you have two classes, one of which is the positive class.
For example, see here.
pos_label is the label of the positive class. When pos_label=None, if y_true is in {-1, 1} or {0, 1}, pos_label is set to 1; otherwise an error is raised.
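In your Code 2, roc_auc[1] is therefore the score for the positive class, and that is the number to compare with Code 1. A minimal sketch of the positive-class-only computation, reusing clf, X, and y from your question and assuming labels in {0, 1}:

from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_curve, auc, roc_auc_score

# Pooled out-of-fold probabilities, shape (n_samples, 2);
# column 1 holds P(y = 1), the positive class for {0, 1} labels
y_score = cross_val_predict(clf, X, y, cv=10, method="predict_proba")

# Name the positive class explicitly instead of relying on the default
fpr, tpr, _ = roc_curve(y, y_score[:, 1], pos_label=1)
print(auc(fpr, tpr))  # same value as roc_auc[1] in Code 2

# Equivalent shortcut: roc_auc_score expects the positive-class scores
print(roc_auc_score(y, y_score[:, 1]))

Note that roc_auc[0] carries no extra information: the class-0 column is 1 minus the class-1 column while pos_label stays 1, so it comes out as 1 - roc_auc[1] (here 0.41 = 1 - 0.59). The remaining small gap to Code 1's 0.60 is expected rather than an error: cross_validate averages one AUC per fold, while cross_val_predict pools all folds into a single curve.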