Could anybody tell me how I could compute the Equal Error Rate (EER) from a ROC curve in Python? scikit-learn has methods to compute the ROC curve and the AUC, but I could not find one to compute the EER.
from sklearn.metrics import roc_curve, auc
ANSWER:
I think I implemented it myself.
The idea of the ROC EER is the intersection point between the ROC curve and the straight line joining (1,0) and (0,1). It is the only point where they intersect. For a straight line with intercepts a=1 and b=1, the equation is x + y = 1 (from x/a + y/b = 1.0). So the intersection point is given by the values of the true positive rate (tpr) and false positive rate (fpr) that satisfy the following equation:
x + y - 1.0 = 0.0
Thus I implemented the method as:
def compute_roc_EER(fpr, tpr):
    roc_EER = []
    cords = zip(fpr, tpr)
    for item in cords:
        item_fpr, item_tpr = item
        if item_tpr + item_fpr == 1.0:
            roc_EER.append((item_fpr, item_tpr))
    assert(len(roc_EER) == 1.0)
    return np.array(roc_EER)
So here one value is the error rate and the other value is the accuracy.
Maybe somebody could help me verify this.
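One caveat with the exact comparison x + y == 1.0: a ROC curve computed from data is a finite set of points, so it can easily happen that no point lies exactly on the line and the assertion fails. Below is a minimal sketch of one alternative, assuming the ROC is treated as a polyline and the crossing with x + y = 1 is found by linear interpolation between neighbouring points. The function name roc_eer_by_interpolation and the example arrays are made up for illustration and are not part of scikit-learn.

```python
def roc_eer_by_interpolation(fpr, tpr):
    """Return the (fpr, tpr) point where the ROC polyline crosses fpr + tpr = 1.

    Assumes fpr and tpr are ordered as a ROC curve, i.e. running
    from (0, 0) towards (1, 1).
    """
    points = list(zip(fpr, tpr))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        g0 = x0 + y0 - 1.0  # signed offset of the segment start from the line
        g1 = x1 + y1 - 1.0  # signed offset of the segment end from the line
        if g0 == 0.0:
            # The segment start lies exactly on the line.
            return x0, y0
        if g0 < 0.0 <= g1:
            # The line is crossed inside this segment: interpolate linearly.
            t = g0 / (g0 - g1)
            return x0 + t * (x1 - x0), y0 + t * (y1 - y0)
    raise ValueError("ROC curve does not cross the line fpr + tpr = 1")


# Hand-made ROC points (illustrative only); none satisfies x + y == 1 exactly.
fpr_example = [0.0, 0.0, 0.2, 0.4, 1.0]
tpr_example = [0.0, 0.6, 0.7, 1.0, 1.0]
eer_fpr, eer_tpr = roc_eer_by_interpolation(fpr_example, tpr_example)
print(eer_fpr, eer_tpr)  # approximately 0.24 and 0.76
```

The first returned value is the interpolated error rate at the crossing; the second is the true positive rate at the same point, so the two sum to 1 up to rounding.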
For anyone else who arrives here via a Google search: Fran's answer is incorrect, as Gerhard points out. The correct code would be:
fpr, tpr, threshold = roc_curve(y, y_pred, pos_label=1)
fnr = 1 - tpr
eer_threshold = threshold[np.nanargmin(np.absolute((fnr - fpr)))]
Note that this gets you the threshold at which the EER occurs, not the EER itself. The EER is defined as FPR = 1 - TPR = FNR. Thus, to get the EER (the actual error rate), you could use the following:
EER = fpr[np.nanargmin(np.absolute((fnr - fpr)))]
As a sanity check, the value should be close to
EER = fnr[np.nanargmin(np.absolute((fnr - fpr)))]
since this is an approximation.
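The nearest-point approach can be wrapped in a small helper. This is only a sketch: the name eer_from_roc is hypothetical, the inputs are assumed to be arrays shaped like those returned by roc_curve, and averaging FPR and FNR at the selected index is one common convention for reporting a single number, not something the answer above prescribes. The example arrays are hand-made so that the block runs without a dataset.

```python
import numpy as np


def eer_from_roc(fpr, tpr, thresholds):
    """Approximate the EER and its threshold from ROC arrays.

    Picks the ROC point where |FNR - FPR| is smallest and reports the
    mean of FPR and FNR there (an assumed convention).
    """
    fpr = np.asarray(fpr, dtype=float)
    fnr = 1.0 - np.asarray(tpr, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    idx = np.nanargmin(np.abs(fnr - fpr))  # index of the closest point
    eer = (fpr[idx] + fnr[idx]) / 2.0
    return eer, thresholds[idx]


# Hand-made arrays for illustration only.
fpr = [0.0, 0.0, 0.25, 0.25, 1.0]
tpr = [0.0, 0.5, 0.5, 1.0, 1.0]
thresholds = [np.inf, 0.9, 0.6, 0.4, 0.1]
eer, thr = eer_from_roc(fpr, tpr, thresholds)
print(eer, thr)
```

With few ROC points, FPR and FNR at the chosen index can differ noticeably, so comparing the two values is a quick way to see how rough the approximation is.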
Copying from "How to compute Equal Error Rate (EER) on ROC" by Changjiang:
from scipy.optimize import brentq
from scipy.interpolate import interp1d
from sklearn.metrics import roc_curve
fpr, tpr, thresholds = roc_curve(y, y_score, pos_label=1)
eer = brentq(lambda x : 1. - x - interp1d(fpr, tpr)(x), 0., 1.)
thresh = interp1d(fpr, thresholds)(eer)
That gave me the correct EER value. Also remember that, according to the documentation, y is "True binary labels in range {0, 1} or {-1, 1}. If labels are not binary, pos_label should be explicitly given", and y_score is "Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by “decision_function” on some classifiers)".
To estimate the Equal Error Rate (EER), you look for the point on the ROC that makes the TPR value equal to the FPR value, that is, TPR-FPR=0. In other words, you look for the minimum point of abs(TPR-FPR).
- First of all, you need to estimate the ROC curve:
fpr, tpr, threshold = roc_curve(y, y_pred, pos_label=1)
- To compute the EER in Python you need only one line of code:
EER = threshold[np.argmin(abs(tpr-fpr))]
"The EER is defined as FPR = 1 - TPR = FNR."
This is wrong, since FPR = 1 - TNR (True Negative Rate) and is therefore not equal to FNR.
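One way to read the two statements together: FPR = 1 - TNR and FNR = 1 - TPR are identities that hold at every threshold, whereas FPR = FNR is a condition that holds only at the EER operating point. The sketch below checks this with made-up confusion counts, chosen so that the two error rates coincide. The helper name error_rates and the numbers are hypothetical.

```python
def error_rates(tp, fp, tn, fn):
    """Compute TPR, TNR, FPR and FNR from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # true positive rate
    tnr = tn / (tn + fp)  # true negative rate
    fpr = fp / (fp + tn)  # false positive rate
    fnr = fn / (fn + tp)  # false negative rate
    return tpr, tnr, fpr, fnr


# Hypothetical counts at one threshold: 100 positives and 100 negatives.
tpr, tnr, fpr, fnr = error_rates(tp=80, fp=20, tn=80, fn=20)

# Identities that hold at any threshold (up to floating-point rounding).
print(abs(fpr - (1 - tnr)) < 1e-12, abs(fnr - (1 - tpr)) < 1e-12)

# The equality that characterises the EER point for these counts.
print(fpr == fnr)
```

At a different threshold the counts change, the two identities still hold, and FPR and FNR generally stop being equal.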