I am trying to calculate accuracy using the ROCR package in R, but the result is different from what I expected.
Assume I have the predictions of a model (p) and the labels (l) as follows:
p <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86)
l <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1)
I am calculating the accuracy of these predictions using the following commands:
library(ROCR)
pred <- prediction(p, l)
perf <- performance(pred, "acc")
max(perf@y.values[[1]])
But the result is 0.8, whereas according to the accuracy formula (TP+TN)/(TP+TN+FP+FN) it should be 0.6. I don't know why.
When you use
max(perf@y.values[[1]])
, it is computing the maximum accuracy across all possible cutoffs for predicting a positive. In your case, the optimal threshold is
p=0.2
, at which you make 2 mistakes (on the observations with predicted probabilities 0.38 and 0.78), yielding a maximum accuracy of 0.8. The 0.6 you expected is the accuracy at a fixed cutoff of 0.5. You can access the cutoffs for your perf object using
perf@x.values[[1]]
.
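If you want to check these numbers by hand, here is a minimal base-R sketch that does not need ROCR. The helper acc_at is hypothetical (my own name, not part of ROCR), and it assumes that scores at or above the cutoff are classed as positive, which is my understanding of the convention ROCR uses.

```r
# Data from the question
p <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86)
l <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1)

# Hypothetical helper (not a ROCR function): accuracy when every score
# at or above `cutoff` is predicted positive. Assumed convention: >=.
acc_at <- function(cutoff) mean((p >= cutoff) == (l == 1))

# Accuracy at a fixed cutoff of 0.5, the value expected in the question
acc_at(0.5)    # 0.6

# Try each observed score as a cutoff, plus Inf (nothing predicted positive)
cutoffs <- c(Inf, sort(unique(p), decreasing = TRUE))
accs <- sapply(cutoffs, acc_at)

best_acc <- max(accs)                     # 0.8
best_cutoff <- cutoffs[which.max(accs)]   # 0.24

# With ROCR loaded and `perf` built as in the question, the analogous lookup
# would be: perf@x.values[[1]][which.max(perf@y.values[[1]])]
```

Among the observed scores the best cutoff comes out as 0.24 rather than 0.2, but any cutoff above 0.14 and up to 0.24 classifies the ten observations identically, so both give an accuracy of 0.8.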