Obtaining threshold values from a ROC curve

I have some models, using ROCR package on a vector of the predicted class percentages, I have a performance object. Plotting the performance object with the specifications "tpr", "fpr" gives me a ROC curve.

I'm comparing models at certain thresholds of false positive rate (x). I'm hoping to get the value of the true positive rate (y) out of the performance object. Even more, I would like to get the class percentage threshold that was used to generate that point.

the index number of the false positive rate (x-value) that is closest to the threshold without being above it, should give me the index number of the appropriate true positive rate (y-value). I'm not exactly sure how to get that index value.

And more to the point, how do i get the threshold of class probabilities that was used to make that point?

标签： r roc

3条回答

Fickle 薄情

2楼-- · 2019-01-13 01:17

2 solutions based on the ROCR and pROC packages:

threshold1 <- function(predict, response) {
    perf <- ROCR::performance(ROCR::prediction(predict, response), "sens", "spec")
    df <- data.frame(cut = perf@alpha.values[[1]], sens = perf@x.values[[1]], spec = perf@y.values[[1]])
    df[which.max(df$sens + df$spec), "cut"]
}
threshold2 <- function(predict, response) {
    r <- pROC::roc(response, predict)
    r$thresholds[which.max(r$sensitivities + r$specificities)]
}
data(ROCR.simple, package = "ROCR")
threshold1(ROCR.simple$predictions, ROCR.simple$labels)
#> [1] 0.5014893
threshold2(ROCR.simple$predictions, ROCR.simple$labels)
#> [1] 0.5006387

See also OptimalCutpoints package which provides many algorithms to find an optimal thresholds.

0人赞添加讨论(0) 举报

ら.Afraid

3楼-- · 2019-01-13 01:20

Package pROC includes function coords for calculating best threshold:

library(pROC)
my_roc <- roc(my_response, my_predictor)
coords(my_roc, "best", ret = "threshold")

0人赞添加讨论(0) 举报

姐就是有狂的资本

4楼-- · 2019-01-13 01:28

This is why str is my favorite R function:

library(ROCR)
data(ROCR.simple)
pred <- prediction( ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred,"tpr","fpr")
plot(perf)
> str(perf)
Formal class 'performance' [package "ROCR"] with 6 slots
  ..@ x.name      : chr "False positive rate"
  ..@ y.name      : chr "True positive rate"
  ..@ alpha.name  : chr "Cutoff"
  ..@ x.values    :List of 1
  .. ..$ : num [1:201] 0 0 0 0 0.00935 ...
      ..@ y.values    :List of 1
      .. ..$ : num [1:201] 0 0.0108 0.0215 0.0323 0.0323 ...
  ..@ alpha.values:List of 1
  .. ..$ : num [1:201] Inf 0.991 0.985 0.985 0.983 ...

Ahah! It's an S4 class, so we can use @ to access the slots. Here's how you make a data.frame:

cutoffs <- data.frame(cut=perf@alpha.values[[1]], fpr=perf@x.values[[1]], 
                      tpr=perf@y.values[[1]])
> head(cutoffs)
        cut         fpr        tpr
1       Inf 0.000000000 0.00000000
2 0.9910964 0.000000000 0.01075269
3 0.9846673 0.000000000 0.02150538
4 0.9845992 0.000000000 0.03225806
5 0.9834944 0.009345794 0.03225806
6 0.9706413 0.009345794 0.04301075

If you have an fpr threshold you want to hit, you can subset this data.frame to find maximum tpr below this fpr threshold:

cutoffs <- cutoffs[order(cutoffs$tpr, decreasing=TRUE),]
> head(subset(cutoffs, fpr < 0.2))
          cut       fpr       tpr
96  0.5014893 0.1495327 0.8494624
97  0.4997881 0.1588785 0.8494624
98  0.4965132 0.1682243 0.8494624
99  0.4925969 0.1775701 0.8494624
100 0.4917356 0.1869159 0.8494624
101 0.4901199 0.1962617 0.8494624

0人赞添加讨论(0) 举报

Obtaining threshold values from a ROC curve

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间