-->

Precision-Recall Curve computation by PRROC package

Published 2020-05-09 17:55

Question:

My question is related to this question. I am interested in computing the Precision-Recall Curve (PRC) and the area under the PRC. I found a nice R package, PRROC, that does both. According to the package description (page 5), the function pr.curve takes two arguments: 1) the classification scores of the datapoints belonging to the positive class only, and 2) the classification scores of the datapoints belonging to the negative class only (see manual page 7). The example they provide is:

library( PRROC );
# create artificial scores as random numbers
x <- rnorm( 1000 );
y <- rnorm( 1000, -1 );
# compute PR curve
pr <- pr.curve( x, y, curve = TRUE );

My problem is that I have 14000 datapoints in the positive class and 2560595 datapoints in the negative class. With data of this size the computation has already been running for a day and I still have no results. To reproduce the problem, you can scale up the example already given:

# create artificial scores as random numbers
x <- rnorm( 14000 );
y <- rnorm( 2560595, -1 );
# compute PR curve
pr <- pr.curve( x, y, curve = TRUE );

Answer 1:

You may want to try AUPRC() from the PerfMeas package.

Edited

The precrec package seems to be a better option. It is compatible with ggplot2 and is implemented in C++. For benchmark results, please check this paper.
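
To give a rough idea of why data of this size need not take a day, here is a minimal base R sketch that builds a step-wise PR curve with one sort and two cumulative sums. It is an assumption-laden illustration, not the method used by PRROC, PerfMeas, or precrec: the helper name pr_sketch is hypothetical, and the area it returns is the plain average precision (no interpolation between points), so the value may differ slightly from what those packages report.

```r
# Hypothetical helper, not part of PRROC, PerfMeas or precrec.
# pos: scores of the positive class, neg: scores of the negative class.
pr_sketch <- function(pos, neg) {
  scores <- c(pos, neg)
  labels <- c(rep(1L, length(pos)), rep(0L, length(neg)))

  # sort once, highest score first
  ord <- order(scores, decreasing = TRUE)
  scores <- scores[ord]
  labels <- labels[ord]

  tp <- cumsum(labels)        # true positives if we cut after each rank
  fp <- cumsum(1L - labels)   # false positives at the same cut

  # for tied scores, keep only the last rank of each run of ties
  keep <- c(scores[-1] != scores[-length(scores)], TRUE)
  tp <- tp[keep]
  fp <- fp[keep]

  recall <- tp / length(pos)
  precision <- tp / (tp + fp)

  # average precision: precision weighted by the recall gained at each cut
  auprc <- sum(diff(c(0, recall)) * precision)
  list(recall = recall, precision = precision, auprc = auprc)
}

# same sizes as in the question
set.seed(1)
x <- rnorm(14000)
y <- rnorm(2560595, -1)
res <- pr_sketch(x, y)
res$auprc
```

The dominant cost is the single call to order(), so the run time grows as n log n in the total number of scores.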