My question is related to this question. I want to compute the precision-recall curve (PRC) and the area under it. I found a nice R package, PRROC, that does both. According to the package description (page 5), the function pr.curve takes two score arguments: 1) the classification scores of the datapoints belonging to the positive class only, and 2) the classification scores of the datapoints belonging to the negative class only (see manual page 7). The example they provide is:
# create artificial scores as random numbers
x <- rnorm( 1000 );
y <- rnorm( 1000, -1 );
# compute PR curve
pr <- pr.curve( x, y, curve = TRUE );
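If the scores are not already separated by class, the two arguments can be built from a single score vector and a label vector. The following is only a minimal sketch: `scores` and `labels` are hypothetical names, the data are random, and the 0/1 label coding (1 = positive) is an assumption. The PRROC call is guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical input: one score per datapoint plus a 0/1 label (1 = positive).
set.seed(1)
labels <- c(rep(1, 1000), rep(0, 1000))
scores <- c(rnorm(1000), rnorm(1000, -1))

# Split the scores by class, since pr.curve expects them separately.
pos <- scores[labels == 1]  # scores of the positive class
neg <- scores[labels == 0]  # scores of the negative class

# Compute the PR curve only if PRROC is available.
if (requireNamespace("PRROC", quietly = TRUE)) {
  pr <- PRROC::pr.curve(scores.class0 = pos, scores.class1 = neg, curve = TRUE)
  print(pr$auc.integral)  # area under the PR curve
}
```

Here `pos` and `neg` play the roles of `x` and `y` in the package example.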
My problem is that I have 14000 datapoints in the positive class and 2560595 datapoints in the negative class. With data of this size the computation has been running for a day and I still have no results. To reproduce this, you can try a scaled-up version of the package example:
# load the package that provides pr.curve
library( PRROC );
# create artificial scores as random numbers
x <- rnorm( 14000 );
y <- rnorm( 2560595, -1 );
# compute PR curve
pr <- pr.curve( x, y, curve = TRUE );