Calculating AUC when using Vowpal Wabbit

Posted 2019-01-18 13:51

Question:

Is there any way to compute AUC within Vowpal Wabbit?

One of the reasons I am using Vowpal Wabbit is the large size of the data file. I can calculate the AUC outside of Vowpal Wabbit from its output, but this might be problematic if the data file is large.

Answer 1:

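Currently, VW cannot report AUC. What is worse, it cannot optimize directly for AUC. AUC is a ranking metric defined over pairs of positive and negative examples, so it does not decompose into a per-example loss and optimizing it directly is not compatible with online learning. There are, however, some approximations of AUC that are suitable for optimizing.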

Concerning your question, you don't need to store the intermediate file with raw predictions on disk. You can pipe it directly to the external evaluation tool (perf in this case):

vw -d test.data -t -i model.vw -r /dev/stdout | perf -roc -files gold /dev/stdin
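If perf is not available, the same pipe can feed a small script instead. The following is a minimal Python sketch, not part of VW or perf: it computes AUC from the rank-sum (Mann-Whitney U) statistic, giving tied scores their average rank. It assumes the gold file has one label per line (positive if greater than 0, so both -1 and 0 count as negative) in the same order as the predictions, and that each prediction line starts with the raw score. The names auc and first_field and the script name auc.py are hypothetical.

```python
import sys


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) statistic."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(1 for _, y in pairs if y > 0)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined without both classes")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the group of tied scores pairs[i:j].
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        pos_in_group = sum(1 for k in range(i, j) if pairs[k][1] > 0)
        rank_sum_pos += avg_rank * pos_in_group
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def first_field(lines):
    """Yield the first whitespace-separated token of each non-empty line as a float."""
    for line in lines:
        tokens = line.split()
        if tokens:
            yield float(tokens[0])


if __name__ == "__main__" and len(sys.argv) == 2:
    # Assumed usage: vw -d test.data -t -i model.vw -r /dev/stdout | python auc.py gold
    with open(sys.argv[1]) as gold:
        gold_labels = list(first_field(gold))
    raw_scores = list(first_field(sys.stdin))
    print(f"AUC = {auc(gold_labels, raw_scores):.4f}")
```

The script keeps one score and one label per example in memory, which is usually far smaller than the feature data itself. For test sets too large even for that, a binned approximation would be needed, which this sketch does not attempt.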

Edit: John Langford confirmed that AUC can generally be optimized by changing the ratio of the false-positive to the false-negative loss. In VW, this means setting different importance weights for positive and negative examples. You need to tune the weight on a held-out set (or with cross-validation, or with progressive validation loss for one-pass learning).
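In VW's text input format the importance weight is an optional number placed right after the label. As an illustration, here is a minimal Python sketch that inserts such a weight into existing training lines. It assumes each line starts with the label followed by a space and does not already carry an importance weight. The function name add_importance and the weights in the usage note are hypothetical, and the weights are the values you would tune.

```python
def add_importance(line, pos_weight, neg_weight):
    """Insert an importance weight right after the label of a VW-format line.

    Assumes the line starts with the label followed by a space and has no
    importance weight yet; the rest of the line is kept verbatim.
    """
    label, sep, rest = line.partition(" ")
    weight = pos_weight if float(label) > 0 else neg_weight
    return f"{label} {weight}{sep}{rest}"
```

For example, add_importance("1 | a b", 5.0, 1.0) gives "1 5.0 | a b", while a negative line such as "-1 | c d" becomes "-1 1.0 | c d". Applying this to each line of the training data while it is piped into vw avoids writing a second copy of the file.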