I am trying to do multiple classification problem with Vowpal Wabbit.
I have a train file that look like this:
1 |feature_space
2 |feature_space
3 |feature_space
As an output I want to get probabilities of test item belonging to each class, like this:
1: 0.13 2:0.57 3:0.30
think of sklearn classifiers predict_proba methods, for example.
I've tried the following:
1) vw -oaa 3 train.file -f model.file --loss_function logistic --link logistic vw -p predict.file -t test.file -i model.file -raw_predictions = pred.txt
but the pred.txt file is empty (contains no records, but is created). Predict.file contains only the final class, and no probabilities.
2) vw - csoaa3 train.file -f model.file --link logistic I've modified the input files accordingly to fit the cs format. csoaa doesn't accept loss_function logistic with following error message: "You are using a label not -1 or 1 with a loss function expecting that!"
If used with default square loss function, and similar output command, I get pred.txt with raw predictions for each class per item, for example:
2.33 1.67 0.55
I believe it's the resulting square distance.
Is there a way to get VW to output class probabilites or somehow convert these distances into probabilities?
There was a bug in VW version 7.9.0 and fixed in 7.10.0 resulting in the empty raw predictions file.
Since November 2015, the easiest way how to obtain probabilities is to use
--oaa=N --loss_function=logistic --probabilities -p probs.txt
. (Or if you need label-dependent features:--csoaa_ldf=mc --loss_function=logistic --probabilities -p probs.txt
.)