LIBSVM: Wrong input format at line 1

Posted 2019-09-14 16:28

Question:

I am trying to use LIBSVM and I get the following behaviour:

root@bcfd88c873fa:/home/libsvm# ./svm-train myfile
Wrong input format at line 1
root@bcfd88c873fa:/home/libsvm# head -n 5 myfile
2   0:0.00000 8:0.00193 2:0.00000 1:0.00000 10:0.00722
3   6:0.00235 2:0.00000 0:0.00000 1:0.00000 5:0.00155
4   0:0.00000 1:0.00000 2:0.00000 4:0.00187
3   6:0.00121 8:0.00211 1:0.00000 2:0.00000 0:0.00000
3   0:0.00000 2:0.00000 1:0.00000

Can you see anything wrong with the format? It works with other SVM implementations, such as this one in Go.

Thanks,

Answer 1:

The provided format is correct in the sense that the Java interface of LIBSVM 3.22 processes the file as expected.

However, I also tried the Windows and Linux command-line tools, which fail as described in your question:

svm-train.exe myfile
Wrong input format at line 1

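After investigation, I found that the feature indices on each line have to be sorted in ascending order for the tool to process them. The Java interface does not enforce this restriction, which makes the behaviour look like a bug, although the LIBSVM README does state that indices must be in ascending order. With the indices sorted, the file is accepted: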

2   0:0.00000 1:0.00000 2:0.00000 8:0.00193 10:0.00722
3   0:0.00000 1:0.00000 2:0.00000 5:0.00155 6:0.00235
4   0:0.00000 1:0.00000 2:0.00000 4:0.00187
3   0:0.00000 1:0.00000 2:0.00000 6:0.00121 8:0.00211
3   0:0.00000 1:0.00000 2:0.00000
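If the file is too large to reorder by hand, the sorting can be scripted. The following is a minimal Python sketch, not part of LIBSVM; the function name `sort_features` is made up for illustration, and it assumes every feature token has the form `index:value` with an integer index.

```python
def sort_features(line: str) -> str:
    """Return a LIBSVM-format line with its index:value pairs in ascending index order."""
    parts = line.split()
    if not parts:
        return line  # leave blank lines untouched
    label, features = parts[0], parts[1:]
    # Sort numerically on the index before the colon; a plain text sort
    # would put "10" before "2".
    features.sort(key=lambda item: int(item.split(":", 1)[0]))
    return " ".join([label] + features)


# Sample lines taken from the question.
sample = [
    "2   0:0.00000 8:0.00193 2:0.00000 1:0.00000 10:0.00722",
    "3   6:0.00235 2:0.00000 0:0.00000 1:0.00000 5:0.00155",
]
for raw in sample:
    print(sort_features(raw))
```

To convert a whole file, the same function could be applied to each line read from `myfile` and the result written to a new file that is then passed to `svm-train`.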

Moreover, as LIBSVM uses a sparse data format, you can simplify your dataset by omitting the features whose value is zero:

2   8:0.00193 10:0.00722
3   5:0.00155 6:0.00235
4   4:0.00187
3   6:0.00121 8:0.00211
3
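Dropping the zero-valued features can be automated in the same way. This is again only a sketch under the same assumptions (tokens of the form `index:value`, with a value that parses as a float); `drop_zero_features` is a hypothetical helper name, not a LIBSVM function.

```python
def drop_zero_features(line: str) -> str:
    """Return a LIBSVM-format line without the features whose value is zero."""
    parts = line.split()
    if not parts:
        return line  # leave blank lines untouched
    label, features = parts[0], parts[1:]
    # Keep only the pairs whose value (the text after the colon) is non-zero.
    kept = [f for f in features if float(f.split(":", 1)[1]) != 0.0]
    return " ".join([label] + kept)


# Sample lines taken from the sorted data in the answer.
sample = [
    "2   0:0.00000 1:0.00000 2:0.00000 8:0.00193 10:0.00722",
    "3   0:0.00000 1:0.00000 2:0.00000",
]
for raw in sample:
    print(drop_zero_features(raw))
```

A line whose features are all zero is reduced to its label alone, as in the last row of the simplified dataset.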