I am doing a project using libsvm and I am preparing my data to use the lib. How can I convert CSV file to LIBSVM compatible data?
CSV File: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/data/iris.csv
In the frequencies questions:
How to convert other data formats to LIBSVM format?
It depends on your data format. A simple way is to use libsvmwrite in the libsvm matlab/octave interface. Take a CSV (comma-separated values) file in UCI machine learning repository as an example. We download SPECTF.train. Labels are in the first column. The following steps produce a file in the libsvm format.
matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
matlab> labels = SPECTF(:, 1); % labels from the 1st column
matlab> features = SPECTF(:, 2:end);
matlab> features_sparse = sparse(features); % features must be in a sparse matrix
matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);
The tranformed data are stored in SPECTFlibsvm.train.
Alternatively, you can use convert.c to convert CSV format to libsvm format.
but I don't wanna use matlab, I use python.
I found this solution as well using JAVA
Can anyone recommend a way to tackle this problem ?
You can use csv2libsvm.py to convert
csv
tolibsvm data
where 4 means
target index
, andTrue
meanscsv
has a header.Finally, you can get
libsvm.data
asfrom
iris.csv
csv2libsvm.py does not work with Python3, and also it does not support label targets (string targets), I have slightly modified it. Now It should work with Python3 as well as label targets. I am very new to Python, so my code maybe not best practice, but I hope can help to someone.