Principal Component Analysis on Weka

Posted 2019-04-09 21:14

Question:

I have just run PCA on a training set, and Weka returned the new attributes along with how they were selected and computed. Now I want to build a model on this data and then apply the model to a test set.

Do you know if there is a way to automatically transform the test set so that it matches the new attributes?

Answer 1:

Do you need the principal components for analysis, or just to feed into the classifier? If it's only the latter, just use the Meta->FilteredClassifier classifier. Set the filter to PrincipalComponents and the classifier to whatever classifier you want to use. Train it on the untransformed training set and you'll be able to feed it the untransformed test set directly.

If you really need the transformed test set, I'd recommend using the Knowledge Flow tool to build something like this:

[Image from the original answer: Knowledge Flow layout, not preserved]

Answer 2:

To do this from the command line, see the batch filtering documentation: https://weka.wikispaces.com/Batch+filtering

Here is an example:

java weka.filters.supervised.attribute.AttributeSelection \
  -b -i train.arff -o train_pca.arff \
  -r test.arff -s test_pca_output.arff \
  -E "weka.attributeSelection.PrincipalComponents -R 0.95 -A 5" \
  -S "weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N -1"
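
The point of batch mode is that the projection is learned from the training file only and then reused, unchanged, on the test file. The sketch below illustrates that idea in plain Python. It is not Weka's implementation: the function names (fit_pca, transform) are made up for illustration, it only centers the data (no standardization), and it finds components with a simple power iteration.

```python
import math


def column_means(rows):
    """Mean of each column."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def center(rows, means):
    """Subtract the given means from every row."""
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(matrix, iterations=500):
    """Dominant eigenvector by power iteration (illustrative only).

    The start vector is arbitrary; the method fails if it happens to be
    orthogonal to the dominant eigenvector.
    """
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def fit_pca(train_rows, n_components):
    """Learn means and principal directions from the TRAINING data only."""
    means = column_means(train_rows)
    d = len(means)
    cov = covariance(center(train_rows, means))
    components = []
    for _ in range(n_components):
        v = top_eigenvector(cov)
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                         for i in range(d))
        components.append(v)
        # Deflate so the next iteration finds the next direction.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return means, components


def transform(rows, means, components):
    """Project rows using parameters learned earlier by fit_pca."""
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in center(rows, means)]


# Toy data: the training points lie on the line y = 2x.
train = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
test = [[5.0, 10.0]]

means, components = fit_pca(train, n_components=1)
train_pca = transform(train, means, components)
test_pca = transform(test, means, components)  # same means, same directions
print(test_pca)  # a single value of about 5.59
```

If the test set were centered with its own mean instead, the single test point above would project to 0, which is why the training and test files have to go through the same fitted filter.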