I built a Naive Bayes classifier with the Weka GUI and saved the model by following this tutorial. Now I want to load this model from Java code, but I cannot find a way to load a saved model using Weka.
My requirement is that the model is built separately and then used in a separate program.
If anyone can guide me in this regard, I would be thankful.
You can easily load a saved model in Java using this command:
Classifier myCls = (Classifier) weka.core.SerializationHelper.read(path+"mymodel.model");
For a complete workflow in Java I wrote the following article in SO Documentation, now copied here:
Text Classification in Weka
Text Classification with LibLinear
Create training instances from .arff file
Use StringToWordVector to transform your string attributes into a numeric representation:
Important features of this filter:
Apply the filter to trainingData:
trainingData = Filter.useFilter(trainingData, filter);
Create the LibLinear Classifier
Set setProbabilityEstimates(true) to print the output probabilities:
Classifier cls = null;
LibLINEAR liblinear = new LibLINEAR();
liblinear.setSVMType(new SelectedTag(0, LibLINEAR.TAGS_SVMTYPE));
liblinear.setProbabilityEstimates(true);
// liblinear.setBias(1); // default value
cls = liblinear;
cls.buildClassifier(trainingData);
Save model
System.out.println("Saving the model...");
ObjectOutputStream oos;
oos = new ObjectOutputStream(new FileOutputStream(path+"mymodel.model"));
oos.writeObject(cls);
oos.flush();
oos.close();
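The save step above and the later load step form a plain Java serialization round trip. The sketch below shows that round trip with try-with-resources, so the streams are closed even if writing or reading fails. To keep it runnable without Weka on the classpath, it uses a hypothetical `DummyModel` class in place of a trained classifier; `ModelIoSketch`, `save` and `load` are illustrative names as well, not Weka API. The assumption is that the real classifier object is serializable, which the code above already relies on.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a trained classifier; not part of Weka.
class DummyModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    DummyModel(String name) {
        this.name = name;
    }
}

class ModelIoSketch {
    // Serialize the object; the stream is closed even if writing fails.
    static void save(Serializable model, Path file) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(file))) {
            oos.writeObject(model);
        }
    }

    // Read the object back; the caller casts it to the expected type.
    static Object load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(Files.newInputStream(file))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("mymodel", ".model");
        save(new DummyModel("liblinear"), file);
        DummyModel restored = (DummyModel) load(file);
        System.out.println(restored.name);
        Files.delete(file);
    }
}
```

With Weka available, the same `save` method should accept the `cls` object directly, and the cast after loading would be to `Classifier` instead of `DummyModel`.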
Create testing instances from .arff file
Instances testingData = getDataFromFile(pathToArffFile);
Load classifier
Classifier myCls = (Classifier) weka.core.SerializationHelper.read(path+"mymodel.model");
Use the same StringToWordVector filter as above or create a new one for testingData, but remember to use the trainingData for this command:
filter.setInputFormat(trainingData);
This will make training and testing instances compatible. Alternatively, you could use InputMappedClassifier.
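To see why the filter's input format has to come from the training data, it can help to look at a toy bag-of-words vectorizer. The sketch below is not how StringToWordVector is implemented; it is a simplified illustration with hypothetical names (`VocabularySketch`, `buildVocabulary`, `vectorize`) and plain word counts. The point it shows is that the word-to-column mapping is fixed from the training texts, so test texts are mapped onto the same columns and words never seen in training are dropped.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class VocabularySketch {
    // Build the word -> column index mapping from the training texts only.
    static Map<String, Integer> buildVocabulary(List<String> trainingTexts) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (String text : trainingTexts) {
            for (String word : text.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    vocab.putIfAbsent(word, vocab.size());
                }
            }
        }
        return vocab;
    }

    // Count words using the fixed vocabulary; unseen words are ignored.
    static int[] vectorize(String text, Map<String, Integer> vocab) {
        int[] counts = new int[vocab.size()];
        for (String word : text.toLowerCase().split("\\s+")) {
            Integer column = vocab.get(word);
            if (column != null) {
                counts[column]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> vocab = buildVocabulary(Arrays.asList("good movie", "bad movie"));
        // "film" does not occur in the training texts, so it gets no column.
        int[] testVector = vectorize("good good film", vocab);
        System.out.println(Arrays.toString(testVector));
    }
}
```

If the vocabulary were rebuilt from the test texts instead, the columns would no longer line up with the ones the classifier was trained on.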
Apply the filter to testingData:
testingData = Filter.useFilter(testingData, filter);
Classify!
1. Get the class value for every instance in the testing set
for (int j = 0; j < testingData.numInstances(); j++) {
    double res = myCls.classifyInstance(testingData.get(j));
}
res is a double value that corresponds to the nominal class defined in the .arff file. To get the nominal class, use:
testingData.classAttribute().value((int) res)
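2. Get the probability distribution for every instance
for (int j = 0; j < testingData.numInstances(); j++) {
    double[] dist = myCls.distributionForInstance(testingData.get(j));
}
dist is a double array that contains the probabilities for every class defined in the .arff file.
Note: the classifier must support probability distributions, and they must be enabled with: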
myClassifier.setProbabilityEstimates(true);
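Since dist holds one probability per class, in the same order as the values of the class attribute, the most probable class is the index of its largest entry. A minimal sketch of that lookup follows; the `labels` and `dist` arrays are made-up example data and `DistributionSketch` is a hypothetical name, so it runs without Weka.

```java
class DistributionSketch {
    // Index of the largest probability; ties resolve to the lowest index.
    static int argmax(double[] dist) {
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] > dist[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] labels = {"spam", "ham", "other"}; // hypothetical class values
        double[] dist = {0.15, 0.70, 0.15};          // hypothetical probabilities
        int best = argmax(dist);
        System.out.printf("%s (p=%.2f)%n", labels[best], dist[best]);
    }
}
```

With real data, the index returned by `argmax` can be passed to testingData.classAttribute().value(...) in the same way as res.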