Sklearn SGDC partial_fit ValueError: classes shoul

Posted 2019-08-16 08:20

Question:

I loaded an already trained SGDClassifier model and tried to call partial_fit on it again with a new feature set and new data, but received ValueError: classes should include all valid labels that can be in y. My class_weight is None because I want every class to have equal weight.

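import joblib

# Load the previously trained model (f is the saved model file)
model_predicted_networktype = joblib.load(f)
# Vectorize the new data with the existing count vectorizer
new_training_data_count_matrix = count_vect_predicted_networktype.transform(training_dataset)
new_training_tf_idf = tf_idf(new_training_data_count_matrix)
# This call raises the ValueError
model_predicted_networktype.partial_fit(new_training_tf_idf, training_labels)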

I understand the issue arises because I am adding new features to my already trained model, and they differ from what was previously fitted. But how can I add new features to a model that has already been trained with partial_fit?

Answer 1:

Pass classes=numpy.arange(some_estimated_max_number) in your first call to partial_fit, and map those numbers to the actual labels. This way you can add data with new labels on the fly.
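
A minimal sketch of this idea, under stated assumptions: MAX_CLASSES, label_to_id, encode, and the label names ("wifi", "lte", "5g") are hypothetical and only illustrate the mapping; the random arrays stand in for the TF-IDF matrices from the question. Note that the number of feature columns stays the same across calls, since partial_fit expects the same feature count every time.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

MAX_CLASSES = 10  # assumed upper bound on the number of distinct labels

label_to_id = {}  # hypothetical mapping: actual label -> reserved integer id


def encode(labels):
    """Map each label to an integer id, reserving a new id on first sight."""
    ids = []
    for label in labels:
        if label not in label_to_id:
            if len(label_to_id) >= MAX_CLASSES:
                raise ValueError("more labels than reserved class ids")
            label_to_id[label] = len(label_to_id)
        ids.append(label_to_id[label])
    return np.array(ids)


rng = np.random.RandomState(0)
model = SGDClassifier(random_state=0)

# First call: declare every class id that may ever appear.
X1 = rng.rand(20, 5)
y1 = encode(["wifi", "lte"] * 10)
model.partial_fit(X1, y1, classes=np.arange(MAX_CLASSES))

# Later call: a label that was not in the first batch ("5g") is accepted,
# because its integer id was already declared in the first call.
X2 = rng.rand(20, 5)
y2 = encode(["5g", "wifi"] * 10)
model.partial_fit(X2, y2)

# Translate predicted ids back to the actual labels.
id_to_label = {i: label for label, i in label_to_id.items()}
predicted = [id_to_label.get(i, "<unused>") for i in model.predict(X2)]
```

The mapping has to be saved together with the model (for example in the same joblib file), otherwise the integer ids cannot be translated back to labels after reloading.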