How to save multiple scikit-learn classifier models

Posted 2019-08-18 01:19

Question:

This question already has an answer here:

  • Saving and loading multiple objects in pickle file? 6 answers

Normally, pickle is used to save a single classifier model. Is there a way to save multiple classifier models in one pickle file? If so, how can the models be saved and retrieved later?

For instance, here is a minimal working example:

from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from numpy.random import rand, randint 

models = []
models.append(('LogisticReg', LogisticRegression(random_state=123)))
models.append(('DecisionTree', DecisionTreeClassifier(random_state=123)))
# evaluate each model in turn
results_all = []
names = []
dict_method_score = {}
scoring = 'f1'

X = rand(8, 4)
Y = randint(2, size=8)

print("Method: Average (Standard Deviation)\n")
for name, model in models:
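    # random_state only has an effect when shuffle=True; recent scikit-learn
    # versions raise an error if it is set while shuffle is left at False
    kfold = model_selection.KFold(n_splits=2, shuffle=True, random_state=999)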
    cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results_all.append(cv_results)
    names.append(name)
    dict_method_score[name] = (cv_results.mean(), cv_results.std())
    print("{:s}: {:.3f} ({:.3f})".format(name, cv_results.mean(), cv_results.std()))

Purpose: to change some hyperparameters (say, n_splits in the cross-validation) while keeping the same setup, and to retrieve the models later.

Answer 1:

You can dump multiple objects into the same pickle file, one after another:

import pickle

with open("models.pckl", "wb") as f:
    # each dump call appends one pickled object to the file
    for model in models:
        pickle.dump(model, f)

You can then load the models back into memory one at a time, until the end of the file is reached:

models = []
with open("models.pckl", "rb") as f:
    while True:
        try:
            models.append(pickle.load(f))
        except EOFError:
            break
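
As an alternative to the loop above, the whole collection can be pickled as a single object, so that one load call returns everything. The following is only a minimal sketch: the helper names save_models and load_models are hypothetical, and plain dictionaries stand in for the estimators so that the snippet runs without scikit-learn installed. Estimator objects are passed to pickle.dump in exactly the same way.

```python
import os
import pickle
import tempfile


def save_models(models, path):
    """Write the whole collection to `path` with a single pickle.dump call."""
    with open(path, "wb") as f:
        pickle.dump(models, f)


def load_models(path):
    """Read back the collection written by save_models."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Placeholder objects (assumption): in the question these would be
# LogisticRegression(...) and DecisionTreeClassifier(...) instances.
models = [
    ("LogisticReg", {"random_state": 123}),
    ("DecisionTree", {"random_state": 123}),
]

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "models.pckl")
    save_models(models, path)
    restored = load_models(path)

# Turn the restored (name, model) pairs into a dict for lookup by name.
restored_by_name = dict(restored)
print(sorted(restored_by_name))
```

Note that cross_val_score fits clones of the estimator it is given, so the objects in models stay unfitted unless fit is called on them before saving. Pickle files should also only be loaded from trusted sources, and ideally with the same scikit-learn version that produced them.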