I have trained a RandomForestClassifier with Python's scikit-learn on a very large dataset. How can I save this model so that other people can apply it on their end? Thank you!
Answer 1:
The recommended method is to use joblib, which handles the large NumPy arrays inside a fitted model efficiently and, with compression enabled, can produce a much smaller file than a plain pickle:
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

# Save the fitted classifier to disk
joblib.dump(clf, 'filename.pkl')

# Your colleagues can then load it
clf = joblib.load('filename.pkl')
See the scikit-learn online docs on model persistence.
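To make the file-size point concrete, here is a minimal sketch that saves the same forest with and without joblib's `compress` option and compares the results. The synthetic data from `make_classification`, the small forest, and the file names are assumptions chosen for illustration, not part of the original question.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real training data (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Hypothetical file names inside a temporary directory.
tmp_dir = tempfile.mkdtemp()
raw_path = os.path.join(tmp_dir, "model_raw.joblib")
small_path = os.path.join(tmp_dir, "model_compressed.joblib")

joblib.dump(clf, raw_path)                # default: no compression
joblib.dump(clf, small_path, compress=3)  # zlib compression, level 3

raw_size = os.path.getsize(raw_path)
small_size = os.path.getsize(small_path)
print("uncompressed bytes:", raw_size)
print("compressed bytes:  ", small_size)

# The compressed file loads exactly like the uncompressed one.
restored = joblib.load(small_path)
same_predictions = bool((restored.predict(X) == clf.predict(X)).all())
print("predictions identical:", same_predictions)
```

Higher `compress` levels trade slower saving and loading for a smaller file, so a moderate level is often a reasonable starting point.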
Answer 2:
Have you tried pickling the RandomForestClassifier using the pickle module and then saving it to disk?
Here’s an example based on the pickle docs:
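import pickle
from sklearn.ensemble import RandomForestClassifier

# Pass your own parameters here, then train with classifier.fit(X, y)
classifier = RandomForestClassifier()

# Write the model in binary mode; the with-block closes the file automatically
with open('classifier.pkl', 'wb') as output:
    pickle.dump(classifier, output)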
The “other people” could then reload the pickled object as follows:
import pickle

# Read the model back in binary mode; the with-block closes the file automatically
with open('classifier.pkl', 'rb') as f:
    classifier = pickle.load(f)
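Since the file is meant for other people, it can help to record which scikit-learn version produced it, because pickled estimators are not guaranteed to load correctly under a different version. The sketch below is one possible way to do that, not an official scikit-learn mechanism: the `bundle` dictionary, its keys, the synthetic data, and the file name are all hypothetical.

```python
import os
import pickle
import tempfile

import sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real training data (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
classifier = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Sender side: store the model together with the library version used to train it.
bundle = {"model": classifier, "sklearn_version": sklearn.__version__}
path = os.path.join(tempfile.mkdtemp(), "classifier_bundle.pkl")
with open(path, "wb") as output:
    pickle.dump(bundle, output)

# Recipient side: load the bundle and compare versions before using the model.
with open(path, "rb") as f:
    loaded = pickle.load(f)

if loaded["sklearn_version"] != sklearn.__version__:
    print("Warning: model was saved with scikit-learn", loaded["sklearn_version"])

restored = loaded["model"]
predictions_match = bool((restored.predict(X) == classifier.predict(X)).all())
print("predictions identical:", predictions_match)
```

Recipients should only unpickle files from a source they trust, because loading a pickle can execute arbitrary code.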