Is it possible to train a sklearn model (eg SVM) i

2020-03-07 11:11发布

It is not really necessary (let alone efficient) to go to the other extreme and train instance by instance; what you are looking for is actually called incremental or online learning, and it is available in scikit-learn's SGDClassifier for linear SVM and logistic regression, which indeed contains a partial_fit method.

Here is a quick example with dummy data:

import numpy as np
from sklearn import linear_model
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
Y = np.array([1, 1, 2, 2])
clf = linear_model.SGDClassifier(max_iter=1000, tol=1e-3)

clf.partial_fit(X, Y, classes=np.unique(Y))

X_new = np.array([[-1, -1], [2, 0], [0, 1], [1, 1]])
Y_new = np.array([1, 1, 2, 1])
clf.partial_fit(X_new, Y_new)

The default values for the loss and penalty arguments ('hinge' and 'l2' respectively) are these of a LinearSVC, so the above code essentially fits incrementally a linear SVM classifier with L2 regularization; these settings can of course be changed - check the docs for more details.

It is necessary to include the classes argument in the first call, which should contain all the existing classes in your problem (even though some of them might not be present in some of the partial fits); it can be omitted in subsequent calls of partial_fit - again, see the linked documentation for more details.