`warm_start` Parameter And Its Impact On Computati

Posted 2019-05-06 09:53

I have a logistic regression model with a defined set of parameters (warm_start=True).

As usual, I call LogisticRegression.fit(X_train, y_train) and then use the fitted model to predict new outcomes.

Suppose I change some parameters, say C=100, and call the .fit method again on the same training data.

In theory, I would expect the second .fit call to take less computational time than it would for a model with warm_start=False. Empirically, however, that is not what I observe.

Please help me understand the concept of the warm_start parameter.

P.S.: I have also tried SGDClassifier() as an experiment.

Tags: scikit-learn logistic-regression gradient-descent hyperparameters

1 answer

Deceive 欺骗

Reply #2 · 2019-05-06 10:48

I assume you understand the basic idea: with warm_start=True, the solution of the previous fit is used as the initialization for the next fit.

The documentation states that the warm_start parameter is useless with the liblinear solver, as there is no working warm-start implementation for it. In addition, liblinear was the default solver for LogisticRegression in scikit-learn versions before 0.22 (the default is lbfgs from 0.22 onwards), which means that with those defaults the weights are completely reinitialized before each new fit.
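One way to check this yourself is to fit the same model twice and look at n_iter_. The snippet below is only a sketch: the synthetic dataset from make_classification and the helper name iterations_for_two_fits are my own assumptions, not part of the question. With lbfgs the second fit starts at the previous solution and should need fewer iterations, while with liblinear the second fit repeats the same work.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data, assumed here only for illustration
X, y = make_classification(n_samples=500, n_features=20, random_state=0)


def iterations_for_two_fits(solver):
    """Fit the same warm-started model twice; return n_iter_ of each fit."""
    clf = LogisticRegression(
        solver=solver, warm_start=True, max_iter=1000, random_state=0
    )
    clf.fit(X, y)
    first = int(clf.n_iter_[0])
    clf.fit(X, y)  # same data, same hyperparameters
    second = int(clf.n_iter_[0])
    return first, second


lbfgs_first, lbfgs_second = iterations_for_two_fits("lbfgs")
liblinear_first, liblinear_second = iterations_for_two_fits("liblinear")

print("lbfgs:    ", lbfgs_first, "->", lbfgs_second)
print("liblinear:", liblinear_first, "->", liblinear_second)
```

Comparing n_iter_ is usually more reliable than timing the calls, because wall-clock measurements on small data are dominated by noise and fixed overhead.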

To make use of the warm_start parameter and reduce the computational time, you should use one of the following solvers for your LogisticRegression:

- newton-cg or lbfgs, which support the L2-norm penalty and are usually the better choice for multiclass problems;
- sag or saga, which converge faster than liblinear on larger datasets and use the multinomial loss during descent.
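For the scenario in the question (refitting after changing C), a rough comparison could look like the sketch below. The dataset, the list C_values and the helper iterations_along_path are assumptions made for illustration. The sketch only counts solver iterations; whether the warm-started path is cheaper overall depends on the data and on how far apart the C values are, so treat the printed numbers as something to inspect rather than a guarantee.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data and an arbitrary grid of C values, assumed for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
C_values = [0.01, 0.1, 1, 10, 100]


def iterations_along_path(warm_start):
    """Refit one estimator for each C and record n_iter_ of every fit."""
    clf = LogisticRegression(solver="lbfgs", warm_start=warm_start, max_iter=1000)
    counts = []
    for C in C_values:
        clf.set_params(C=C)  # change the hyperparameter on the same object
        clf.fit(X, y)        # warm_start=True reuses coef_ from the previous fit
        counts.append(int(clf.n_iter_[0]))
    return counts


cold_counts = iterations_along_path(warm_start=False)
warm_counts = iterations_along_path(warm_start=True)

print("cold start:", cold_counts, "total:", sum(cold_counts))
print("warm start:", warm_counts, "total:", sum(warm_counts))
```

The first entry is the same in both runs, since the very first fit has no previous solution to start from. On small datasets the difference in wall-clock time may be hard to notice even when the iteration counts drop.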

Simple example

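from sklearn.linear_model import LogisticRegression

X = [[1, 2, 3], [4, 5, 6], [1, 2, 3]]
y = [1, 0, 1]

# sag supports warm_start, so each new fit starts from the previous coefficients
clf = LogisticRegression(solver='sag', warm_start=True)

clf.fit(X, y)

# change a hyperparameter and refit; the first solution is the starting point
clf.set_params(C=100)
clf.fit(X, y)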

I hope that helps.
