Xgboost: what is the difference among bst.best_score, bst.best_iteration and bst.best_ntree_limit?

Posted on 2020-06-16 09:08

When I use xgboost to train my data for a two-class classification problem, I'd like to use early stopping to get the best model, but I'm confused about which field to pass to predict, since early stopping produces three different fields. For example, should I use

preds = model.predict(xgtest, ntree_limit=bst.best_iteration)

or should I use

preds = model.predict(xgtest, ntree_limit=bst.best_ntree_limit)

or are both correct but meant for different circumstances? If so, how can I judge which one to use?
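For reference, here is a minimal sketch of the kind of setup I mean (the data, parameter values, and variable names below are placeholders, not my real pipeline):

import numpy as np
import xgboost as xgb

# placeholder data for a two-class problem
X_train, y_train = np.random.rand(100, 10), np.random.randint(2, size=100)
X_valid, y_valid = np.random.rand(30, 10), np.random.randint(2, size=30)

dtrain = xgb.DMatrix(X_train, label=y_train)
dvalid = xgb.DMatrix(X_valid, label=y_valid)
xgtest = xgb.DMatrix(np.random.rand(7, 10))

params = {"objective": "binary:logistic", "eval_metric": "logloss"}

# train with early stopping against the validation set
bst = xgb.train(params, dtrain, num_boost_round=500,
                evals=[(dvalid, "valid")], early_stopping_rounds=10)

# the two candidate ways to predict after early stopping
preds_a = bst.predict(xgtest, ntree_limit=bst.best_iteration)
preds_b = bst.predict(xgtest, ntree_limit=bst.best_ntree_limit)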

Here is the relevant quotation from the XGBoost documentation, but it doesn't explain the reasoning, and I couldn't find a comparison between these parameters:

Early Stopping

If you have a validation set, you can use early stopping to find the optimal number of boosting rounds. Early stopping requires at least one set in evals. If there's more than one, it will use the last.

train(..., evals=evals, early_stopping_rounds=10)

The model will train until the validation score stops improving. Validation error needs to decrease at least every early_stopping_rounds to continue training.

If early stopping occurs, the model will have three additional fields: bst.best_score, bst.best_iteration and bst.best_ntree_limit. Note that train() will return a model from the last iteration, not the best one.

Prediction

A model that has been trained or loaded can perform predictions on data sets.

# 7 entities, each contains 10 features 
data = np.random.rand(7, 10) 
dtest = xgb.DMatrix(data) 
ypred = bst.predict(dtest)

If early stopping is enabled during training, you can get predictions from the best iteration with bst.best_ntree_limit:

ypred = bst.predict(dtest, ntree_limit=bst.best_ntree_limit)

Thanks in advance.
