predict continuous values using sklearn bagging cl

2019-09-12 04:42发布

问题:

Can I use sklearn's BaggingClassifier to produce continuous predictions? Is there a similar package? My understanding is that the bagging classifier predicts several classifications with different models, then reports the majority answer. It seems like this algorithm could be used to generate probability functions for each classification then reporting the mean value.

trees = BaggingClassifier(ExtraTreesClassifier())
trees.fit(X_train,Y_train)
Y_pred = trees.predict(X_test)

回答1:

If you're interested in predicting probabilities for the classes in your classifier, you can use the predict_proba method, which gives you a probability for each class. It's a one-line change to your code:

trees = BaggingClassifier(ExtraTreesClassifier())
trees.fit(X_train,Y_train)
Y_pred = trees.predict_proba(X_test)

The shape of Y_pred will be [n_samples, n_classes].

If your Y_train values are continuous and you want to predict those continuous values (i.e., you're working on a regression problem), then you can use the BaggingRegressor instead.



回答2:

I typically use BaggingRegressor() for continuous values, and then compare performance with RMSE. example below:

from sklearn.ensemble import BaggingReressor
trees = BaggingRegressor()
trees.fit(X_train,Y_train)
scores_RMSE = math.sqrt(metrics.mean_squared_error(Y_test, trees.predict(X_test))