Random Forests - Probability Estimates (+scikit-le

Posted 2020-05-27 17:37


冷血范

Question:

I am interested in understanding how probability estimates are calculated by random forests, both in general and specifically in Python's scikit-learn library (where probability estimates are returned by the predict_proba function).

Thanks, Guy

Answer 1:

The probabilities returned by a forest are the mean probabilities returned by the trees in the ensemble (docs). The probabilities returned by a single tree are the normalized class histograms of the leaf a sample lands in.
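
Both statements can be checked empirically. The following is a minimal sketch, assuming a synthetic dataset from make_classification and arbitrary hyperparameters (n_estimators=25, max_depth=3) chosen only for illustration; the variable names are made up for this example. It compares the forest's predict_proba with the mean of the per-tree predict_proba outputs, and one tree's output with the normalized class values stored in the leaf each sample lands in.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data and hyperparameters are assumptions for illustration only.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
forest = RandomForestClassifier(n_estimators=25, max_depth=3, random_state=0)
forest.fit(X, y)

# Forest level: predict_proba compared with the mean over the individual trees.
forest_proba = forest.predict_proba(X)
mean_tree_proba = np.mean(
    [tree.predict_proba(X) for tree in forest.estimators_], axis=0
)

# Tree level: look up the leaf each sample lands in and normalize the
# per-class values stored at that leaf so each row sums to 1.
tree = forest.estimators_[0]
leaf_ids = tree.apply(X)                        # leaf index for every sample
leaf_values = tree.tree_.value[leaf_ids, 0, :]  # shape (n_samples, n_classes)
leaf_hist = leaf_values / leaf_values.sum(axis=1, keepdims=True)

print(np.allclose(forest_proba, mean_tree_proba))
print(np.allclose(leaf_hist, tree.predict_proba(X)))
```

The trees are kept shallow (max_depth=3) so that leaves stay impure and the probabilities are not all 0 or 1; with fully grown trees each leaf is often pure and the averaging across trees is what produces intermediate values.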

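Answer 2:

In addition to what Andreas/Dougal said: in older scikit-learn releases, pass compute_importances=True when you train the RF. Recent releases no longer accept that parameter and always compute the importances. Then inspect classifier.feature_importances_ to see which features contribute most to the splits in the RF's trees; features used high up in the trees tend to score higher.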
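
As a rough sketch of that inspection, assuming a recent scikit-learn release (where no extra flag is needed) and a synthetic dataset used only for illustration, the importances can be read off the fitted classifier and ranked:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; feature indices stand in for real feature names.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, random_state=0
)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# One non-negative score per feature, normalized to sum to 1.
importances = clf.feature_importances_

# Rank features from most to least important.
ranking = importances.argsort()[::-1]
for idx in ranking:
    print(f"feature {idx}: {importances[idx]:.3f}")
```

These scores are impurity-based, so they describe how the forest was built rather than how any individual probability estimate was produced.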