Finding contribution by each feature into making p

Posted 2019-08-20 06:39

I am trying to explain the decisions made by an H2O GBM model. Based on the idea in https://medium.com/applied-data-science/new-r-package-the-xgboost-explainer-51dd7d1aa211, I want to calculate the contribution of each feature to a particular decision at test time. Is it possible to get each individual tree from the ensemble, along with the log-odds at every node? I will also need the path the model traverses through each tree when making the prediction.

1 answer
叛逆
Post #2 · 2019-08-20 07:04

H2O doesn't have an equivalent of the xgboostExplainer package. However, there is a way to get something close.

1) If you want to know which decision path was taken for a single row/observation, you can use h2o.predict_leaf_node_assignment(model, frame). It returns an H2OFrame of leaf node assignments, showing the path taken through each tree, that looks something like the following (in this case you can see that 5 trees were built):

[Image: leaf node assignment output for a model with 5 trees]

2) You can visualize individual trees using H2O's MOJO, which you can download once you've built your GBM or XGBoost model. The result will look something like the following:

[Image: visualization of a single tree from the MOJO]

3) In an upcoming release you will be able to get the prediction value for each leaf node of a GBM (the pull request for this is here).
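To make step 1 concrete, here is a minimal sketch of how a leaf node assignment can be read as a walk through one tree. It assumes the assignments come back as strings of "L" and "R" steps from the root (in Python the corresponding call should be model.predict_leaf_node_assignment(frame)). The nested-dict tree, its feature names, and the follow_path helper are made up for illustration; H2O does not hand you trees in this form.

```python
# Hypothetical toy tree: each internal node stores the feature it splits on
# and a node value (e.g. log-odds); leaves store only a value.
toy_tree = {
    "feature": "age", "value": 0.0,
    "L": {
        "feature": "income", "value": -0.4,
        "L": {"value": -0.9},
        "R": {"value": 0.1},
    },
    "R": {"value": 0.6},
}


def follow_path(tree, path):
    """Return the nodes visited, root first, for a path string such as 'LR'."""
    node = tree
    visited = [node]
    for step in path:  # each step is 'L' (left child) or 'R' (right child)
        node = node[step]
        visited.append(node)
    return visited


nodes = follow_path(toy_tree, "LR")
split_features = [n["feature"] for n in nodes[:-1]]  # features used on the way down
leaf_value = nodes[-1]["value"]                      # value at the leaf reached
print(split_features, leaf_value)
```

With real models you would read the tree structure from the MOJO visualization instead of a dict, but the idea of replaying the path string from the root is the same.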

Putting all these steps together should get you pretty close to the values you want, so you can add them up to get each feature's impact. (For a Python Jupyter notebook with examples of how to generate the leaf node assignments and visualize a tree, look here.)
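As a rough sketch of the "add them up" part, in the spirit of the xgboostExplainer idea from the question: if you have a value (say log-odds) at every node along the path a row takes, you can credit each split's feature with the change in value from the node to the child that was taken, and credit the root value to an intercept. The path data and feature names below are invented, and feature_contributions is a hypothetical helper, not part of H2O.

```python
from collections import defaultdict


def feature_contributions(paths):
    """Sum per-feature changes in node value along each tree's decision path.

    Each path is a list of (split_feature, node_value) pairs from root to
    leaf; the leaf has no split, so its feature is None.
    """
    contrib = defaultdict(float)
    for path in paths:
        contrib["intercept"] += path[0][1]  # root value of this tree
        for (feature, value), (_, next_value) in zip(path, path[1:]):
            # Credit the feature split on at this node with the value change
            contrib[feature] += next_value - value
    return dict(contrib)


# Made-up paths for one row through two trees
paths = [
    [("age", 0.0), ("income", -0.4), (None, 0.1)],
    [("income", 0.05), (None, 0.3)],
]

contrib = feature_contributions(paths)
total = sum(contrib.values())               # equals the sum of the leaf values
leaf_sum = sum(path[-1][1] for path in paths)
print(contrib, total, leaf_sum)
```

The contributions plus the intercept add back up to the sum of the leaf values, which is a handy sanity check on whatever node values you manage to extract.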
