Can I extract the underlying decision rules (or 'decision paths') from a trained decision tree as a textual list?
Something like:
if A>0.4 then if B<0.2 then if C>0.8 then class='X'
Thanks for your help.
This is the code you need
I have modified the top-liked code so that it indents correctly in a Jupyter notebook under Python 3.
I believe that this answer is more correct than the other answers here:
This prints out a valid Python function. Here's an example output for a tree that is trying to return its input, a number between 0 and 10.
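For readers who want something to experiment with, here is a minimal sketch of this kind of generator. It is not the answerer's original code. To keep it runnable without a fitted model, it works on plain lists that stand in for sklearn's `tree_.children_left`, `tree_.children_right`, `tree_.feature` and `tree_.threshold` arrays, and it assumes `value` holds one class label per node (in a real sklearn tree, `tree_.value` holds per-class counts instead). The function name `tree_to_code` and the toy arrays are made up for illustration.

```python
TREE_LEAF = -1  # sklearn stores -1 in the child arrays when a node has no child


def tree_to_code(children_left, children_right, feature, threshold, value,
                 feature_names):
    """Return the source of a Python function that mirrors the tree."""
    lines = ["def tree({}):".format(", ".join(feature_names))]

    def recurse(node, depth):
        indent = "    " * depth
        if children_left[node] != TREE_LEAF:
            # Decision node: emit an if/else on the split feature.
            name = feature_names[feature[node]]
            lines.append("{}if {} <= {}:".format(indent, name, threshold[node]))
            recurse(children_left[node], depth + 1)
            lines.append("{}else:  # {} > {}".format(indent, name, threshold[node]))
            recurse(children_right[node], depth + 1)
        else:
            # Leaf node: emit the stored label (an assumption, see above).
            lines.append("{}return {!r}".format(indent, value[node]))

    recurse(0, 1)
    return "\n".join(lines)


# Hand-written stand-in for a small tree: node 0 splits on A, node 2 on B.
children_left = [1, -1, 3, -1, -1]
children_right = [2, -1, 4, -1, -1]
feature = [0, -2, 1, -2, -2]
threshold = [0.4, -2.0, 0.2, -2.0, -2.0]
value = [None, "Y", None, "X", "Z"]
feature_names = ["A", "B"]

source = tree_to_code(children_left, children_right, feature, threshold,
                      value, feature_names)
print(source)
```

With a fitted sklearn estimator you would presumably pass the corresponding `clf.tree_` arrays and turn each leaf's `tree_.value` row into a label yourself (for example by taking the index of the largest count).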
Here are some stumbling blocks that I see in other answers:
- Using `tree_.threshold == -2` to decide whether a node is a leaf isn't a good idea. What if it's a real decision node with a threshold of -2? Instead, you should look at `tree.feature` or `tree.children_*`.
- The line `features = [feature_names[i] for i in tree_.feature]` crashes with my version of sklearn, because some values of `tree.tree_.feature` are -2 (specifically for leaf nodes).

Modified Zelazny7's code to fetch SQL from the decision tree.
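As a rough idea of what fetching SQL from a tree can look like, here is a sketch that renders the splits as a nested `CASE` expression. It is not the code from this answer. It assumes plain lists shaped like sklearn's `tree_.children_left`, `tree_.children_right`, `tree_.feature` and `tree_.threshold`, one class label per node in `value`, and column names equal to the feature names. `tree_to_sql` and the toy arrays are hypothetical.

```python
def tree_to_sql(children_left, children_right, feature, threshold, value,
                feature_names):
    """Render the tree as a nested SQL CASE expression."""

    def recurse(node, depth):
        pad = "  " * depth
        if children_left[node] == -1:
            # Leaf: emit the label as a SQL string literal.
            return "{}'{}'".format(pad, value[node])
        name = feature_names[feature[node]]
        return "\n".join([
            "{}CASE WHEN {} <= {} THEN".format(pad, name, threshold[node]),
            recurse(children_left[node], depth + 1),
            "{}ELSE".format(pad),
            recurse(children_right[node], depth + 1),
            "{}END".format(pad),
        ])

    return recurse(0, 0)


# Hand-written stand-in for a small tree: node 0 splits on A, node 2 on B.
children_left = [1, -1, 3, -1, -1]
children_right = [2, -1, 4, -1, -1]
feature = [0, -2, 1, -2, -2]
threshold = [0.4, -2.0, 0.2, -2.0, -2.0]
value = [None, "Y", None, "X", "Z"]
feature_names = ["A", "B"]

sql = tree_to_sql(children_left, children_right, feature, threshold,
                  value, feature_names)
print("SELECT\n" + sql + "\nAS predicted_class FROM my_table")
```

The table name `my_table` and the alias `predicted_class` are placeholders. Labels containing quotes would need escaping before being embedded like this.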
Just use the function from sklearn.tree like this
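A minimal sketch of such a call, assuming scikit-learn is installed. The iris data, `max_depth=2` and the variable names are only illustrative choices, not taken from this answer.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Fit a small tree on a bundled dataset so there is something to export.
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# With out_file=None, export_graphviz returns the DOT source as a string.
dot_source = export_graphviz(clf, out_file=None,
                             feature_names=iris.feature_names)

# Write it to tree.dot in the current working directory.
with open("tree.dot", "w") as fh:
    fh.write(dot_source)
```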
Then look in your project folder for the file tree.dot, copy all of its content, paste it at http://www.webgraphviz.com/ and generate your graph :)
The code below is my approach, using Anaconda Python 2.7 plus a package named "pydot-ng", to make a PDF file with the decision rules. I hope it is helpful.
(Image: the resulting tree graph.)
I created my own function to extract the rules from the decision trees created by sklearn:
This function first starts with the leaf nodes (identified by -1 in the child arrays) and then recursively finds their parents. I call this a node's 'lineage'. Along the way, I grab the values I need to create if/then/else SAS logic:
The sets of tuples below contain everything I need to create SAS if/then/else statements. I do not like using `do` blocks in SAS, which is why I create logic describing a node's entire path. The single integer after the tuples is the ID of the terminal node in a path. All of the preceding tuples combine to create that node.
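The lineage idea can be sketched roughly as follows. This is not the original function: it walks up from each leaf with a loop and a parent lookup instead of recursion, and it works on plain lists that stand in for sklearn's `tree_.children_left`, `tree_.children_right`, `tree_.feature` and `tree_.threshold`. The tuple layout `(parent_id, side, threshold, feature_name)` and the name `get_lineages` are assumptions made for illustration.

```python
def get_lineages(children_left, children_right, feature, threshold,
                 feature_names):
    """For every leaf, list the splits on the path from the root to it."""
    # Map each child node to its parent and the side it hangs on.
    parent_of = {}
    for node, (left, right) in enumerate(zip(children_left, children_right)):
        if left != -1:
            parent_of[left] = (node, "l")
            parent_of[right] = (node, "r")

    # Leaves are the nodes whose child entries are -1.
    leaves = [n for n, left in enumerate(children_left) if left == -1]

    lineages = []
    for leaf in leaves:
        path = []
        node = leaf
        while node in parent_of:  # the root has no parent, so the walk stops there
            parent, side = parent_of[node]
            path.append((parent, side, threshold[parent],
                         feature_names[feature[parent]]))
            node = parent
        path.reverse()  # root first, leaf last
        lineages.append((path, leaf))
    return lineages


# Hand-written stand-in for a small tree: node 0 splits on A, node 2 on B.
children_left = [1, -1, 3, -1, -1]
children_right = [2, -1, 4, -1, -1]
feature = [0, -2, 1, -2, -2]
threshold = [0.4, -2.0, 0.2, -2.0, -2.0]
feature_names = ["A", "B"]

lineages = get_lineages(children_left, children_right, feature, threshold,
                        feature_names)
for path, leaf in lineages:
    for step in path:
        print(step)
    print(leaf)
```

In this layout, side `"l"` means the `feature <= threshold` branch and `"r"` the `feature > threshold` branch, so each path could be turned into one flat if/then condition per terminal node.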