Python Decision Tree GraphViz

Posted 2019-08-12 23:54

Question:

I'm trying to implement a decision tree with scikit-learn and then visualise it with Graphviz, which I understand is the standard choice for visualising decision trees. I'm using PyCharm, Anaconda, Python 2.7 and OS X El Capitan. As far as I can tell I've installed pydot and Graphviz with pip install, and I've also installed them directly in PyCharm, but whatever I do I keep getting 'No module named graphviz'.

from sklearn.datasets import load_iris
from sklearn import tree
#import graphviz as gv
# uncommenting the line above produces an error
clf = tree.DecisionTreeClassifier()
iris = load_iris()
clf = clf.fit(iris.data, iris.target)
# the with block closes the file on exit, so no explicit close() is needed
with open('graph.dot', 'w') as dot_file:
    tree.export_graphviz(clf, out_file=dot_file)

At the moment, running this code produces graph.dot, but I cannot view the file.

1. How do I get the graphviz repository to work?
2. How do I write the graph to PDF/PNG? I saw some examples, but none worked.
3. I found this command: dot -Tps filename.dot -o outfile.ps. Where do I use it? And how can I verify that a dot utility exists on my OS X?

Thanks in advance!

Answer 1:

I'm pretty sure I installed graphviz using Homebrew, but it looks like you can also download a binary from http://www.graphviz.org/Download_macos.php. If you can't get pydot to work, you'll need to run the dot command from the terminal, or in your script using subprocess:

import subprocess
# equivalent to running in a terminal: dot -Tpdf tree.dot -o tree.pdf
subprocess.call(['dot', '-Tpdf', 'tree.dot', '-o', 'tree.pdf'])
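
To check from Python whether a dot utility is installed before calling it, one option is to look it up on the PATH. This is a minimal sketch, assuming Python 3.3 or later (shutil.which does not exist in Python 2.7); the helper names find_dot, dot_command and render are illustrative and not part of Graphviz or scikit-learn.

```python
import shutil
import subprocess


def find_dot():
    """Return the full path of the Graphviz dot executable, or None if it is not on PATH."""
    return shutil.which("dot")


def dot_command(dot_file, out_file, fmt="pdf"):
    """Build the argument list for dot, e.g. dot -Tpdf tree.dot -o tree.pdf."""
    return ["dot", "-T" + fmt, dot_file, "-o", out_file]


def render(dot_file, out_file, fmt="pdf"):
    """Run dot if it is installed; return True on success, False otherwise."""
    if find_dot() is None:
        return False
    # subprocess.call returns the exit status of dot; 0 means success
    return subprocess.call(dot_command(dot_file, out_file, fmt)) == 0
```

Calling render('tree.dot', 'tree.pdf') would then return False rather than raise when dot is missing. In a terminal, which dot or dot -V gives the same information.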


Answer 2:

You can also use the following code to export to PDF.

First install pydot2:

pip install pydot2

Then you can use the following code:

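from sklearn.datasets import load_iris
from sklearn import tree
from sklearn.externals.six import StringIO
import pydot

# Train a decision tree on the iris data set
clf = tree.DecisionTreeClassifier()
iris = load_iris()
clf = clf.fit(iris.data, iris.target)

# Write the DOT description into an in-memory buffer instead of a file
dot_data = StringIO()
tree.export_graphviz(clf, out_file=dot_data)

# Parse the DOT text and render it to PDF (needs the Graphviz dot binary)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("graph.pdf")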
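
On more recent library versions this snippet may need adjusting. As far as I know, sklearn.externals.six was removed in later scikit-learn releases, export_graphviz returns the DOT source as a string when out_file=None, and newer pydot releases return a list from graph_from_dot_data. The following is a hedged sketch that tolerates both pydot behaviours; the function name tree_to_pdf is illustrative, and the Graphviz dot binary still has to be installed.

```python
def tree_to_pdf(clf, pdf_path):
    """Write a fitted scikit-learn decision tree to a PDF file via pydot."""
    # Imported lazily so that defining the function does not require the libraries
    from sklearn import tree
    import pydot

    # With out_file=None, export_graphviz returns the DOT source as a string
    dot_text = tree.export_graphviz(clf, out_file=None)

    # Older pydot returns a single graph; newer pydot returns a list of graphs
    graphs = pydot.graph_from_dot_data(dot_text)
    graph = graphs[0] if isinstance(graphs, (list, tuple)) else graphs

    # write_pdf calls the Graphviz dot program under the hood
    graph.write_pdf(pdf_path)
    return pdf_path
```

After fitting clf as in the code above, tree_to_pdf(clf, "graph.pdf") would replace the StringIO and write_pdf steps.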


Answer 3:

If you don't have or don't want graphviz on your system, you can also open the .dot files as text and copy the content to webgraphviz, which will then create and display the tree for you.

The result is not a picture or file that you can save, though, and you'd have to do this manually for every tree you create. For more complicated and/or batch tree building you'll need the actual graphviz on your system, so that you can call the dot program either from the terminal or directly from Python, as maxymoo described.
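
For the batch case, a small loop over the .dot files is usually enough. This is a minimal sketch, assuming all the files sit in one directory and dot is on the PATH; the helper names output_path and convert_all are hypothetical.

```python
import glob
import os
import subprocess


def output_path(dot_file, fmt="png"):
    """Map e.g. 'trees/a.dot' to 'trees/a.png'."""
    base, _ = os.path.splitext(dot_file)
    return base + "." + fmt


def convert_all(directory, fmt="png"):
    """Render every .dot file in directory; return the list of files written."""
    written = []
    for dot_file in sorted(glob.glob(os.path.join(directory, "*.dot"))):
        out_file = output_path(dot_file, fmt)
        # Same as running in a terminal: dot -Tpng a.dot -o a.png
        # (raises OSError if the dot program is not installed)
        if subprocess.call(["dot", "-T" + fmt, dot_file, "-o", out_file]) == 0:
            written.append(out_file)
    return written
```

For example, convert_all("trees", fmt="pdf") would write one PDF next to each .dot file in a trees directory.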