How to get percentages from decision tree for each

Posted 2019-05-28 01:33

How could I create a table that includes the percentages for each node in the plot below?

library(rpart)
library(rattle)
library(rpart.plot)
library(RColorBrewer)

fit <- rpart(Species ~ ., data=iris, method="class")
fancyRpartPlot(fit)

It results in this plot:

[image: decision tree plot produced by fancyRpartPlot(fit)]

I would like to output a table with species as the first column and the associated percent at each node in a second column. A second iteration of the table would exclude the first node (100%) and also remove duplicates by retaining the row that contains a higher percentage.

After picking through the "rpart" documentation I'm still unable to figure out how to create this table. Please let me know what you think.

Thank you for your time.

Tags: r rpart
1 answer
一纸荒年 Trace。
Reply #2 · 2019-05-28 01:42

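The where element of the rpart object is not the predicted class. For each observation it records the row of fit$frame that corresponds to the terminal node (leaf) the observation falls into. You can cross-tabulate it against Species with: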

> iris$where <- fit$where
> with(iris, table(Species, where))
            where
Species       2  4  5
  setosa     50  0  0
  versicolor  0 49  1
  virginica   0  5 45
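
Note that the column labels 2, 4 and 5 are row positions in fit$frame, not the node numbers printed in the plot. A minimal sketch of relabelling them, assuming the row names of fit$frame are the node numbers (node_id and leaf_counts are names introduced here for illustration):

```r
library(rpart)

fit <- rpart(Species ~ ., data = iris, method = "class")

# fit$where indexes rows of fit$frame; the row names of fit$frame are
# the node numbers, so look those up for every observation.
node_id <- as.integer(rownames(fit$frame)[fit$where])

# Same cross-tabulation as above, but with columns labelled by node number.
leaf_counts <- table(Species = iris$Species, node = node_id)
print(leaf_counts)
```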

I'm guessing you want the column sums divided by the total counts?

> 100*colSums(with(iris, table(Species, where)) )/150
       2        4        5 
33.33333 36.00000 30.66667 
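
The calculation above only covers the terminal nodes. If the goal is the table described in the question (one row per node in the plot, including internal nodes), one possible approach is to read it from fit$frame directly. This is a sketch, not the only way to do it; it assumes the percentage in each plot box is the node's share of all observations, and node_table, no_root and best are names made up for this example:

```r
library(rpart)

fit <- rpart(Species ~ ., data = iris, method = "class")
frame <- fit$frame

# One row per node: node number, majority class, and share of observations.
node_table <- data.frame(
  node    = as.integer(rownames(frame)),
  species = attr(fit, "ylevels")[frame$yval],       # class index -> label
  percent = round(100 * frame$n / frame$n[1], 1),   # frame$n[1] is the root size
  leaf    = frame$var == "<leaf>"
)
print(node_table)

# Second iteration: drop the root node, then keep only the row with the
# highest percentage for each species.
no_root <- node_table[node_table$node != 1, ]
no_root <- no_root[order(no_root$species, -no_root$percent), ]
best <- no_root[!duplicated(no_root$species), c("species", "percent")]
print(best)
```

The rows flagged as leaves in node_table should agree with the 33.3, 36.0 and 30.7 percentages computed from where above.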