plot one of 500 trees in randomForest package

Posted 2019-07-25 12:08

Question:

How can I plot individual trees from the output of the randomForest function in the R package of the same name? For example, I use the iris data and want to plot the first of the 500 trees in the output. My code is:

library(randomForest)
model <- randomForest(Species ~ ., data = iris, ntree = 500)

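Answer 1:

You can use the getTree() function in the randomForest package (reference manual: https://cran.r-project.org/web/packages/randomForest/randomForest.pdf). It returns the structure of a single tree as a table, one row per node, rather than drawing it.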

On the iris dataset:

library(randomForest)
data(iris)

## fit a small forest, then look at the k-th tree in it (k must not exceed ntree)
k <- 10
rf <- randomForest(iris[, -5], iris[, 5], ntree = 10)
getTree(rf, k, labelVar = TRUE)
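
As a rough sketch of how the returned table can be read for the model in the question, the snippet below extracts the first of the 500 trees and separates its terminal nodes. The seed and the variable names (tree1, leaves, n_leaves) are assumptions added for illustration; the column names and the status code of -1 for terminal nodes follow the getTree() documentation.

```r
library(randomForest)

set.seed(42)  # assumed seed, only so the run is repeatable
model <- randomForest(Species ~ ., data = iris, ntree = 500)

# First tree as a data frame: one row per node, with the columns
# "left daughter", "right daughter", "split var", "split point",
# "status" and "prediction"
tree1 <- getTree(model, k = 1, labelVar = TRUE)
print(head(tree1))

# status == -1 marks terminal nodes; their "prediction" is the class label
leaves <- tree1[tree1$status == -1, ]
n_leaves <- nrow(leaves)
print(table(leaves$prediction))
```

Internal nodes have NA in the prediction column, and terminal nodes have NA in the split var column, so the two kinds of rows are easy to tell apart.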


Answer 2:

You can use cforest to plot individual trees, as shown below. I have hardcoded the number of trees to 5; change it to suit your requirements.

ntree <- 5
library("party")
cf <- cforest(Species ~ ., data = iris, controls = cforest_control(ntree = ntree))

## write each tree of the forest to its own PDF file
for (i in 1:ntree) {
  ## prettytree is accessed via ::: in case it is not exported by party
  pt <- party:::prettytree(cf@ensemble[[i]], names(cf@data@get("input")))
  nt <- new("Random Forest BinaryTree")
  nt@tree <- pt
  nt@data <- cf@data
  nt@responses <- cf@responses

  pdf(file = paste0("filex", i, ".pdf"))
  plot(nt, type = "simple")
  dev.off()
}

cforest is another implementation of random forests. It cannot be said which one is better in general, but there are a few differences between them. cforest builds its trees with conditional inference and puts more weight on the terminal nodes when combining them, whereas the randomForest implementation gives equal weight to the terminal nodes.

In general, cforest uses a weighted mean and randomForest uses a plain average. You may want to check this.
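
To make the weighted-mean versus plain-average distinction concrete, here is a toy calculation in base R. The numbers are made up for illustration and neither package is called, so this is only a sketch of the two ways of combining terminal-node estimates, not of either package's internals.

```r
# Hypothetical terminal nodes that one new observation falls into, one per tree:
# p = share of class "setosa" in the node, n = number of training rows in the node
p <- c(0.9, 0.5, 0.2)
n <- c(10, 30, 60)

# Plain average: every tree's node counts equally
plain_avg <- mean(p)

# Weighted mean: nodes holding more training rows contribute more
weighted_avg <- weighted.mean(p, w = n)

print(c(plain = plain_avg, weighted = weighted_avg))
```

With these numbers the two estimates differ noticeably, because the largest node has the lowest class share and pulls the weighted mean down.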