
rpart - Find number of leaves that a cp value to p

Posted 2019-03-04 17:41

Question:

I have a requirement where I need to group each of my categorical variables that has more than 5 category values into 5 groups, based on its association with my continuous variable. To achieve this I am using rpart with the "anova" method.

For example, my categorical variable is type, with codes 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15, and I want to have 5 groups of this variable. After growing the tree, I need to prune it in order to end up with only 5 groups. One approach I tried is to use nsplit from the cptable, but an nsplit of 5 might give me 7-8 leaves, and similarly an nsplit of 4 might give me 5-6 leaves.

I am looking for an option so that, when I prune, I get exactly 5 leaves, which would act as my 5 groups.

Can someone please suggest how I can achieve this using rpart?

Thank you!
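To make the setup concrete, here is a minimal sketch of the pruning approach described above, not a definitive answer. It relies on the fact that rpart grows binary trees, so a tree with k splits has k + 1 leaves, and 5 leaves corresponds to a cptable row with nsplit equal to 4. The cptable can skip nsplit values, so that row may not exist; in that case the sketch falls back to the largest available nsplit below 4. The data frame `dat`, the response `y`, the simulated level means, and the helper `count_leaves` are all hypothetical and only there for illustration.

```r
library(rpart)

# Hypothetical data: a 15-level factor `type` and a continuous response `y`.
set.seed(42)
n <- 3000
type <- factor(sample(1:15, n, replace = TRUE))
level_means <- seq(10, 150, by = 10)            # assumed: one mean per level
y <- level_means[as.integer(type)] + rnorm(n, sd = 2)
dat <- data.frame(type = type, y = y)

# Grow a deliberately large tree so there is something to prune.
fit <- rpart(y ~ type, data = dat, method = "anova",
             control = rpart.control(cp = 0, minsplit = 2,
                                     minbucket = 1, xval = 0))

# Hypothetical helper: leaves are the frame rows marked "<leaf>".
count_leaves <- function(tree) sum(tree$frame$var == "<leaf>")

target_leaves <- 5
cpt <- fit$cptable

# Rows whose subtree has at most target_leaves leaves (nsplit + 1 leaves).
ok <- which(cpt[, "nsplit"] <= target_leaves - 1)
row <- ok[which.max(cpt[ok, "nsplit"])]         # closest row from below

# Prune back to the subtree described by that cptable row.
pruned <- prune(fit, cp = cpt[row, "CP"])
cat("leaves after pruning:", count_leaves(pruned), "\n")

# Map each level of `type` to its leaf, then relabel leaves as groups 1..k.
# With a single predictor, every level falls into exactly one leaf.
level_to_leaf <- tapply(pruned$where, dat$type, function(w) w[1])
level_to_group <- match(level_to_leaf, sort(unique(level_to_leaf)))
names(level_to_group) <- names(level_to_leaf)
print(level_to_group)
```

If the fallback yields fewer than 5 leaves, that means the cost-complexity sequence for this tree has no 5-leaf subtree, and no cp value passed to `prune` will produce one.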