adabag boosting function throws error when giving

I have a strange issue: whenever I increase the mfinal argument of the boosting function in the adabag package beyond 10, I get an error. Even with mfinal=9 I get warnings.

My training data has a 7-class dependent variable, 100 independent variables, and around 22,000 samples (one class was oversampled with SMOTE using DMwR). The dependent variable is the last column of the training dataset.

library(adabag)
gc()
exp_recog_boo <- boosting(V1 ~ .,data=train_dataS,boos=TRUE,mfinal=9)

Error in 1:nrow(object$splits) : argument of length 0
In addition: Warning messages:
1: In acum + acum1 :
longer object length is not a multiple of shorter object length

Thanks in advance.

Tags: r adaboost

6 answers

淡お忘

Post #2 · 2019-05-06 19:55

This worked for me:

modelADA <- boosting(lettr ~ ., data = trainAll, boos = TRUE, mfinal = 10, control = rpart.control(minsplit = 0))

Essentially I just told rpart to require a minimum split size of zero when generating a tree, and that eliminated the error. I haven't tested this extensively, so I can't guarantee it's a valid solution (what does a tree with a zero-length leaf actually mean?), but it does prevent the error from being thrown.


三岁会撩人

Post #3 · 2019-05-06 20:02

I also ran into this same problem recently, and this example R script solves it completely!

The main idea is that you need to set the control for rpart (which adabag uses for creating trees, see rpart.control) appropriately, so that at least one split is attempted in every tree.

I'm not totally sure, but it appears that your "argument of length 0" may be the result of an empty tree. That can happen because the "complexity" parameter (cp) has a default setting that tells the function not to attempt a split if the decrease in homogeneity/lack of fit is below a certain threshold.
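A minimal sketch of that failure mode, using rpart alone on hypothetical toy data (the data frame toy and the variable names are illustrative, not from the question). Here the root-only tree is forced by having fewer rows than rpart's default minsplit of 20 rather than by cp, simply because that is easy to reproduce deterministically; the cp threshold can leave a tree in the same state on real data.

```r
library(rpart)

# Hypothetical toy data: 10 rows, fewer than rpart's default minsplit of 20,
# so no split is attempted and the fitted tree is only the root node.
toy <- data.frame(
  y = factor(rep(c("a", "b"), each = 5)),
  x = 1:10
)

stump <- rpart(y ~ x, data = toy, method = "class")
nrow(stump$frame)      # number of nodes in the tree
is.null(stump$splits)  # a root-only tree carries no splits matrix

# Iterating over the rows of the missing splits matrix fails in the same
# way as the error reported in the question.
msg <- tryCatch(
  {
    1:nrow(stump$splits)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Relaxing the control settings lets the same data produce a real split.
# xval = 0 only skips cross-validation, which is not needed for this check.
grown <- rpart(y ~ x, data = toy, method = "class",
               control = rpart.control(minsplit = 2, cp = 0, xval = 0))
nrow(grown$frame)
```

If this matches what happens inside boosting, any single resample that yields a root-only tree would be enough to trigger the error, which would explain why it shows up more often as mfinal grows.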


做个烂人

Post #4 · 2019-05-06 20:05

I think I found the problem.

Ignore this: if you configure your control with cp = 0, this won't happen. I think that if the first node of a tree makes no improvement (or at least none better than cp), the tree stays at 0 nodes, so you have an empty tree, and that makes the algorithm fail.

EDIT: The problem is that rpart generates trees with only one leaf (node), and the boosting method uses the statement "k <- varImp(arboles[[m]], surrogates = FALSE, competes = FALSE)". When arboles[[m]] is a tree with only one node, that statement gives you the error.

To solve that, you can modify the boosting method:

Run fix(boosting) and add the lines marked with **.

if (boos == TRUE) { 
**   k <- 1
**   while (k == 1){
     boostrap <- sample(1:n, replace = TRUE, prob = pesos)
     fit <- rpart(formula, data = data[boostrap, -1],
         control = control)
**   k <- length(fit$frame$var)
**   }
     flearn <- predict(fit, newdata = data[, -1], type = "class")
     ind <- as.numeric(vardep != flearn)
     err <- sum(pesos * ind)
 }

This will prevent the algorithm from accepting one-leaf trees, but you have to set cp to 0 in the control parameter to avoid an endless loop.
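The same resample-until-split idea can be sketched outside of adabag with rpart alone. The helper name fit_nontrivial_tree and the max_tries cap are assumptions added for illustration; they are not part of adabag, and the cap is only one possible guard against the endless loop mentioned above.

```r
library(rpart)

# Hypothetical helper: draw weighted bootstrap samples until rpart returns
# a tree with at least one split, giving up after max_tries attempts.
fit_nontrivial_tree <- function(formula, data, weights, control,
                                max_tries = 50) {
  n <- nrow(data)
  for (i in seq_len(max_tries)) {
    idx <- sample(seq_len(n), replace = TRUE, prob = weights)
    fit <- rpart(formula, data = data[idx, , drop = FALSE],
                 control = control)
    # fit$frame has one row per node, so more than one row means a split.
    if (nrow(fit$frame) > 1) {
      return(fit)
    }
  }
  stop("no tree with at least one split after ", max_tries, " resamples")
}

# Usage on a built-in dataset with uniform observation weights.
set.seed(1)
fit <- fit_nontrivial_tree(Species ~ ., data = iris,
                           weights = rep(1 / nrow(iris), nrow(iris)),
                           control = rpart.control(cp = 0))
nrow(fit$frame)
```

Stopping with an error after a fixed number of attempts trades the risk of an endless loop for an explicit failure, which may be easier to diagnose on data where no resample can be split.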


啃猪蹄的小仙女

Post #5 · 2019-05-06 20:05

Use str() to see the attributes of your data frame. In my case, I just converted my class variable to a factor, and then everything ran.
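One reason the column type can matter, sketched with rpart alone on hypothetical data (the data frame d and its columns are illustrative): when no method is given, rpart fits a regression tree for a numeric response and a classification tree for a factor.

```r
library(rpart)

# Hypothetical data: class labels stored as numbers.
d <- data.frame(
  cls = rep(c(1, 2, 3), each = 20),
  x   = seq_len(60)
)
str(d$cls)  # shows a numeric vector, not a factor

num_fit <- rpart(cls ~ x, data = d)
num_fit$method  # method rpart chose for the numeric response

# Convert the response to a factor and fit again.
d$cls <- as.factor(d$cls)
cls_fit <- rpart(cls ~ x, data = d)
cls_fit$method  # method rpart chose for the factor response
```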


SAY GOODBYE

Post #6 · 2019-05-06 20:10

Just ran into the same problem, and setting the complexity parameter to -1 or minimum split to 0 both work for me with rpart.control, e.g.

library(adabag)

r1 <- boosting(Y ~ ., data = data, boos = TRUE, 
               mfinal = 10,  control = rpart.control(cp = -1))

r2 <- boosting(Y ~ ., data = data, boos = TRUE, 
               mfinal = 10,  control = rpart.control(minsplit = 0))


smile是对你的礼貌

Post #7 · 2019-05-06 20:12

My mistake was that I hadn't set the target as a factor beforehand.

Try this:

train$target <- as.factor(train$target)

and check by doing:

str(train$target)

