Can't get partykit package's mob function

2019-09-03 11:48发布

I can't get the partykit package's mob function to do univariate MLE fit.

# Trying to convert vignette example here https://cran.r-project.org/web/packages/partykit/vignettes/mob.pdf on page 7 to do univariate MLE gamma fits.  
data("PimaIndiansDiabetes", package = "mlbench")    
library("partykit")     
library("fitdistrplus")    


# Generating some fake data to replace the example data.
op <- options(digits = 3)
set.seed(123)    
x <- rgamma(nrow(PimaIndiansDiabetes), shape = 5, rate = 0.1)
PimaIndiansDiabetes$diabetes<-x
PimaIndiansDiabetes$glucose<-x 

#Hopefully this change to the formula means fit a gamma to just the diabetes vector of values!
pid_formula <- diabetes  ~ 1  | pregnant + pressure + triceps + insulin + mass + pedigree + age    

#Defining my own, negative of log likelihood since mob will minimize it.    
estfun<-function(z) {-logLik(z)} 

#replacing the call to glm that is successful in the vignette example.    
class(fitdistr) <- append(class(fitdistr),estfun)              
logit <- function(y, x, start = NULL, weights = NULL, offset = NULL, ...) {
         fitdistr(y, "gamma") 
                  }

#fail! The mob() function still does not see my artificially created estfun().   

pid_tree <- mob(pid_formula, data = PimaIndiansDiabetes, fit = logit) 

Error in UseMethod("estfun") : no applicable method for 'estfun' applied to an object of class "fitdistr" The above error message does not appear when glm is used instead of fitdistr

# estfun runs OK outside of call to mob! 
estfun(logit(PimaIndiansDiabetes$diabetes,PimaIndiansDiabetes$glucose)) 

标签: r oop party
1条回答
Evening l夕情丶
2楼-- · 2019-09-03 11:57

In principle, it is feasible to use mob() for what you want to do but there is a misunderstanding of what the estfun() method is supposed to do and how it is being called.

mob() needs the following pieces of information from a model object to carry out the construction of the tree:

  • The estimated parameters, typically extracted by coef(object).
  • The optimized objective function, typically extracted by logLik(object).
  • The estimating functions aka scores aka gradient contributions, typically extracted by estfun(object). See vignette("sandwich-OOP", package = "sandwich") for an introduction.

For objects of class "fitdistr" the former two are available but the latter is not:

methods(class = "fitdistr")
## [1] coef   logLik print  vcov  
## see '?methods' for accessing help and source code

Hence:

f <- fitdistr(x, "gamma")
coef(f)
## shape  rate 
## 5.022 0.103 
logLik(f)
## 'log Lik.' -3404 (df=2)
sandwich::estfun(f)
## Error in UseMethod("estfun") : 
##   no applicable method for 'estfun' applied to an object of class "fitdistr"

The estfun() function you have defined does not work for the following two reasons: (1) It is not a method estfun.fitdistr() that could be called by the generic function sandwich::estfun() that is used through the package's NAMESPACE. (2) It does not compute the right quantity: it's the log-likelihood but we need the derivative of the log-density with respect to both parameters and evaluated at each observation. The latter would be an n x 2 matrix.

I think it shouldn't be too hard to compute the score function of the gamma distribution by hand. But this should also be available in some R package already, possibly gamlss.dist or also other packages.

查看更多
登录 后发表回答