-->

Data fitting by the method of maximum likelihood f

2019-08-25 09:42发布

问题:

I would like to know how one is able to fit any distribution to a given set of data using the method of MLEs. As a particular example, could anyone suggest a working code that would give the correct results for the MLEs for $\theta$ and $\beta$ when the generalised Lindley distribution described in https://rivista-statistica.unibo.it/article/viewFile/6836/7039 is applied to the data: 5.1, 6.3, 10.8, 12.1, 18.5, 19.7, 22.2, 23, 30.6, 37.3, 46.3, 53.9, 59.8, 66.2 on pg. 156? How can this then be used to fit the distribution over a histogram?

回答1:

I'm going to answer a now-deleted question of yours that is very similar, based on a problem in Parari et al. (data on air conditioning from Proschan (Table 7.2, results in Table 7.5)

In general you need to know reasonable starting values in order to be able to do general-purpose maximum-likelihood estimation (this is admittedly a chicken-and-egg problem, but there are lots of solutions [use the method of moments, pick reasonable values based on the problem, eyeball a graph to pick values, etc.]. In this case, since the authors gave you the answers, you can use those parameters to pick starting values of the right order of magnitude. More generally, you need to know something about reasonable orders of magnitude in order to get started.

Note that there are lots of fussy numerical issues with general maximum likelihood estimation: e.g. see chapter 7 here ...

Data (x) and likelihood function (dBEPL) are defined below. I am defining the density function and using the formula interface to mle2() ...

dd  <- data.frame(x)
library(bbmle)
## parameters from table
ovec <- list(alpha=.7945,beta=0.1509,omega=6.7278,
              a=0.2035,b=0.2303)
## starting vals -- right order of magnitude
svec <- list(alpha=0.8,beta=0.2,omega=10,a=0.2,b=0.2)
m1 <- mle2( x ~ dBEPL(alpha,beta,omega,a,b),
      data=dd,
      start=svec,
      control=list(parscale=unlist(svec)))
coef(m1)  ## NOT the same as reported, but close
par(las=1,bty="l")
hist(x,col="gray",freq=FALSE,breaks=40,ylim=c(0.,0.014))
with(as.list(coef(m1)),curve(dBEPL(x,alpha,beta,omega,a,b,log=FALSE),add=TRUE,
                             col=2,lwd=2,n=201))


x <- scan(textConnection("
194 413 90 74 55 23 97 50 359 50 130 487 57 102 
15 14 10 57 320 261 51 44 9 254 493 33 18 209 41 
58 60 48 56 87 11 102 12 5 14 14 29 37 186 29 104 
7 4 72 270 283 7 61 100 61 502 220 120 141 22 603 35 
98 54 100 11 181 65 49 12 239 14 18 39 3 12 5 32 9 438 
43 134 184 20 386 182 71 80 188 230 152 5 36 79 59 33 
246 1 79 3 27 201 84 27 156 21 16 88 130 14 118 44 15 42 
106 46 230 26 59 153 104 20 206 566 34 29 26 35 5 82 31 
118 326 12 54 36 34 18 25 120 31 22 18 216 139 67 310 
346 210 57 76 14 111 97 62 39 30 7 44 11 63 23 22 23 14 
18 13 34 16 18 130 90 163 208 124 70 16 101 52 
208 95 62 11 191 14 71"))


dBEPL <- function(x,alpha,beta,omega,a,b,log=TRUE) {
    r <- log(alpha*beta^2*omega/(beta(a,b)*(beta+1)))+
           log(1+x^alpha)+(alpha-1)*log( x)-beta* x^alpha+(omega*a-1) *
           log(1-(1+beta* x^alpha/(beta+1))*exp(-beta* x^alpha))+
           (b-1)*log(1-(1-(1+beta* x^alpha/(beta+1))*exp(-beta* x^alpha))^omega)
    if (log) return(r) else return(exp(r))
}


标签: r histogram mle