Failing to do fitting with non linear fitting meth

2020-07-28 11:18发布

问题:

I have a nls fitting task that I wanted to do with R. My first attempt to do this here and as @Roland pointed out

"The point is that complex models are difficult to fit. The more so, the less the data supports the model until it become impossible. You might be able to fit this, if you had extremely good starting values."

I can agree with @Roland but if excel can do this fitting why not R cannot do?

Basically this fitting can be done with Excel's GRG Nonlinear solver but the process is very time consuming and sometimes fitting is not good. (since there are a lots of data in reality).

here is my sample data.frame. I would like to fit each set group with the model provided at below,

set.seed(12345)
    set =rep(rep(c("1","2","3","4"),each=21),times=1)
    time=rep(c(10,seq(100,900,100),seq(1000,10000,1000),20000),times=1)
    value <- replicate(1,c(replicate(4,sort(10^runif(21,-6,-3),decreasing=FALSE))))
    data_rep <- data.frame(time, value,set)

     > head(data_rep)
            #    time        value set
            #1     10 1.007882e-06   1
            #2    100 1.269423e-06   1
            #3    200 2.864973e-06   1
            #4    300 3.155843e-06   1
            #5    400 3.442633e-06   1
            #6    500 9.446831e-06   1
            *      *       *         *

attempt 1

I already posted a question to here trouble-when-adding-3rd-fitting-parameter-in-nls

Basically the problem is that I wanted to do fitting in grouped data and doing prediction based on the fitting coefficients.

I used nlsLM from library(minpack.lm) I got an error

Error in nlsModel(formula, mf, start, wts, upper) : singular gradient matrix at initial parameter estimates

which was at first glance maybe the model error or my starting values were not good in accordance to @Roland. On the other hand, I could fit this model with only two fitting parameters. The problem arises when I wanted to add third parameter to the fitting function.

attempt 2

In that post trouble-when-adding-3rd-fitting-parameter-in-nls by following @G. Grothendieck suggestion, I tried nlxb from nlmrt package and fixed one of the parameter d to d=32 and do the fitting as follows;

formula = value~Ps*(1-exp(-2*f*time*exp(-d)))*1/(sqrt(2*pi*sigma))*exp(-(d-d_ave)^2/(2*sigma))*d_step

d_step <- 1
f <- 1e9
d <- 32    
library(plyr)
library(nlmrt)
get.coefs <- function(data_rep) {
fit <-  nlxb(formula ,
              data = data_rep,
              start=c(d_ave=44,sigma=12,Ps=0.5),
              lower=c(d_ave=25,sigma=2,Ps=0.5),
              upper=c(d_ave=60,sigma=15,Ps=1),
              trace=TRUE)
}

fit <- dlply(data_rep, c("set"), .fun = get.coefs)   # Fit data grouped by "set"

#    > fit
#    $`1`
#    nlmrt class object: x 
#    residual sumsquares =  1.474e-07  on  21 observations
#        after  12    Jacobian and  13 function evaluations
#      name            coeff          SE       tstat      pval      #gradient    JSingval   
#    d_ave            42.0126            NA         NA         NA  #-7.082e-15    0.001733  
#    sigma            12.8377            NA         NA         NA   #2.408e-15   1.289e-19  
#    Ps              0.973223            NA         NA         NA    #9.33e-15    3.37e-20  
#    
#    $`2`
#    nlmrt class object: x 
#    residual sumsquares =  6.2664e-08  on  21 observations
#        after  12    Jacobian and  13 function evaluations
#      name            coeff          SE       tstat      pval      #gradient    JSingval   
#    d_ave             42.246            NA         NA         NA  #-7.269e-15    0.001428  
#    sigma            12.7429            NA         NA         NA   #2.568e-15   3.098e-19  
#    Ps              0.981517            NA         NA         NA   #9.211e-15   2.746e-20  
#    
#    $`3`
#    nlmrt class object: x 
#    residual sumsquares =  1.773e-07  on  21 observations
#        after  12    Jacobian and  13 function evaluations
#      name            coeff          SE       tstat      pval      #gradient    JSingval   
#    d_ave             41.968            NA         NA         NA  #-6.438e-15    0.001798  
#    sigma            12.8561            NA         NA         NA   #2.173e-15   2.414e-19  
#    Ps              0.972988            NA         NA         NA   #8.534e-15   5.922e-20  

#    $`4`
#    nlmrt class object: x 
#    residual sumsquares =  2.5219e-07  on  21 observations
#        after  12    Jacobian and  13 function evaluations
#      name            coeff          SE       tstat      pval      #gradient    JSingval   
#    d_ave            41.8532            NA         NA         NA  #-4.454e-15    0.001976  
#    sigma            12.9045            NA         NA         NA   #1.474e-15   3.443e-19  
#    Ps              0.974319            NA         NA         NA   #5.987e-15   3.124e-20  

#    attr(,"split_type")
#    [1] "data.frame"
#    attr(,"split_labels")
#      set
#    1   1
#    2   2
#    3   3
#    4   4

fitting coefficients are reasonable wolaa! But this time I realized that (@G. Grothendieck also pointed out later) it is impossible to predict new values after nlxb (why=? I don't know!)

predvals <- ldply(fit, .fun=predictvals, xvar="time", yvar="value",xrange=range(range)) # predict values 

::you can find predictvals function from here

Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "nlmrt"

There are NO! coef or predict methods for "nlmrt" class objects.

attempt 3

After following @G. Grothendieck another suggestion next I tried wrapnls from nlmrt.

because in this post he stated, can-we-make-prediction-with-nlxb-from-nlmrt-package

"because nlmrt package does provide wrapnls which will run nlmrt and then nls so that an "nls" object results and then that object can be used with all the "nls" class methods.

From the same nlmrt package still having trouble like at below

I give up to use plyr after my first post because loading plyr and dplyr is making my problem more complex. So I will stick with dplyr and use do function instead.

library(dplyr)
library(nlmrt)
formula = value~Ps*(1-exp(-2*f*time*exp(-d)))*1/(sqrt(2*pi*sigma))*exp(-(d-d_ave)^2/(2*sigma))*d_step

d_step <- 1
f <- 1e9
d <- 32

dffit = data_rep %>% group_by(set) %>%
  do(fit = wrapnls(formula ,
                data = .,
                start=c(d_ave=44,sigma=12,Ps=0.5),
                lower=c(d_ave=25,sigma=2,Ps=0.5),
                upper=c(d_ave=60,sigma=15,Ps=1),
                trace=TRUE))

Error in nlsModel(formula, mf, start, wts, upper) : singular gradient matrix at initial parameter estimates

I returned back to where I started with this error. I guess I tried everything I can do, looking for relevant examples (though only 3 ), read book and following the suggestions.

回答1:

Use nls2 from the nls2 package after nlxb like this (assuming data_rep, formula, d_step, f and d from the question). In order to make the example minimal we have eliminated dplyr and just show the computation for set == 2.

library(nlmrt)
library(nls2)

data_rep2 <- subset(data_rep, set == 2)

fit.nlxb <- nlxb(formula , data = data_rep2,
                start = c(d_ave = 44, sigma = 12, Ps = 0.5),
                lower = c(d_ave = 25, sigma = 2, Ps = 0.5),
                upper = c(d_ave = 60, sigma = 15, Ps = 1))

fit.nls <- nls2(formula, data = data_rep2, start = fit.nlxb$coefficients,
  algorithm = "brute-force")

identical(fit.nlxb$coefficients, coef(fit.nls))
## [1] TRUE

fit.nls is an "nls" class object with the same coefficients as fit.nlxb and we can use fitted() and predict() and all the other "nls" methods on it.