mgcv: How to return estimated smoothing parameter?

2019-07-31 16:24发布

问题:

Consider the simple GAM fit as below:

library(mgcv)
my.gam <- gam(y~s(x), data=mydata)
  1. Is there anyway to return the estimated smoothing parameter (lambda) so that I can save it? I know that lambda is given in output as 'GCV score', but I need a specific code for returning it.
  2. How can I set lambda to a desired value?

回答1:

summary() does not return smoothing parameters. You have mixed up GCV score with smoothing parameter. Consult a local statistician if you don't understand those concepts, or raise a question on Cross Validated. I will only show you how to extract and set smoothing parameters.

Consider an example:

library(mgcv)
set.seed(2)
dat <- gamSim(1, n=400, dist="normal", scale=2)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data=dat)

You can get internal smoothing parameters from:

b$sp
#       s(x0)        s(x1)        s(x2)        s(x3) 
#3.648590e+00 3.850127e+00 1.252710e-02 4.986399e+10 

But these are not lambda. They differ from lambda by some positive scaling factors. It is usually sufficient to use sp for smoothing parameters. If you want to set it to a fixed value, do for example:

b1 <- gam(y ~ s(x0, sp = 0) + s(x1, sp = 0) + s(x2, sp = 0) + s(x3, sp = 0),
          data = dat)

This essentially disables penalization for all smooth terms. Note that setting sp to a negative value implies auto-selection of sp.


lambda and sp:

sapply(b$smooth, "[[", "S.scale") / b$sp
#       s(x0)        s(x1)        s(x2)        s(x3) 
#6.545005e+00 5.326938e+00 1.490702e+03 4.097379e-10 

Sometimes getting lambda is necessary. When considering smooth functions as random effects or random fields, there is

variance_parameter_of_random_effect = scale_parameter / lambda

where the scale parameter can be found in b$scale (for Gaussian model this is also b$sig2). See a related question: GAM with "gp" smoother: how to retrieve the variogram parameters?


Follow-up

Yes, I need the exact value of lambda, so thanks for the neat code. Yet I'm interested in knowing more about the scaling factor. Where can I read more about it in addition to the package manual?

Have a read on ?smoothCon:

smoothCon(object,data,knots=NULL,absorb.cons=FALSE,
          scale.penalty=TRUE,n=nrow(data),dataX=NULL,
          null.space.penalty=FALSE,sparse.cons=0,
          diagonal.penalty=FALSE,apply.by=TRUE,modCon=0)

scale.penalty: should the penalty coefficient matrix be scaled to have
          approximately the same 'size' as the inner product of the
          terms model matrix with itself? ...

In the source code of smoothCon, there is:

if (scale.penalty && length(sm$S) > 0 && is.null(sm$no.rescale)) {
    maXX <- norm(sm$X, type = "I")^2
    for (i in 1:length(sm$S)) {
        maS <- norm(sm$S[[i]])/maXX
        sm$S[[i]] <- sm$S[[i]]/maS
        sm$S.scale[i] <- maS
    }
}

Briefly speaking, for a model matrix X and raw penalty matrix S, scaling factor maS is:

norm(S) / norm(X, type = "I")^2


标签: r gam mgcv