How to extract fitted values of GAM {mgcv} for eac

Posted 2019-03-22 09:08

I'm searching for a method to get the predicted (real, not standardized) values for every single variable in my model:

> m13 <- gam(LN_Brutto ~ s(agecont, by = Sex) + factor(Sex) + te(Month, Age) + s(Month, by = Sex), data = bears)

This is the summary of my model:

> summary(m13)

Family: gaussian 
Link function: identity 

Formula:
LN_Brutto ~ s(agecont, by = Sex) + factor(Sex) + te(Month, Age) + 
    s(Month, by = Sex)

Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   4.32057    0.01071  403.34   <2e-16 ***
factor(Sex)m  0.27708    0.01376   20.14   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                    edf  Ref.df      F  p-value    
s(agecont):Sexf  8.1611  8.7526 20.170  < 2e-16 ***
s(agecont):Sexm  6.6695  7.5523 32.689  < 2e-16 ***
te(Month,Age)   10.3651 12.7201  6.784 2.19e-12 ***
s(Month):Sexf    0.9701  0.9701  0.641    0.430    
s(Month):Sexm    1.3750  1.6855  0.193    0.787    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Rank: 60/62
R-sq.(adj) =  0.781   Deviance explained = 78.7%
GCV = 0.048221  Scale est. = 0.046918  n = 1093

The predicted values are produced by this code:

> pred_terms <- predict(m13, type = "terms")

And the result looks like this:

    factor(Sex) s(agecont):Sexf s(agecont):Sexm te(Month,Age)   s(Month):Sexf   s(Month):Sexm
1   0.2770806   0.000000000     0.111763696     -0.077845764    0.000000000     0.0007840912
2   0.2770806   0.000000000     0.240016156     -0.049143798    0.000000000     0.0007840912
3   0.2770806   0.000000000     0.034328752     0.046524454     0.000000000     -0.0058871897
4   0.0000000   -0.786533918    0.000000000     -0.067942427    0.021990192     0.0000000000
5   0.0000000   0.074434715     0.000000000     0.046524454     0.021990192     0.0000000000
6   0.0000000   0.161121563     0.000000000     0.089599601     0.021990192     0.0000000000
7   0.0000000   0.074434715     0.000000000     0.046524454     0.021990192     0.0000000000
8   0.2770806   0.000000000     -0.298597370    -0.007877328    0.000000000     -0.0058871897
...

But I guess these are just standardized predicted values and not the real values (the real ones should have no negative values!?).

So does anyone know what I have to modify in the code, to get the real values? Any idea? Thank you!

Tags: r gam mgcv
2 Answers
我欲成王,谁敢阻挡
#2 · 2019-03-22 09:21

Not quite sure if I follow you correctly, but predict(model, type = "terms") might be the solution you're looking for.

Update

I don't think these are standardised. Possibly some of the coefficients are just negative.

Consider the example from the help file ?mgcv::predict.gam:

library(mgcv)
n<-200
sig <- 2
dat <- gamSim(1,n=n,scale=sig)

b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat)

The results below show that these are in fact the contributions of each term to the fitted values: summing the contributions in each row and then adding the intercept and the offset reproduces the output of predict(b).

> head(predict(b))
        1         2         3         4         5         6 
 9.263322  2.822200  7.137201  4.902631 14.558401 11.889092 
> head(rowSums(predict(b, type = "terms")) + attr(predict(b, type = "terms"), "constant") + dat$x3)
        1         2         3         4         5         6 
 9.263322  2.822200  7.137201  4.902631 14.558401 11.889092 
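The same check can be run without an offset, and the term contributions can be attached to the data as extra columns. This is only a sketch on simulated data: the seed, the simplified formula and the names terms_mat, dat_terms and fit_from_terms are illustrative assumptions, not part of the question's model.

```r
library(mgcv)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
dat <- gamSim(1, n = 200, scale = 2, verbose = FALSE)  # simulated data

b <- gam(y ~ s(x0) + s(x1) + s(x2), data = dat)

terms_mat <- predict(b, type = "terms")    # one column per model term
intercept <- attr(terms_mat, "constant")   # the model intercept

# Each smooth is centred over the fitting data, so its contributions
# average to (numerically) zero and individual values can be negative
round(colMeans(terms_mat), 6)

# Attach the per-term contributions to the original data
dat_terms <- cbind(dat, terms_mat)

# Row sums of the contributions plus the intercept give the fitted values
fit_from_terms <- rowSums(terms_mat) + intercept
all.equal(unname(fit_from_terms), unname(fitted(b)))
```

Because every smooth is centred, a single column is only meaningful relative to the intercept; the fitted value on the response scale needs the full row sum plus the constant.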
Bombasti
#3 · 2019-03-22 09:24

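To return predicted values on the scale of the response, use predict(model, type = "response").

The default for predict.gam is type = "link", which returns the linear predictor (with standard errors if se.fit = TRUE). For a gaussian model with identity link, as in the question, the link and response scales coincide. The positive and negative values you found come from type = "terms": each smooth is centred around zero, so its contribution can be negative.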

Read more on the ?mgcv::predict.gam help page.
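The difference between the two scales only shows up with a non-identity link. A minimal sketch, assuming a Poisson model with log link fitted to simulated data (the seed, the data and the names bp, eta and mu are illustrative, not taken from the question):

```r
library(mgcv)

set.seed(2)  # arbitrary seed, only so the sketch is reproducible
dat <- gamSim(1, n = 200, dist = "poisson", scale = 0.1, verbose = FALSE)

# Poisson GAM; the default link for this family is the log
bp <- gam(y ~ s(x0) + s(x1) + s(x2), family = poisson, data = dat)

eta <- predict(bp, type = "link")      # linear predictor, can be negative
mu  <- predict(bp, type = "response")  # expected counts, always positive

# The response-scale prediction is the inverse link of the linear predictor
all.equal(unname(mu), unname(exp(eta)))

# Standard errors are returned only when requested
pr <- predict(bp, type = "link", se.fit = TRUE)
names(pr)
```

With se.fit = TRUE the result is a list with components fit and se.fit rather than a plain vector.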
