I am using the following geoadditive model:
library(gamair)
library(mgcv)
data(mack)
mack$log.net.area <- log(mack$net.area)
gm2 <- gam(egg.count ~ s(lon, lat, bs = "gp", k = 100, m = c(2, 10, 1)) +
             s(I(b.depth^.5)) +
             s(c.dist) +
             s(temp.20m) +
             offset(log.net.area),
           data = mack, family = tw, method = "REML")
How can I use it to predict the value of `egg.count` at new locations (lon/lat) where I don't have covariate data, as in kriging?
For example, say I want to predict `egg.count` at these new locations:
lon lat
1 -3.00 44
4 -2.75 44
7 -2.50 44
10 -2.25 44
13 -2.00 44
16 -1.75 44
But here I don't know the values of the covariates (`b.depth`, `c.dist`, `temp.20m`, `log.net.area`).
`predict` still requires all variables used in your model to be present in `newdata`, but you can pass arbitrary values, like `0`s, for the covariates you don't have, then use `type = "terms"` and `terms = name_of_the_wanted_smooth_term` to proceed. Check the labels of the smooth terms stored in the fitted model to find the right name.
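The approach just described can be sketched as follows. This is a minimal illustration, not necessarily the exact code the answer had in mind: the names `newdat`, `pred_terms`, `pred_link` and `pred_response` are introduced here for the example, and the placeholder value `0` is arbitrary.

```r
library(gamair)
library(mgcv)

# Refit the model from the question
data(mack)
mack$log.net.area <- log(mack$net.area)
gm2 <- gam(egg.count ~ s(lon, lat, bs = "gp", k = 100, m = c(2, 10, 1)) +
             s(I(b.depth^.5)) + s(c.dist) + s(temp.20m) +
             offset(log.net.area),
           data = mack, family = tw, method = "REML")

# Labels of the smooth terms held in the fitted model
sapply(gm2$smooth, "[[", "label")

# New locations from the question (lon from -3 to -1.75, lat fixed at 44)
newdat <- data.frame(lon = seq(-3, -1.75, by = 0.25), lat = 44)

# Placeholder values, present only to pass the variable-name check in predict.gam
newdat[c("b.depth", "c.dist", "temp.20m", "log.net.area")] <- 0

# Contribution of the spatial smooth alone, on the link scale (one-column matrix)
pred_terms <- predict(gm2, newdata = newdat, type = "terms", terms = "s(lon,lat)")

# Add the intercept (stored in the "constant" attribute) and simplify to a vector
pred_link <- attr(pred_terms, "constant") + rowSums(pred_terms)

# Map to the response scale with the family's inverse link
pred_response <- gm2$family$linkinv(pred_link)
pred_response
```

Note that the offset and the other smooths are left out of `pred_link`, so this is the intercept plus the spatial field only.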
I don't normally use `predict.gam` when I want predictions for a specific smooth term. The logic of `predict.gam` is to predict all terms first, which is the same as calling it with `type = "terms"`. Then:
if `type = "link"`, it does a `rowSums` over all term-wise predictions and adds the intercept (possibly with the offset);
if `type = "terms"` and neither `terms` nor `exclude` is specified, it returns the result as it is;
if `type = "terms"` and you have specified `terms` and/or `exclude`, some post-processing drops the terms you don't want and returns only those you asked for.
So `predict.gam` always does the computation for all terms, even if you want just a single term.
Knowing the inefficiency behind this, this is what I will do:
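One way to evaluate a single smooth directly, sketched here as an illustration, is to build its prediction matrix with `PredictMat` and multiply it by the matching block of coefficients. The names `newdat`, `sm`, `Xp` and `b` are introduced for this example; only `lon` and `lat` are needed in `newdat`, since no other term is evaluated.

```r
library(gamair)
library(mgcv)

# Refit the model from the question
data(mack)
mack$log.net.area <- log(mack$net.area)
gm2 <- gam(egg.count ~ s(lon, lat, bs = "gp", k = 100, m = c(2, 10, 1)) +
             s(I(b.depth^.5)) + s(c.dist) + s(temp.20m) +
             offset(log.net.area),
           data = mack, family = tw, method = "REML")

# New locations: no placeholder covariates required on this route
newdat <- data.frame(lon = seq(-3, -1.75, by = 0.25), lat = 44)

# Pick out the smooth object for s(lon,lat) by its label
labels <- sapply(gm2$smooth, "[[", "label")
sm <- gm2$smooth[[which(labels == "s(lon,lat)")]]

# Prediction matrix for this smooth at the new locations
Xp <- PredictMat(sm, newdat)

# Coefficients that belong to this smooth
b <- coef(gm2)[sm$first.para:sm$last.para]

# Link-scale prediction: smooth contribution plus the intercept
pred_link <- drop(Xp %*% b) + coef(gm2)[["(Intercept)"]]

# Response-scale prediction via the inverse link
pred_response <- gm2$family$linkinv(pred_link)
pred_response
```

This skips the evaluation of every other term, which is the point of avoiding `predict.gam` here.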
You see, we get the same result.
Some garbage prediction will be made at those garbage values, but `predict.gam` discards them in the end.

Code maintenance is, as far as I can tell, very difficult for a big package like `mgcv`. The code would need to change significantly to suit every user's need. Obviously the `predict.gam` logic described here is inefficient when people, like you, just want to predict a certain smooth. In theory, the variable-name check on `newdata` could then ignore the terms the user does not want. But that requires a significant change to `predict.gam` and could introduce many bugs. Furthermore, you have to submit a changelog to CRAN, and CRAN may not be happy to see such a drastic change.

Simon once shared his feelings: many people tell him he should write `mgcv` this way or that way, but he simply can't. So give some sympathy to a package author / maintainer like him.

The prediction will depend on the covariates if you provide values for `b.depth`, `c.dist`, `temp.20m` and `log.net.area`. But since you don't have them at the new locations, the prediction just assumes these effects to be `0`.

You are only predicting the spatial field / smooth. In the GAM approach the spatial field is modeled as part of the mean, not the variance-covariance (as in kriging), so I think your use of "residuals" is not correct here.
Correct. You can try `predict.gam` with or without `terms = "s(lon,lat)"` to help you digest the output. See how it changes when you vary the garbage values passed to the other covariates.
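A small experiment along these lines might look like the sketch below. The two placeholder settings (`0` and `1`) and the names `nd0`, `nd1`, `sp0`, `sp1` and so on are arbitrary choices made for this illustration.

```r
library(gamair)
library(mgcv)

# Refit the model from the question
data(mack)
mack$log.net.area <- log(mack$net.area)
gm2 <- gam(egg.count ~ s(lon, lat, bs = "gp", k = 100, m = c(2, 10, 1)) +
             s(I(b.depth^.5)) + s(c.dist) + s(temp.20m) +
             offset(log.net.area),
           data = mack, family = tw, method = "REML")

# Same locations, two different sets of placeholder covariate values
newdat <- data.frame(lon = seq(-3, -1.75, by = 0.25), lat = 44)
covs <- c("b.depth", "c.dist", "temp.20m", "log.net.area")
nd0 <- nd1 <- newdat
nd0[covs] <- 0
nd1[covs] <- 1

# Spatial term only: the placeholders should make no difference
sp0 <- predict(gm2, newdata = nd0, type = "terms", terms = "s(lon,lat)")
sp1 <- predict(gm2, newdata = nd1, type = "terms", terms = "s(lon,lat)")
all.equal(sp0, sp1)

# All terms: only the columns of the other smooths change with the placeholders
all0 <- predict(gm2, newdata = nd0, type = "terms")
all1 <- predict(gm2, newdata = nd1, type = "terms")
round(all0 - all1, 3)

# Full linear predictor: this one does depend on the placeholder values
link0 <- predict(gm2, newdata = nd0, type = "link")
link1 <- predict(gm2, newdata = nd1, type = "link")
cbind(link0, link1)
```

The `s(lon,lat)` column stays the same across both settings, while the full linear predictor moves with whatever values are supplied for the other covariates.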