r - loess prediction returns NA

Posted 2020-07-06 02:46

Question:

I am struggling with "out-of-sample" prediction using loess. I get NA values for new x that are outside the original sample. Can I get these predictions?

x <- c(24,36,48,60,84,120,180)
y <- c(3.94,4.03,4.29,4.30,4.63,4.86,5.02)
lo <- loess(y~x)
x.all <- seq(3,200,3)
predict(object = lo,newdata = x.all)

I need to model the full yield curve, i.e. interest rates for different maturities.

Answer 1:

From the manual page of predict.loess:

When the fit was made using surface = "interpolate" (the default), predict.loess will not extrapolate – so points outside an axis-aligned hypercube enclosing the original data will have missing (NA) predictions and standard errors

If you change the surface parameter to "direct" you can extrapolate values.

For instance, this will work (on a side note: after plotting the prediction, my feeling is that you should increase the span parameter in the loess call a little bit):

lo <- loess(y~x, control=loess.control(surface="direct"))
predict(lo, newdata=x.all)
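
To make the difference concrete, here is a minimal sketch that fits the question's data both ways and checks where the NA values fall. The names lo.interp, lo.direct, lo.wide and outside are only illustrative, and span = 1 is an arbitrary value chosen to show how a wider span would be passed, not a recommendation.

```r
x <- c(24, 36, 48, 60, 84, 120, 180)
y <- c(3.94, 4.03, 4.29, 4.30, 4.63, 4.86, 5.02)
x.all <- seq(3, 200, 3)

# Default fit: surface = "interpolate", which does not extrapolate
lo.interp <- loess(y ~ x)
p.interp <- predict(lo.interp, newdata = x.all)

# Direct fit: the local regression is computed at each requested point
lo.direct <- loess(y ~ x, control = loess.control(surface = "direct"))
p.direct <- predict(lo.direct, newdata = x.all)

# Same direct fit with a wider span (span = 1 is an assumed, illustrative value)
lo.wide <- loess(y ~ x, span = 1, control = loess.control(surface = "direct"))
p.wide <- predict(lo.wide, newdata = x.all)

# Points of x.all that lie outside the range of the original x
outside <- x.all < min(x) | x.all > max(x)

# Are the default predictions missing outside the data range?
print(all(is.na(p.interp[outside])))
# Does the direct fit still contain any missing predictions?
print(anyNA(p.direct))
```

Plotting p.direct and p.wide against x.all is a quick way to judge whether the wider span gives a more plausible curve for your data.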


Answer 2:

In addition to nico's answer: I would suggest fitting a GAM (which uses penalized regression splines) instead. However, extrapolation is not advisable unless your model is based on science.

x <- c(24,36,48,60,84,120,180)
y <- c(3.94,4.03,4.29,4.30,4.63,4.86,5.02)
lo <- loess(y~x, control=loess.control(surface = "direct"))
# Define the prediction grid first instead of assigning it inside plot()
x.all <- seq(3,200,3)
plot(x.all,
     predict(object = lo,newdata = x.all),
     type="l", col="blue")
points(x, y)

library(mgcv)
fit <- gam(y ~ s(x, bs="cr", k=7, fx =FALSE), data = data.frame(x, y))
summary(fit)

lines(x.all, predict(fit, newdata = data.frame(x = x.all)), col="green")
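
One way to see why extrapolation deserves caution is to ask the GAM for standard errors along with the predictions. The sketch below refits the same model and compares the average standard error inside and outside the observed range of x. The names pr, lower, upper and outside are illustrative, and the factor of 2 for the band is a rough approximation, not an exact confidence level.

```r
library(mgcv)

x <- c(24, 36, 48, 60, 84, 120, 180)
y <- c(3.94, 4.03, 4.29, 4.30, 4.63, 4.86, 5.02)
x.all <- seq(3, 200, 3)

# Same penalized cubic regression spline as in the answer above
fit <- gam(y ~ s(x, bs = "cr", k = 7), data = data.frame(x, y))

# se.fit = TRUE returns a list with components fit and se.fit
pr <- predict(fit, newdata = data.frame(x = x.all), se.fit = TRUE)

# Rough pointwise band: prediction plus or minus two standard errors
lower <- pr$fit - 2 * pr$se.fit
upper <- pr$fit + 2 * pr$se.fit

# Compare the mean standard error inside and outside the observed range
outside <- x.all < min(x) | x.all > max(x)
print(tapply(pr$se.fit, outside, mean))
```

If the standard errors are noticeably larger where outside is TRUE, that is the model itself signalling how little the data say about those maturities.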