I have this data set:
x <- c(0, 40, 80, 120, 160, 200)
y <- c(6.52, 5.10, 4.43, 3.99, 3.75, 3.60)
I calculated a linear model using lm()
:
model <- lm(y ~ x)
I want know the predicted values of x
if I have new y
values, e.g. ynew <- c(5.5, 4.5, 3.5)
, but if I use the predict()
function, it calculates only new y
values.
How can I predict new x
values if I have new y
values?
Since this is a typical problem in chemistry (predict values from a calibration), package chemCal
provides inverse.predict
. However, this function is limited to "univariate model object[s] of class lm or rlm with model formula y ~ x or y ~ x - 1."
x <- c(0, 40, 80, 120, 160, 200)
y <- c(6.52, 5.10, 4.43, 3.99, 3.75, 3.60)
plot(x,y)
model <- lm(y ~ x)
abline(model)
require(chemCal)
ynew <- c(5.5, 4.5, 3.5)
xpred<-t(sapply(ynew,function(y) inverse.predict(model,y)[1:2]))
# Prediction Standard Error
#[1,] 31.43007 -38.97289
#[2,] 104.7669 -36.45131
#[3,] 178.1037 -39.69539
points(xpred[,1],ynew,col="red")
Warning: This function is quite slow and not suitable, if you need to inverse.predict a large number of values.
If I remember correctly, the neg. SEs occur because the function expects the slope to be always positive. Absolute values of SE should still be correct.
I think you just have to use the algebra to invert y=a+b*x
to x=(y-a)/b
:
cc <- coef(model)
(xnew <- (ynew-cc[1])/cc[2])
# [1] 31.43007 104.76689 178.10372
plot(x,y
abline(model)
points(xnew,ynew,col=2)
Looking at your 'data' here, I think a nonlinear regression might be better ...
If your relationship is nonmonotone or if you have multiple predictor values then there can be multiple x-values for a given y-value and you need to decide how to deal with that.
One option that could be slow (and may be the method used in the other packages mentioned) is to use the uniroot function:
x <- runif(100, min=-1,max=2)
y <- exp(x) + rnorm(100,0,0.2)
fit <- lm( y ~ poly(x,3), x=TRUE )
(tmp <- uniroot( function(x) predict(fit, data.frame(x=x)) - 4, c(-1, 2) )$root)
library(TeachingDemos)
plot(x,y)
Predict.Plot(fit, 'x', data=data.frame(x=x), add=TRUE, ref.val=tmp)
You could use the TkPredict
function from the TeachingDemos
package to eyeball a solution.
Or you could get a fairly quick approximation by generating a lot of predicted points, then feeding them to the approxfun
or splinfun
functions to produce the approximations:
tmpx <- seq(min(x), max(x), length.out=250)
tmpy <- predict(fit, data.frame(x=tmpx) )
tmpfun <- splinefun( tmpy, tmpx )
tmpfun(4)