I want to fit a piecewise linear regression with one break point xt
, such that for x < xt
we have a quadratic polynomial and for x >= xt
we have a straight line. Two pieces should join smoothly, with continuity up to 1st derivative at xt
. Here's picture of what it may look like:
I have parametrize my piecewise regression function as:
where a
, b
, c
and xt
are parameters to be estimated.
I want to compare this model with a quadratic polynomial regression over the whole range in terms of adjusted R-squared.
Here is my data:
y <- c(1, 0.59, 0.15, 0.078, 0.02, 0.0047, 0.0019, 1, 0.56, 0.13,
0.025, 0.0051, 0.0016, 0.00091, 1, 0.61, 0.12, 0.026, 0.0067,
0.00085, 4e-04)
x <- c(0, 5.53, 12.92, 16.61, 20.3, 23.07, 24.92, 0, 5.53, 12.92,
16.61, 20.3, 23.07, 24.92, 0, 5.53, 12.92, 16.61, 20.3, 23.07,
24.92)
My attempt goes as follows, for a known xt
:
z <- pmax(0, x - xt)
x1 <- pmin(x, xt)
fit <- lm(y ~ x1 + I(x1 ^ 2) + z - 1)
But the straight line does not appear to be tangent to the quadratic polynomial at xt
. Where am I doing wrong?
Similar questions:
In this section I will be demonstrating a reproducible example. Please make sure you have sourced functions defined in the other answer.
Now, assume we don't know
c
, and we would like to search on a evenly spaced grid:RSS
has chosen 0.55. This is slightly different from the true value0.6
, but from the plot we see thatRSS
curve does not vary much between[0.5, 0.6]
so I am happy with this.The resulting model
fit
contains rich information:We can extract the summary table for coefficients:
Finally, we want to see some prediction plot.
We can make a plot:
李哲源 is a genius but I would like to suggest another solution, using the Heaviside (unit step) function, H(x) = 1 if x>0; H = 0 if x ≤ 0
Then, the function to fit is f(x,c) = b0 + b1 (x-c) + b2 (x-c)^2 H(c-x), and can be used with nls:
Testing it with the 李哲源's toy example, gives
This is an excellent exercise (maybe hard) to digest the theory and implementation of linear models. My answer will contain two parts:
I have to use a different parametrization because the one you gave in your question is wrong! Your parametrization only ensures continuity of function value, but not the first derivative! That is why your fitted line is not tangent to the fitted quadratic polynomial at
xt
.Function
est
below wraps up.lm.fit
(for maximum efficiency) for estimation and inference of a model, at a givenc
:As you can see, it also returns various summary as if
summary.lm
has been called. Now let's write another wrapper functionchoose.c
. It sketchRSS
againstc.grid
and return the best model with selectedc
.So far so good. To complete the story, we also want a
predict
routine.