I'm trying to interpolate/locally extrapolate some salary data to fill out a data set.
Here's the data set and a plot of the available data:
    experience   salary
 1:          1 21878.67
 2:          2 23401.33
 3:          3 23705.00
 4:          4 24260.00
 5:          5 25758.60
 6:          6 26763.40
 7:          7 27920.00
 8:          8 28600.00
 9:          9 28820.00
10:         10 32600.00
11:         12 30650.00
12:         14 32600.00
13:         15 32600.00
14:         16 37700.00
15:         17 33380.00
16:         20 36784.33
17:         23 35600.00
18:         25 33590.00
19:         30 32600.00
20:         31 33920.00
21:         35 32600.00
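For reproducibility, the table can be rebuilt along these lines. This is only a sketch: the names salary_data and missing_years are chosen here for illustration, and a plain data.frame is assumed rather than whatever object produced the printout.

```r
# Rebuild the printed table; `salary_data` is an illustrative name.
salary_data <- data.frame(
  experience = c(1:10, 12, 14:17, 20, 23, 25, 30, 31, 35),
  salary = c(21878.67, 23401.33, 23705.00, 24260.00, 25758.60, 26763.40,
             27920.00, 28600.00, 28820.00, 32600.00, 30650.00, 32600.00,
             32600.00, 37700.00, 33380.00, 36784.33, 35600.00, 33590.00,
             32600.00, 33920.00, 32600.00)
)

# Experience levels in 0..40 that have no observation and need to be filled in.
missing_years <- setdiff(0:40, salary_data$experience)
```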
Given the clear nonlinearity, I'm hoping to interpolate and extrapolate (I want to fill in salaries for 0 through 40 years of experience) via a local linear estimator, so I defaulted to lowess, which gives this:
This looks nice on the plot, but the values at the missing experience levels aren't actually there; R's plotting device has simply filled in the blanks for us. I haven't been able to find a predict method for lowess, as it seems R is moving towards using loess, which as I understand it is a generalization.
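One workaround I've considered, shown here only as a rough sketch: lowess returns fitted values at the observed x only, so approx can linearly interpolate between them. The object names are illustrative, and rule = 2 merely holds the end values constant outside the observed range, so this is not real extrapolation.

```r
# Same data as in the table above; `salary_data` is an illustrative name.
salary_data <- data.frame(
  experience = c(1:10, 12, 14:17, 20, 23, 25, 30, 31, 35),
  salary = c(21878.67, 23401.33, 23705.00, 24260.00, 25758.60, 26763.40,
             27920.00, 28600.00, 28820.00, 32600.00, 30650.00, 32600.00,
             32600.00, 37700.00, 33380.00, 36784.33, 35600.00, 33590.00,
             32600.00, 33920.00, 32600.00)
)

# lowess() with its default settings; returns list(x, y) at the observed x only.
fit <- lowess(salary_data$experience, salary_data$salary)

# Linear interpolation between the fitted points at every year 0..40.
# rule = 2 repeats the nearest fitted value outside [1, 35] (flat, not a trend).
filled <- approx(fit$x, fit$y, xout = 0:40, rule = 2)
```

This recovers what the plot shows between observations, but years 0 and 36 through 40 are just copies of the boundary fits.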
However, when I use loess (setting surface="direct" to be able to extrapolate, as mentioned in ?loess), which has a standard predict method, the fit is less satisfactory:
(There are strong theoretical reasons to expect salary to be non-decreasing in experience; some noise and possible mismeasurement are driving the U shape here.)
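For reference, the loess fit described above can be reproduced roughly as follows. This is a sketch under assumptions: span, degree and family are left at their defaults, and the object names are illustrative.

```r
# Same data as in the table above; `salary_data` is an illustrative name.
salary_data <- data.frame(
  experience = c(1:10, 12, 14:17, 20, 23, 25, 30, 31, 35),
  salary = c(21878.67, 23401.33, 23705.00, 24260.00, 25758.60, 26763.40,
             27920.00, 28600.00, 28820.00, 32600.00, 30650.00, 32600.00,
             32600.00, 37700.00, 33380.00, 36784.33, 35600.00, 33590.00,
             32600.00, 33920.00, 32600.00)
)

# surface = "direct" fits the local regression exactly at each requested point,
# which is what allows prediction outside the observed range of experience.
fit_loess <- loess(salary ~ experience, data = salary_data,
                   control = loess.control(surface = "direct"))

# Predicted salary for every experience level from 0 to 40.
pred <- predict(fit_loess, newdata = data.frame(experience = 0:40))
```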
And I can't seem to adjust any of the parameters to get back the non-decreasing fit given by lowess.
Any suggestions for what to do?