I am trying to apply loess smoothing to a scatterplot (i.e. between two quantitative variables). I would like to draw the loess curve on the scatterplot, and then extract only the data points that lie above that curve.
For instance, if this is my scatterplot:
qplot(mpg, cyl, data=mtcars)
And I wanted to superimpose the smoother:
qplot(mpg, wt, data=mtcars) + with(mtcars, loess.smooth(mpg, wt))
This results in the error: "Don't know how to add o to a plot".
Then, assuming I could get that superimposition to work, I would like to extract only the cars that are above that line.
[Disclaimer: this answer is incomplete]
ggplot2 has a function for adding a loess smoother, stat_smooth(), e.g.
qplot(mpg, cyl, data=mtcars) + stat_smooth()
# For datasets with n < 1000 the default method is loess; to set it explicitly:
qplot(mpg, cyl, data=mtcars) + stat_smooth(method="loess")
The function's help page also states that it returns a data.frame with predictions, which you can use to extract points. This SO answer goes through it. Unfortunately, the curve is evaluated on a grid of typically 80 points, which might not line up with your data, so you'll have to do some interpolation to decide which points lie above or below it.
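As a rough sketch of that interpolation step (not necessarily what the linked answer does), the following pulls the smoother's grid out of the plot with layer_data() and linearly interpolates it at each car's mpg using approx(). It assumes the mpg vs. wt plot from the question; the names p, smooth_df, fitted_wt and above are made up for illustration.

```r
library(ggplot2)

# Scatterplot of weight against mpg with a loess smoother on top
p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  stat_smooth(method = "loess", formula = y ~ x)

# Layer 2 is the smoother; its data holds the fitted curve on a grid of x values
smooth_df <- layer_data(p, 2)

# Linearly interpolate the curve at each car's mpg
# (rule = 2 reuses the end values if a point falls just outside the grid)
fitted_wt <- approx(smooth_df$x, smooth_df$y, xout = mtcars$mpg, rule = 2)$y

# Keep only the cars whose weight lies above the smoothed curve
above <- mtcars[mtcars$wt > fitted_wt, ]
rownames(above)
```

If you don't need the plot itself, fitting loess(wt ~ mpg, data = mtcars) directly and keeping the rows with residuals() > 0 should give a very similar answer without any interpolation, assuming the smoothing settings match those used by stat_smooth().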
P.S. This is really two questions; I'd recommend splitting them in the future.