I'm plotting some data and have the following code:
ggplot(aes(x = x, y = y), data = data) +
  geom_point(alpha = 1/15, color = 'blue') +
  scale_y_continuous('y') +
  scale_x_continuous('x') +
  geom_smooth(stat = 'smooth', color = 'Red')
The graph looks like this:
But if I specify 'gam' in the geom_smooth function, like:
geom_smooth(stat = 'smooth', color = 'Red', method = 'gam')
I get a different result:
Why is this happening?
The documentation for the method argument of geom_smooth says:
smoothing method (function) to use, eg. "lm", "glm", "gam", "loess",
"rlm".
For method = "auto" the smoothing method is chosen based on the size
of the largest group (across all panels). loess is used for less than
1,000 observations; otherwise gam is used with formula = y ~ s(x, bs =
"cs"). Somewhat anecdotally, loess gives a better appearance, but is
O(n^2) in memory, so does not work for larger datasets.
Note that when method 'auto' uses gam, it also changes the formula. The default formula is
formula = y ~ x
So in the first scenario, where you give no method, 'auto' picks gam (presumably because your data has at least 1,000 observations) and uses the modified formula y ~ s(x, bs = "cs"). In the second, you only specify that method 'gam' should be used, but you don't override the formula, so y ~ x is still used and gam fits a straight line. You could do this:
geom_smooth(stat = 'smooth', color = 'Red', method = 'gam', formula = y ~ s(x, bs = "cs"))
This gives the same result as the first plot. Hope this helps!
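If you want to see the difference for yourself, here is a minimal sketch on made-up data. The data frame df, its sine-shaped y, and the names p_linear and p_spline are my own assumptions, not taken from your code. Both calls pass formula explicitly, because I believe newer ggplot2 releases (3.3.0 and later, as far as I know) already switch to the spline formula when method = 'gam' is given without one, so what you see by default may depend on your version.

```r
library(ggplot2)
library(mgcv)  # supplies gam() and the s() smooth term

set.seed(1)
# Hypothetical data: a curved relationship with more than 1,000 rows
df <- data.frame(x = runif(2000, 0, 10))
df$y <- sin(df$x) + rnorm(2000, sd = 0.3)

base <- ggplot(df, aes(x = x, y = y)) +
  geom_point(alpha = 1/15, color = 'blue')

# gam with the linear formula: the smooth is a straight line
p_linear <- base +
  geom_smooth(method = 'gam', formula = y ~ x, color = 'Red')

# gam with the spline formula that 'auto' uses: the smooth follows the curve
p_spline <- base +
  geom_smooth(method = 'gam', formula = y ~ s(x, bs = "cs"), color = 'Red')

# Pull the fitted values of the smooth (layer 2) out of each plot
fit_linear <- layer_data(p_linear, 2)
fit_spline <- layer_data(p_spline, 2)
```

Printing p_linear and p_spline shows the two plots. The fitted y values in fit_linear lie on a straight line, while those in fit_spline bend with the data.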