Working with span in ggplot2 / geom_smooth

Posted 2019-09-19 16:51

Question:

I am using ggplot2 to create a plot with several data sets. Because not all data sets have the same number of data points (and some have gaps), I would like to adjust the span.

But I am not sure what effect adjusting span has. It is documented neither in stat_smooth nor in geom_smooth. Any idea where I can find out how span selects data from the data set? How does span determine the number of data points used to compute the smoother? The code looks like this:

library(ggplot2)
library(scales)  # provides date_format()

# Names such as 1_F1 are not syntactic in R, so they are quoted with backticks.
t <- ggplot(data = XX1) +
  scale_x_date(date_breaks = "1 month", labels = date_format("%b %Y")) +
  geom_vline(xintercept = as.numeric(XX2$Day.of.action), colour = "lightgray") +
  geom_point(aes(x = day, y = perc_DP10m, colour = as.factor(station_subunit))) +
  geom_smooth(data = `1_F1`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = XX1, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F3`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F4`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F5`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F6`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F7`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F8`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F9`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F10`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  geom_smooth(data = `1_F11`, aes(x = day, y = perc_DP10m,
    colour = as.factor(station_subunit)), method = loess, span = 0.3, se = FALSE, lwd = 1) +
  theme_bw()
t <- t + labs(title = "Detection Positive Ten Minutes / Day \n",
              x = "\n Year (August 2012 - March 2014)",
              y = "% DP10M per Day \n")
t

Any hint is much appreciated!

Answer 1:

Please make a short reproducible example.

The default stat for this geom is stat_smooth.

According to the stat_smooth help (?stat_smooth), the function relies on the lm, glm or loess functions from the base stats package. There is also a reference to the mgcv package for the gam method. So the span argument of stat_smooth is handed on to the fitting method, where it controls the degree of smoothing. It applies to the loess method, and ?loess is where the argument is documented.
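As a rough illustration of what span means for data sets of different sizes, here is a minimal base R sketch. It relies on the statement in ?loess that, for span below 1, each local fit uses about that proportion of the points. The sample sizes, the seed, the helper make_data and the variable names are all made up for the example, and the point counts are approximations rather than the exact rounding used internally.

```r
# Sketch only: sample sizes, seed and names below are illustrative assumptions.
set.seed(1)

span <- 0.3
n_small <- 50    # hypothetical small data set
n_large <- 200   # hypothetical large data set

# Approximate number of nearest points entering each local fit (see ?loess):
# the same span means a different absolute number of points per data set.
pts_small <- round(span * n_small)
pts_large <- round(span * n_large)
c(small = pts_small, large = pts_large)

# Noisy sine data, similar in spirit to the example further down
make_data <- function(n) {
  x <- seq_len(n)
  data.frame(x = x, y = sin(2 * pi * x / n) + runif(n, -1, 1))
}
d <- make_data(120)

fit_wiggly <- loess(y ~ x, data = d, span = 0.3)
fit_smooth <- loess(y ~ x, data = d, span = 0.75)

# enp is the "equivalent number of parameters" stored in a loess fit;
# a smaller span gives a more flexible (wigglier) curve and a larger enp.
c(span_0.30 = fit_wiggly$enp, span_0.75 = fit_smooth$enp)
```

Because span is a proportion, a data set with fewer points gets a smaller neighbourhood in absolute terms, which is one reason it may make sense to pick span per data set.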

But the easy way to verify this is to use the loess function from the stats package and compare its output with the results obtained from stat_smooth.

With this example the results seem to be the same:

loess:

period <- 120
x <- 1:120
y <- sin(2*pi*x/period) + runif(length(x),-1,1)
plot(x,y, main="Sine Curve + 'Uniform' Noise")
y.loess <- loess(y ~ x, span=0.75, data.frame(x=x, y=y))
y.predict <- predict(y.loess, data.frame(x=x))
lines(x,y.predict)

geom_smooth:

library(ggplot2)

xy <- data.frame(x = x, y = y)
gp <- ggplot(xy, aes(x = x, y = y)) + geom_point()
gp + geom_smooth(method = "loess", span = 0.75)
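The comparison above is visual. A numeric check is also possible. The sketch below assumes that layer_data() from ggplot2 returns the points of the drawn smoother in its x and y columns, and then evaluates a direct loess fit at the same x positions. The seed and the names d, smoothed, direct and max_diff are chosen only for this illustration.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed so the noise is reproducible
period <- 120
x <- 1:120
y <- sin(2 * pi * x / period) + runif(length(x), -1, 1)
d <- data.frame(x = x, y = y)

gp <- ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.75, se = FALSE)

# Layer 2 is the smoother; x and y hold the grid on which the line is drawn
smoothed <- layer_data(gp, 2)

# Fit loess directly with the same span and predict on that same grid
fit <- loess(y ~ x, data = d, span = 0.75)
direct <- predict(fit, newdata = data.frame(x = smoothed$x))

# Largest absolute difference between the two curves
max_diff <- max(abs(smoothed$y - direct))
max_diff
```

If the difference is zero up to floating-point noise, geom_smooth with method = "loess" is passing span straight through to loess, so ?loess is the place to read about its effect.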


Tags: html r ggplot2