I have the following data in a data.frame called t
DayNum MeanVolume StdDev StdErr
1 13 207.0500 41.00045 5.125057
2 15 142.7625 27.87236 3.484045
3 18 77.5500 19.43928 2.429910
4 21 66.3750 20.56403 2.570504
5 26 67.0500 29.01576 3.626970
6 29 66.4750 25.94537 3.243171
7 33 76.9625 25.31374 3.164218
8 36 91.2875 37.01719 4.627149
9 40 102.0500 29.39898 3.674872
10 43 100.8250 24.22830 3.028538
11 47 120.5125 28.80592 3.600740
12 50 147.8875 35.82894 4.478617
13 54 126.7875 45.43204 5.679004
14 57 139.8500 56.01117 7.001397
15 60 179.1375 69.64526 8.705658
16 64 149.7625 39.10265 4.887831
17 68 229.5250 121.08411 15.135514
18 71 236.5125 76.23146 9.528933
19 75 243.2750 101.69474 12.711842
20 78 331.6750 141.25344 17.656680
21 82 348.2875 122.86359 15.357948
22 85 353.7750 187.24641 23.405801
23 89 385.4000 154.05826 19.257283
24 92 500.9875 263.43714 32.929642
25 95 570.2250 301.82686 37.728358
26 98 692.2250 344.71226 43.089032
27 102 692.8000 283.94120 35.492650
28 105 759.2000 399.19323 49.899153
29 109 898.2375 444.94289 55.617861
30 112 920.1000 515.79597 64.474496
I am trying to fit an exponential curve to y = MeanVolume as a function of x = DayNum in t.
Here is what I did:
# Fit to data
model <- lm(log(MeanVolume) ~ DayNum, data = t)
# Plot data
plot(MeanVolume ~ DayNum, data = t, ylab = "Mean Volume (mm3)", xlim = c(0, 120), ylim = c(0, 1000))
arrows(t$DayNum, t$MeanVolume - t$StdErr, t$DayNum, t$MeanVolume + t$StdErr, length = 0.01, angle = 90, code = 3)
# Create fit data
t$pred <- exp(predict(model))
# Plot fit
lines(t$DayNum, t$pred, col = "blue")
On the other hand, if I do this with ggplot2 using
ggplot(data = t, mapping = aes(x = DayNum, y = MeanVolume)) +
  geom_line() +
  geom_point(size = 3, color = "blue") +
  geom_smooth(method = "glm", method.args = list(family = gaussian(link = "log"))) +
  labs(x = "Days", y = "Mean Volume (mm3)", title = "Data") +
  geom_errorbar(aes(ymin = MeanVolume - StdErr, ymax = MeanVolume + StdErr), width = .2)
I get the following plot
As you can see, the fitted curve in the ggplot case follows the data better than in the base plot case. Why? I would also like to get the fitted parameters, such as the intercept and the slope of the exponential fit line. How can I extract them from the ggplot call?
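lm with a log-transformed y is not the same as glm with a Gaussian error distribution and a log link (as to why, check the link in the comment by @Lyngbakr). In short, lm on log(y) minimizes squared errors on the log scale, while the glm minimizes squared errors on the original scale of y, so the two fits give different curves.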
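A minimal sketch of the two fits side by side, assuming the question's data are rebuilt as shown (only the two columns used in the fit). The names model_lm, model_glm, pred_lm, rss_lm and rss_glm are illustrative choices, not anything ggplot2 produces. It compares the residual sum of squares of both fits on the original scale of MeanVolume, which is the scale the plot shows.

```r
# Data copied from the question (DayNum and MeanVolume only).
t <- data.frame(
  DayNum = c(13, 15, 18, 21, 26, 29, 33, 36, 40, 43, 47, 50, 54, 57, 60,
             64, 68, 71, 75, 78, 82, 85, 89, 92, 95, 98, 102, 105, 109, 112),
  MeanVolume = c(207.0500, 142.7625, 77.5500, 66.3750, 67.0500, 66.4750,
                 76.9625, 91.2875, 102.0500, 100.8250, 120.5125, 147.8875,
                 126.7875, 139.8500, 179.1375, 149.7625, 229.5250, 236.5125,
                 243.2750, 331.6750, 348.2875, 353.7750, 385.4000, 500.9875,
                 570.2250, 692.2250, 692.8000, 759.2000, 898.2375, 920.1000)
)

# Least squares on the log scale: log(y) = a + b * x + error
model_lm <- lm(log(MeanVolume) ~ DayNum, data = t)

# Least squares on the original scale: y = exp(a + b * x) + error
model_glm <- glm(MeanVolume ~ DayNum, data = t,
                 family = gaussian(link = "log"))

# Back-transform both sets of fitted values to the scale of MeanVolume.
pred_lm  <- exp(predict(model_lm))
pred_glm <- predict(model_glm, type = "response")

# Residual sum of squares on the original scale for each fit.
rss_lm  <- sum((t$MeanVolume - pred_lm)^2)
rss_glm <- sum((t$MeanVolume - pred_glm)^2)
print(c(lm = rss_lm, glm = rss_glm))
```

Because the glm minimizes exactly this quantity, its residual sum of squares on the original scale should not exceed that of the back-transformed lm fit, which is why its curve looks closer to the large values at the end of the series.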
As for the second part of the question: to extract the data from a ggplot, one can call ggplot_build() on the plot object and store the result, here as build. The data for the curve are then in
build$data[[3]]
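A sketch of that extraction, assuming the plot is stored in an object called p (a name chosen here for illustration) and ggplot2 is installed. The error bars and labels from the question are left out to keep it short; they come after geom_smooth, so the smooth is still the third layer. formula = y ~ x is spelled out only to avoid the informational message that geom_smooth otherwise prints.

```r
library(ggplot2)

# Data copied from the question (DayNum and MeanVolume only).
t <- data.frame(
  DayNum = c(13, 15, 18, 21, 26, 29, 33, 36, 40, 43, 47, 50, 54, 57, 60,
             64, 68, 71, 75, 78, 82, 85, 89, 92, 95, 98, 102, 105, 109, 112),
  MeanVolume = c(207.0500, 142.7625, 77.5500, 66.3750, 67.0500, 66.4750,
                 76.9625, 91.2875, 102.0500, 100.8250, 120.5125, 147.8875,
                 126.7875, 139.8500, 179.1375, 149.7625, 229.5250, 236.5125,
                 243.2750, 331.6750, 348.2875, 353.7750, 385.4000, 500.9875,
                 570.2250, 692.2250, 692.8000, 759.2000, 898.2375, 920.1000)
)

# Layers in order: 1 = line, 2 = points, 3 = smooth.
p <- ggplot(data = t, mapping = aes(x = DayNum, y = MeanVolume)) +
  geom_line() +
  geom_point(size = 3, color = "blue") +
  geom_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = gaussian(link = "log")))

# Compute everything ggplot2 would draw, without rendering the plot.
build <- ggplot_build(p)

# Third layer = the fitted curve: x, y, and the confidence band limits.
smooth_data <- build$data[[3]]
print(head(smooth_data[, c("x", "y", "ymin", "ymax")]))
```

layer_data(p, 3) should return the same data frame in one call, if you prefer not to keep the whole build object around.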
These data are the same as those in
pred_glm
(the predictions from a glm fitted directly with the same family and link), except that they are a bit denser (more data points). As far as I am aware, there is no method to extract the coefficients from the ggplot, just the predictions, but you can always fit the same glm yourself and read the coefficients from it.
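For the intercept and slope themselves, a minimal sketch that fits the same kind of glm directly and reads the coefficients with coef(). The names intercept, slope and grid are illustrative, and the 80-point grid is an assumption meant to mimic the default resolution of geom_smooth.

```r
# Data copied from the question (DayNum and MeanVolume only).
t <- data.frame(
  DayNum = c(13, 15, 18, 21, 26, 29, 33, 36, 40, 43, 47, 50, 54, 57, 60,
             64, 68, 71, 75, 78, 82, 85, 89, 92, 95, 98, 102, 105, 109, 112),
  MeanVolume = c(207.0500, 142.7625, 77.5500, 66.3750, 67.0500, 66.4750,
                 76.9625, 91.2875, 102.0500, 100.8250, 120.5125, 147.8875,
                 126.7875, 139.8500, 179.1375, 149.7625, 229.5250, 236.5125,
                 243.2750, 331.6750, 348.2875, 353.7750, 385.4000, 500.9875,
                 570.2250, 692.2250, 692.8000, 759.2000, 898.2375, 920.1000)
)

model_glm <- glm(MeanVolume ~ DayNum, data = t,
                 family = gaussian(link = "log"))

# Coefficients live on the link (log) scale:
#   log(E[MeanVolume]) = intercept + slope * DayNum
intercept <- coef(model_glm)[["(Intercept)"]]
slope     <- coef(model_glm)[["DayNum"]]

# Same model written as an exponential: y = A * exp(slope * DayNum)
print(c(A = exp(intercept), slope = slope))

# Predictions on a denser grid, comparable to the curve ggplot2 draws.
grid <- data.frame(DayNum = seq(min(t$DayNum), max(t$DayNum), length.out = 80))
grid$pred <- unname(predict(model_glm, newdata = grid, type = "response"))
```

summary(model_glm) additionally gives standard errors for both coefficients, should you need them.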