How to plot confidence bands for my weighted log-l

2019-07-18 16:28发布

问题:

I need to plot an exponential species-area relationship using the exponential form of a weighted log-log linear model, where mean species number per location/Bank (sb$NoSpec.mean) is weighted by the variance in species number per year (sb$NoSpec.var).

I am able to plot the fit, but have issues figuring out how to plot the confidence intervals around this fit. The following is the best I have come up with so far. Any advice for me?

# Data
df <- read.csv("YearlySpeciesCount_SizeGroups.csv")
require(doBy)
sb <- summaryBy(NoSpec ~ Short + Area + Regime + SizeGrp, df, 
                FUN=c(mean,var, length))

# Plot to fill
plot(S ~ A, xlab = "Bank Area (km2)", type = "n", ylab = "Species count",
     ylim = c(min(S), max(S)))
text(A, S, label = Pisc$Short, col = 'black')

# The Arrhenius model
require(vegan)
gg <- data.frame(S=S, A=A, W=W)
mloglog <- lm(log(S) ~ log(A), weights = 1 / (log10(W + 1)), data = gg)

# Add exponential fit to plot (this works well)
lines(xtmp, exp(predict(mloglog, newdata = data.frame(A = xtmp))),
      lty=1, lwd=2)

Now I want to add confidence bands... This is where I'm finding issues...

## predict using original model.. get standard errors
pp<-data.frame(A = xtmp)
p <- predict(mloglog, newdata = pp, se.fit = TRUE)
pp$fit <- p$fit
pp$se <- p$se.fit

## Calculate lower and upper bounds for each estimate using standard error * 1.96
pp$upr95 <- pp$fit + (1.96 * pp$se)
pp$lwr95 <- pp$fit - (1.96 * pp$se)

But I am not sure whether the following is correct. I couldn't find any answers that didn't involve ggplot when searching google / stack overflow / cross validated.

## Create new linear models to create a fitted line given upper and lower bounds?
upr <- lm(log(upr95) ~ log(A), data=pp)
lwr <- lm(log(lwr95) ~ log(A), data=pp)
lines(xtmp, exp(predict(upr, newdata=pp)), lty=2, lwd=1)
lines(xtmp, exp(predict(lwr, newdata=pp)), lty=2, lwd=1)

Thanks in advance for any help!

回答1:

It is OK for this question to be without data provided, because:

  • OP's code is said to be working fine so there is nothing "not working";
  • this question is more related to statistical procedure: what is the right thing to do.

I would make a brief answer, as I saw you added "solved" to question title in your last update. Note it is not recommended to add such key word to question title. If something is solved, use an answer.


Strictly speaking, using 1.96 is incorrect. You can read How does predict.lm() compute confidence interval and prediction interval? for details. We need residual degree of freedom and 0.025 quantile of t-distribution.

What I want to say, is that predict.lm can return confidence interval for you:

pp <- data.frame(A = xtmp)
p <- predict(mloglog, newdata = pp, interval = "confidence")

p will be a three-column matrix, with "fit", "lwr" and "upr".

Since you fitted a log-log model, both fitted values and confidence interval need be back transformed. Simply take exp on this matrix p:

p <- exp(p)

Now you can easily use matplot to produce nice regression plot:

matplot(xtmp, p, type = "l", col = c(1, 2, 2), lty = c(1, 2, 2))