geom_abline multiple slopes and intercepts

Posted 2019-07-13 00:39

Question:

Considering this initial data frame (yld_sum):

  coef     pred      se    ci.lb    ci.ub    cr.lb    cr.ub Yld_class
   b0 3164.226 114.256 2940.289 3388.164 2142.724 4185.728      1Low
   b1  -20.698   3.511  -27.580  -13.816  -50.520    9.124      1Low
   b0 3985.287 133.220 3724.180 4246.394 2954.998 5015.576      2Low
   b1  -14.371   4.185  -22.573   -6.168  -44.525   15.784      2Low

How can I simplify my syntax to plot the two estimated regression lines with their respective confidence intervals (CI), and obtain the following plot?

This is my verbose code:

library(tidyverse)

yld_sum_est <- yld_sum %>% select(Yld_class, coef, pred) %>% 
  spread(coef, pred)  

yld_sum_low <- yld_sum %>% select(Yld_class, coef, ci.lb) %>% 
  spread(coef, ci.lb)

yld_sum_up <- yld_sum %>% select(Yld_class, coef, ci.ub) %>% 
  spread(coef, ci.ub)

ggplot() + 
  geom_abline(data = yld_sum_est, aes(intercept = b0, slope = b1)) +
  geom_abline(data = yld_sum_low, aes(intercept = b0, slope = b1), linetype= "dashed") +
  geom_abline(data = yld_sum_up, aes(intercept = b0, slope = b1), linetype= "dashed") +
  scale_x_continuous(limits=c(0,60), name="x") +
  scale_y_continuous(limits=c(1000, 4200), name="y") 

Answer 1:

This is a 'data shape' problem. For ggplot to draw multiple objects within a single geom call, the data frame must already have the right shape by the time it enters ggplot: the object instances are the rows, and the object parameters (like intercept and slope) are the columns.

In your case you need a data frame with 6 rows, one for each line, each holding the identity of the line and its parameters, like this:

library(tidyverse)

# the data block from the question has been copied to the clipboard
yld_sum <- read.table("clipboard", header = TRUE)

yld_sum %>%
  select(-starts_with("cr"), -se) %>%
  gather(metric, value, -coef, -Yld_class) %>%
  spread(coef, value) %>% 
  ggplot() +
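  geom_abline(aes(
    intercept = b0, 
    slope = b1,
    # the two labels are categories, not literal linetypes: "" sorts before
    # "dashed", so the default scale draws pred solid and the CI bounds dashed
    linetype = if_else(metric == "pred", "", "dashed")
  )) +
  scale_x_continuous(limits=c(0,60), name="x") +
  scale_y_continuous(limits=c(1000, 4200), name="y") +
  # hide the linetype legend; "none" replaces the deprecated FALSE
  guides(linetype = "none")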

Feel free to explore the reshaping process by piping the result of any intermediate step into RStudio's View() (for example, by appending %>% View after that step).

The final shape of the data, for illustration (after the spread() call):

  Yld_class metric       b0      b1
1      1Low  ci.lb 2940.289 -27.580
2      1Low  ci.ub 3388.164 -13.816
3      1Low   pred 3164.226 -20.698
4      2Low  ci.lb 3724.180 -22.573
5      2Low  ci.ub 4246.394  -6.168
6      2Low   pred 3985.287 -14.371
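
As a possible variant, the same reshaping can be sketched with pivot_longer() and pivot_wider(), which supersede gather() and spread() in tidyr 1.0.0 and later. This is only a sketch under a few assumptions: the data from the question is inlined with read.table(text = ...) so that nothing depends on the clipboard, and the names lines_df and line_kind are made up for illustration. The linetypes are set explicitly with scale_linetype_manual() instead of relying on the sort order of the labels.

```r
library(dplyr)
library(tidyr)    # assumes tidyr >= 1.0.0 for pivot_longer() / pivot_wider()
library(ggplot2)

# Data from the question, inlined so the example is self-contained
yld_sum <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
coef     pred      se    ci.lb    ci.ub    cr.lb    cr.ub Yld_class
b0 3164.226 114.256 2940.289 3388.164 2142.724 4185.728      1Low
b1  -20.698   3.511  -27.580  -13.816  -50.520    9.124      1Low
b0 3985.287 133.220 3724.180 4246.394 2954.998 5015.576      2Low
b1  -14.371   4.185  -22.573   -6.168  -44.525   15.784      2Low
")

# One row per line to draw: (Yld_class, metric) identifies the line,
# b0 and b1 hold its intercept and slope
lines_df <- yld_sum %>%
  select(Yld_class, coef, pred, ci.lb, ci.ub) %>%
  pivot_longer(c(pred, ci.lb, ci.ub),
               names_to = "metric", values_to = "value") %>%
  pivot_wider(names_from = coef, values_from = value) %>%
  # line_kind is a hypothetical helper column used only for styling
  mutate(line_kind = if_else(metric == "pred", "estimate", "ci_bound"))

p <- ggplot(lines_df) +
  geom_abline(aes(intercept = b0, slope = b1, linetype = line_kind)) +
  # map each category to a linetype explicitly and hide the legend
  scale_linetype_manual(
    values = c(estimate = "solid", ci_bound = "dashed"),
    guide = "none"
  ) +
  scale_x_continuous(limits = c(0, 60), name = "x") +
  scale_y_continuous(limits = c(1000, 4200), name = "y")

print(p)
```

Naming the categories and mapping them in the scale keeps the solid/dashed assignment stable even if the labels are later renamed.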