Cannot place count label at boxplot whisker with o

Posted 2019-05-21 12:03

I am trying to place labels showing the observation counts at the ends of boxplot whiskers, but it doesn't seem to work when there are outliers.

I have attempted to compare the max/min values with what I believe is the calculated whisker length [quartile 1 - 1.5 * interquartile range, or quartile 3 + 1.5 * interquartile range]. But the labels get placed at neither the max/min value nor the whisker end.

Example using mtcars, with the y-axis reversed to demonstrate:

library(ggplot2)
library(dplyr)   # library() attaches one package per call

  mtcars %>%
    select(qsec, cyl,am) %>%

    ggplot(aes(factor(cyl),qsec,fill=factor(am))) + 
    stat_boxplot(geom = "errorbar") + ## Draw horizontal lines across ends of whiskers
    geom_boxplot(outlier.shape=1, outlier.size=3, 
                 position =  position_dodge(width = 0.75)) +
    scale_y_reverse() +
    geom_text(data = mtcars %>%
                select(qsec,cyl,am) %>%
                group_by(cyl, am) %>%
                summarize(min_qsec = min(qsec),Count = n(),med = median(qsec),
                          q1 = quantile(qsec,0.25), 
                          q3 = quantile(qsec,0.75), iqr = IQR(qsec),
                          qsec = mean(qsec),
                          lab_pos = max(min_qsec, q1-1.5*iqr)),
              aes(y=lab_pos,label = Count), position = position_dodge(width = 0.75))

Which produces:

[image: resulting plot]

The labels for am(1) at cyl(4) and am(0) at cyl(8) are misaligned.

Is my calculation for lab_pos incorrect, or is there a better approach to positioning labels at the whisker ends, regardless of outliers? I would like to accomplish this using ggplot2 and dplyr, if possible.

Tags: r ggplot2 dplyr
1 answer
够拽才男人
Reply #2 · 2019-05-21 12:17

If I understand correctly, this is what you want:

label_data <- mtcars %>%
  select(qsec, cyl, am) %>%
  group_by(cyl, am) %>%
  summarize(min_qsec = min(qsec),
            Count = n(),
            med = median(qsec),
            q1 = quantile(qsec, 0.25), 
            q3 = quantile(qsec, 0.75),
            iqr = IQR(qsec),
            # Whisker end: smallest observation on or inside the lower fence
            lab_pos = min(ifelse(qsec >= q1 - 1.5 * iqr, qsec, NA), na.rm = TRUE),
            qsec = mean(qsec))

mtcars %>%
  select(qsec, cyl,am) %>%
  ggplot(aes(factor(cyl),qsec,fill=factor(am))) + 
  stat_boxplot(geom = "errorbar") + ## Draw horizontal lines across ends of whiskers
  geom_boxplot(outlier.shape=1, outlier.size=3, 
               position =  position_dodge(width = 0.75)) +
  scale_y_reverse() +
  geom_text(data = label_data, aes(y = lab_pos,label = Count),
            position = position_dodge(width = 0.75), vjust = 0, fontface = "bold")

[image: resulting plot]

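The whiskers extend to the furthest data point that still lies within the fence (q1 - 1.5 * IQR at the lower end, q3 + 1.5 * IQR at the upper end), not to the fence itself. That is why max(min_qsec, q1-1.5*iqr) misses: when a group has an outlier, it returns the fence value, which usually is not an observed data point.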
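To illustrate the difference between the fence and the whisker end, here is a minimal base-R sketch. It assumes the usual 1.5 * IQR rule with R's default quantile type, which is my understanding of what the boxplot stat uses by default. whisker_ends is a hypothetical helper name, not a function from ggplot2 or dplyr, and the vector x is made-up illustration data.

```r
# Hypothetical helper (not part of ggplot2/dplyr): whisker ends under the
# assumed 1.5 * IQR rule, using quantile()'s default type.
whisker_ends <- function(x, coef = 1.5) {
  x <- x[!is.na(x)]
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  lower_fence <- q[1] - coef * iqr
  upper_fence <- q[2] + coef * iqr
  # Observations on or inside the fences; anything else is an outlier
  inside <- x[x >= lower_fence & x <= upper_fence]
  # Whiskers stop at the most extreme non-outlier, never inside the box
  c(lower = min(inside, q[1]), upper = max(inside, q[2]))
}

# Made-up data with one low and one high outlier
x <- c(1, 10, 11, 12, 13, 14, 30)
ends <- whisker_ends(x)
print(ends)  # fences are 6 and 18, but the whiskers stop at 10 and 14
```

Inside the summarize() call above, the same helper could be used as lab_pos = whisker_ends(qsec)[["lower"]], or [["upper"]] to label the other end of each box.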
