Coloring ggplot histogram by precise cut off point

Posted 2019-07-26 11:24

Question:

I would like to color a ggplot2 histogram according to vertical cutoff points. I was able to adapt this answer, but on my data the bins that contain a cutoff are split into narrower, shorter bins. A minimal example follows.

How can I split up the bins vertically without getting these chopped up shorter bins?

library(tidyverse)

set.seed(42)

# define cutoffs
cutoff_1 <- -21
cutoff_2 <- 60

df <- data.frame(rand = rnorm(10000)*100) %>% 
  mutate(colors = case_when(
    rand < cutoff_1 ~ "red",
    rand >= cutoff_1 & rand <= cutoff_2 ~ "blue",
    rand > cutoff_2 ~ "green"
    )
  )
n.bins <- 20 # number of bins
additional.cutoffs <- c(cutoff_1, cutoff_2) # additional bins

bins <- seq(min(df$rand), max(df$rand), length.out = n.bins)    
bins <- c(bins, additional.cutoffs) %>% sort()

df %>% 
  ggplot(aes(x=rand, fill=colors)) +
  geom_histogram(breaks=bins) +
  geom_vline(xintercept=c(cutoff_1, cutoff_2), colour="black") 

Answer 1:

One approach is to use equal-width bins chosen so that both cutoffs fall exactly on bin boundaries. For example:

# choose the bin width so the cutoffs are a whole number of bins apart
# (here: two bins between cutoff_1 and cutoff_2)
binwidth <- (cutoff_2 - cutoff_1) / 2
# build candidate breaks anchored at cutoff_1; -15:15 is wide enough to cover this data
bins <- cutoff_1 + (-15:15) * binwidth
# keep only the breaks needed to span the range of the data
bins <- bins[between(bins, min(df$rand) - binwidth, max(df$rand) + binwidth)]

df %>% 
  ggplot(aes(x=rand, fill=colors)) +
  geom_histogram(breaks=bins) +
  geom_vline(xintercept=c(cutoff_1, cutoff_2), colour="black") + theme_minimal()
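
To see why the original breaks produce shortened bars, and to avoid hard-coding the `-15:15` range, here is a minimal base-R sketch. The helper `make_breaks()` and its `bins_between` argument are hypothetical names introduced for illustration; they are not part of ggplot2 or of the answer above.

```r
set.seed(42)
cutoff_1 <- -21
cutoff_2 <- 60
rand <- rnorm(10000) * 100

# Breaks as built in the question: an even grid plus the two cutoffs.
# The bins that contain a cutoff are split, so the widths are no longer equal
# and the narrower bins hold fewer observations.
question_bins <- sort(c(seq(min(rand), max(rand), length.out = 20),
                        cutoff_1, cutoff_2))
range(diff(question_bins))

# Hypothetical helper: equal-width breaks anchored at `lower`, with
# `bins_between` bins between the two cutoffs, extended to cover the data.
make_breaks <- function(x, lower, upper, bins_between = 2) {
  binwidth <- (upper - lower) / bins_between
  # whole bins needed to reach the data extremes on each side of `lower`
  n_below <- max(0, ceiling((lower - min(x)) / binwidth))
  n_above <- max(bins_between, ceiling((max(x) - lower) / binwidth))
  lower + seq(-n_below, n_above) * binwidth
}

breaks <- make_breaks(rand, cutoff_1, cutoff_2)
range(diff(breaks))  # all widths equal to the chosen bin width
```

The resulting `breaks` vector can be passed to `geom_histogram(breaks = breaks)` in place of `bins`. Raising `bins_between` gives finer bins while keeping both cutoffs on bin boundaries.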

Note

That said, an approach like the following seems a more natural way to present this data:

Fill different colors for each quantile in geom_density() of ggplot
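
A rough sketch of that density-based idea, adapted to the two cutoffs from the question. This is one possible approach and not necessarily the code from the linked post; the `region` column and its labels are names assumed here for illustration, and the plot is only built if ggplot2 is installed.

```r
set.seed(42)
cutoff_1 <- -21
cutoff_2 <- 60
rand <- rnorm(10000) * 100

# Estimate the density on a fine grid using base R
d <- density(rand, n = 2048)
dens <- data.frame(x = d$x, y = d$y)

# Label each grid point by the region it falls in; the region
# boundaries sit at the two cutoffs
dens$region <- cut(dens$x,
                   breaks = c(-Inf, cutoff_1, cutoff_2, Inf),
                   labels = c("below", "between", "above"))

# Draw one filled area per region under the same density curve
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dens, aes(x = x, y = y, fill = region)) +
    geom_area(position = "identity") +
    geom_vline(xintercept = c(cutoff_1, cutoff_2), colour = "black") +
    theme_minimal()
  print(p)
}
```

Because the curve is continuous, the color change can sit exactly at each cutoff without any constraint on bin widths.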



Tags: r ggplot2