ggplot2: Add secondary x label (year below months)

Posted 2019-08-03 03:50

Question:

My problem is related to: Axis labels on two lines with nested x variables (year below months)

However, my data looks a little different.

library(dplyr)

set.seed(122)
df <- as_tibble(rlnorm(1260, meanlog = 0.06, sdlog = 0.20))

df$month <- rep(c("Jan", "Feb", "Mär", "Apr", "Mai", "Jun", 
      "Jul", "Aug", "Sep", "Okt", "Nov", "Dez"), 5, each=21)

df$year <- rep(c("Year 1", "Year 2", "Year 3", "Year 4", "Year 5" ), 1, each=252)

I would like my line graph to look like this, but without the vertical line if possible:

Answer 1:

library(tidyverse)

#data:
set.seed(122)
df <- as_tibble(rlnorm(1260, meanlog = 0.06, sdlog = 0.20))
#> Warning: Calling `as_tibble()` on a vector is discouraged, 
#> because the behavior is likely to change in the future. 
#> Use `tibble::enframe(name = NULL)` instead.

df$month <- rep(c("Jan", "Feb", "Mär", "Apr", "Mai", "Jun", 
                  "Jul", "Aug", "Sep", "Okt", "Nov", "Dez"), 5, each=21)

df$year <- rep(c("Year 1", "Year 2", "Year 3", "Year 4", "Year 5" ), 1, each=252)

#solution:
month_lab <- rep(unique(df$month), length(unique(df$year)))

year_lab <- unique(df$year)

df %>%
  as.data.frame() %>%
  rename(price = 1) %>% 
  mutate(rnames = rownames(.)) %>% 
  ggplot(aes(x = as.numeric(rnames), y = price, 
             group = year)) +
  geom_line() +
  labs(title = "Stock Price Chart", y = "Price", x = "date") +
  scale_x_continuous(breaks = seq(1, 1260, by = 21), 
                     labels = month_lab, expand = c(0,0)) +
  facet_grid(~year, space="free_x", scales="free_x", switch="x") +
  theme(strip.placement = "outside",
        strip.background = element_rect(fill=NA,colour="grey50"),
        panel.spacing=unit(0,"cm"))

Created on 2019-05-28 by the reprex package (v0.3.0)
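
The labels in the code above line up only because there is exactly one break per month. A minimal base R sketch of that bookkeeping, assuming (as in the question) 21 rows per month and 12 months per year; the names `months`, `years` and `breaks` are introduced here for illustration only:

```r
# Rebuild the label vector and the break positions without any plotting code.
months <- c("Jan", "Feb", "Mär", "Apr", "Mai", "Jun",
            "Jul", "Aug", "Sep", "Okt", "Nov", "Dez")
years  <- paste("Year", 1:5)

# One label per month per year, in row order.
month_lab <- rep(months, length(years))

# One break at the first row of every 21-row month block.
breaks <- seq(1, 1260, by = 21)

# scale_x_continuous() needs breaks and labels of equal length.
stopifnot(length(breaks) == length(month_lab))

# The 13th break is the first row of "Year 2" and is labelled "Jan" again.
stopifnot(breaks[13] == 253, month_lab[13] == "Jan")
```

If the number of rows per month varied, a fixed `by = 21` step would no longer match the data, and the breaks would have to be computed from the data instead.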



Answer 2:

I can think of two ways to do this, each with their pros and cons:

Data prep:

library(dplyr)
library(tibble)
library(lubridate)
library(scales)
library(ggplot2)

set.seed(122)
df <- as_tibble(rlnorm(1260, meanlog = 0.06, sdlog = 0.20))
df$month <- rep(month.abb, 5, each=21)
df$year <- rep(c("Year 1", "Year 2", "Year 3", "Year 4", "Year 5"), 1, each=252)

# We first create a "real" date variable with year, month and day. I've chosen to add 
# "201" in front of your year number, but the actual years don't matter in our case.
df <- df %>%
  group_by(year, month) %>%
  mutate(Date = as.Date(paste0("201", sub("^.+(\\d+)$", "\\1", year),
                               "-", month, "-", row_number()),
                        format = "%Y-%b-%d"))

# Since OP's daily values don't make up full months of data, 
# we need this step to show missing data correctly. 
df <- expand.grid(Date = seq.Date(from = min(df$Date), to = max(df$Date), by = "days")) %>% 
  mutate(year = paste("Year", sub("^\\d{3}(\\d)", "\\1", format(Date, "%Y"))),
         month = format(Date, "%b")) %>%
  left_join(df, by = c("Date", "year", "month"))

Note that I have used month.abb to replace the months provided by OP, since it looks like they are using a non-English locale.
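
Parsing month names with format = "%b" still depends on the locale of the R session, so the as.Date() call above can return NA when the session is not in an English locale. As a hedged alternative, the sketch below builds the same dates from month numbers instead. The helper `make_date` is hypothetical (it is not part of the original answer) and assumes labels of the form "Year N" and English abbreviations from `month.abb`:

```r
# Hypothetical helper: build a Date from a "Year N" label, an English month
# abbreviation and a day of month, without parsing month names via "%b".
make_date <- function(year_label, month, day) {
  # Take the trailing digits of the label, e.g. "Year 3" -> 3.
  year_num  <- as.integer(sub("^[^0-9]*([0-9]+)$", "\\1", year_label))
  # Map "Jan".."Dec" to 1..12 using the built-in constant month.abb.
  month_num <- match(month, month.abb)
  # Same "201" prefix as in the answer; the ISO format is locale independent.
  as.Date(sprintf("201%d-%02d-%02d", year_num, month_num, as.integer(day)))
}

make_date("Year 1", "Jan", 1)
make_date(c("Year 1", "Year 5"), c("Feb", "Dec"), c(21, 21))
```

Inside the pipeline above this would be called as `mutate(Date = make_date(year, month, row_number()))`, under the same assumptions.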

1. Use facet_grid:

ggplot(df, aes(x = Date, y = value, group = year)) +
  geom_line() +
  facet_grid(. ~ year, scales = "free_x") +
  scale_x_date(labels = date_format("%b"), expand = c(0, 0)) +
  theme(panel.spacing.x = unit(0, "lines")) +
  ylim(c(0, 2.5))

I've used expand in scale_x_date to prevent ggplot from adding padding at both ends of each facet, and panel.spacing.x to remove the spacing between facets. Together these give the illusion that the panels are connected, but they are not: the line at the end of each facet does not connect to the beginning of the next, even when there are no missing values.

2. Use geom_rect + geom_text:

# Create labels
label_range <- df %>%
  group_by(year) %>%
  summarize(xmin = min(Date),
            xmax = max(Date),
            ymin = -0.5,
            ymax = ymin + 0.15)

ggplot(df) +
  geom_line(aes(x = Date, y = value)) +
  geom_rect(data = label_range, fill = "lightcoral", color = "#f2f2f2",
            aes(xmin = xmin, xmax = xmax, 
                ymin = ymin, ymax = ymax,
                group = year)) +
  geom_text(data = label_range,
            aes(x = xmin + 365/2, y = ymin + 0.1,
                group = year, label = year)) +
  coord_cartesian(ylim = c(0, 2.5), clip = "off") +
  scale_x_date(labels = date_format("%b"), 
               date_breaks = "1 month",
               expand = c(0.01, 0.01)) +
  theme_bw() +
  theme(plot.margin = unit(c(1,1,3,1), "lines"))

This second method is more tedious, but our data will be treated as one continuous series.

  1. Create label_range to determine the coordinates of the four corners of each geom_rect.

  2. Using this dataset, I plotted the "facet boxes" using geom_rect and their labels using geom_text grouped by year.

  3. We want the rectangles to be below the x-axis, so I used coord_cartesian to fix the visible y-range and set clip = "off", which prevents the rectangles from being clipped when they are drawn outside the plot area.

  4. plot.margin adds some space below the x-axis for our facet labels.

  5. Notice the gaps between Dec and Jan. They are caused by missing values, unlike the gaps between Dec and Jan in the first method, which come from the facets being separate panels.
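
The geom_text call above places each label at xmin + 365/2, which assumes every year spans roughly 365 days. A minimal sketch of an alternative, computing the midpoint of each rectangle from its own xmin and xmax; the two example ranges and the name `xmid` are made up for illustration and are not taken from the answer's data:

```r
# Two illustrative year ranges (assumed values).
xmin <- as.Date(c("2011-01-01", "2012-01-01"))
xmax <- as.Date(c("2011-12-21", "2012-12-21"))

# Whole number of days between the two ends, halved with integer division
# so that the result stays on a whole calendar day.
half_span <- as.numeric(difftime(xmax, xmin, units = "days")) %/% 2

# Midpoint of each rectangle.
xmid <- xmin + half_span
xmid
```

In the answer's code, one could add such an `xmid` column to label_range inside summarize and use `x = xmid` in geom_text, so that each label stays centered even when a year has fewer days of data.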