ggplot2 : multiple factors boxplot with scale_x_da

Posted 2020-07-18 09:55

Question:

I would like to create a multivariate boxplot time series with ggplot2, and I need an x axis on which the position of each boxplot is a function of its date.

For that I created an interaction factor combining Treatment x Date, which is plotted against NDVI, with the trial groups shown as separate boxes:

Here is some minimal data:

dat<-"Treatment  Trial.group    Date    NDVI
HighN   A   14/06/2013  0.27522123
HighN   A   14/06/2013  0.259781926
HighN   A   14/06/2013  0.175982276
LowN    A   14/06/2013  0.193604644
LowN    A   14/06/2013  0.261191793
LowN    A   14/06/2013  0.273672853
HighN   B   14/06/2013  0.192144884
HighN   B   14/06/2013  0.283013594
HighN   B   14/06/2013  0.230556973
LowN    B   14/06/2013  0.233952974
LowN    B   14/06/2013  0.261718465
LowN    B   14/06/2013  0.216450145
HighN   A   22/06/2013  0.37522123
HighN   A   22/06/2013  0.359781926
HighN   A   22/06/2013  0.275982276
LowN    A   22/06/2013  0.293604644
LowN    A   22/06/2013  0.361191793
LowN    A   22/06/2013  0.373672853
HighN   B   22/06/2013  0.292144884
HighN   B   22/06/2013  0.383013594
HighN   B   22/06/2013  0.330556973
LowN    B   22/06/2013  0.333952974
LowN    B   22/06/2013  0.361718465
LowN    B   22/06/2013  0.316450145
HighN   A   24/06/2013  0.47522123
HighN   A   24/06/2013  0.459781926
HighN   A   24/06/2013  0.375982276
LowN    A   24/06/2013  0.393604644
LowN    A   24/06/2013  0.461191793
LowN    A   24/06/2013  0.473672853
HighN   B   24/06/2013  0.392144884
HighN   B   24/06/2013  0.483013594
HighN   B   24/06/2013  0.430556973
LowN    B   24/06/2013  0.433952974
LowN    B   24/06/2013  0.461718465
LowN    B   24/06/2013  0.416450145"

Here is the code to import the data and create the plot:

NDVI_ts <- read.table(text=dat, header = TRUE)
library(ggplot2)
library(scales)
interact<-interaction(NDVI_ts$Treatment, NDVI_ts$Date, sep=" : ")
ggplot(data=NDVI_ts, aes(x=interact, y=NDVI)) + 
geom_boxplot(aes(fill = Trial.group), width = 0.6) + 
theme_bw() + theme(axis.text.x = element_text(angle = 90, hjust = 1)) 

This code gives me the following boxplot, which is fine, except that the x axis is not linked to the dates: (NDVI ~ Treatment + Date + Trial.group)

I know that I can normally do that with something like this:

q + scale_x_date(breaks="1 week", labels=date_format("%d-%b"))

but the interaction variable is a factor and cannot be treated as a date object, so that doesn't work. I get the following error:

Error: Invalid input: date_trans works with objects of class Date only

How could I have multivariate boxplot positions defined by dates?

NDVI_ts$Date is already defined as a Date object in R.
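
The error can be traced to the class of the x variable. The following is a minimal base R sketch, not part of the original post, in which `toy` is a hypothetical stand-in for `NDVI_ts`. It shows that `interaction()` returns a factor, while parsing the day/month/year strings with `as.Date()` yields the Date class that a date scale expects.

```r
# Minimal sketch in base R; `toy` is a hypothetical stand-in for NDVI_ts
toy <- data.frame(
  Treatment = c("HighN", "LowN", "HighN", "LowN"),
  Date      = c("14/06/2013", "14/06/2013", "22/06/2013", "22/06/2013")
)

# interaction() returns a factor, whatever the class of its inputs
interact <- interaction(toy$Treatment, toy$Date, sep = " : ")
class(interact)

# Parsing the day/month/year strings gives a real Date vector,
# which is what a date scale needs on the x axis
toy$date <- as.Date(toy$Date, format = "%d/%m/%Y")
class(toy$date)
```

Checking `class()` on the column mapped to x is a quick way to confirm whether a date scale can be applied to it.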

Answer 1:

Creating an x axis from the interaction between 'Treatment' and 'Date' may make it easier to arrange the boxes for the different values of the grouping variables. However, as you noticed, once the original Date axis is converted to a 'composite' factor, it is much harder to control the appearance of the axis.

Here is an alternative that keeps the x axis in Date format. The two levels of 'Treatment' are distinguished by two different colour palettes. Groups within 'Treatment' are separated by different shades of the colour. Boxes are grouped with the group aesthetic.

library(ggplot2)
library(scales)
library(RColorBrewer)

# convert Date to class 'Date'
NDVI_ts$date <- as.Date(NDVI_ts$Date, format = "%d/%m/%Y")

# A possible way to create suitable colours for the boxes
# create one palette of colours for each level of Treatment
# e.g. red colours for 'HighN', blue for 'LowN' (the order used in scale_fill_manual below)
# one colour for each level of Trial.group

# number of levels of Trial.group
n_col <- length(unique(NDVI_ts$Trial.group))

# create blue colours
blues <- brewer.pal(n = n_col, "Blues")
# Warning message:
#   In brewer.pal(n = n_col, "Blues") :
#   minimal value for n is 3, returning requested palette with 3 different levels

# create red
reds <- brewer.pal(n = n_col, "Reds")

# Here I manually pick the first and the last 'blue' and 'red'
# From the plot in the question, it seems like you have more than two levels of Trial.group
# so you should be able to use the 'blues' and 'reds' vectors in scale_fill_manual.

# group boxes by date, Trial.group and Treatment
ggplot(data = NDVI_ts, aes(x = date, y = NDVI)) +
  geom_boxplot(aes(fill = interaction(Trial.group, Treatment),
                   group = interaction(factor(date), Trial.group, Treatment))) + 
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  # ggplot2 >= 2.0.0 takes the interval string via date_breaks, not breaks
  scale_x_date(date_breaks = "1 week", labels = date_format("%d-%b")) +
  scale_fill_manual(name = "Treatment",
                    values = c("#FEE0D2", "#DE2D26", "#DEEBF7", "#3182BD"))
  #  scale_fill_manual(values = c(reds, blues))
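
The level order of the two `interaction()` calls decides which colour goes to which box and how many boxes are drawn. The following base R sketch, in which `toy`, `fill_var`, `fill_cols` and `box_group` are hypothetical names not used in the answer, inspects that order and attaches the colours to the levels by name, assuming the same four hex values as above.

```r
# Sketch only; `toy` is a hypothetical stand-in for NDVI_ts
toy <- expand.grid(
  Trial.group = c("A", "B"),
  Treatment   = c("HighN", "LowN"),
  date        = as.Date(c("2013-06-14", "2013-06-22", "2013-06-24")),
  stringsAsFactors = TRUE
)

# Fill variable: the first factor varies fastest in the level order
fill_var <- interaction(toy$Trial.group, toy$Treatment)
levels(fill_var)

# Naming the colour vector ties each colour to a level explicitly
fill_cols <- setNames(
  c("#FEE0D2", "#DE2D26", "#DEEBF7", "#3182BD"),
  levels(fill_var)
)

# One box per date x Trial.group x Treatment combination
box_group <- interaction(factor(toy$date), toy$Trial.group, toy$Treatment)
nlevels(box_group)
```

A named vector such as `fill_cols` can be passed as `values = fill_cols` to `scale_fill_manual()`, which matches colours to levels by name, so the mapping no longer depends on the order of the hex codes.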