I have often wondered whether you can get ggplot to do on-the-fly calculations by the facet groups of the plot, in a similar way to how they would be done using dplyr::group_by. So, in the example below, is it possible to calculate the cumsum for each category separately, rather than the overall cumsum, without altering df first?
library(ggplot2)
df <- data.frame(X = rep(1:20,2), Y = runif(40), category = rep(c("A","B"), each = 20))
ggplot(df, aes(x = X, y = cumsum(Y), colour = category))+geom_line()
I can obviously do an easy workaround using dplyr; however, as I do this frequently, I was keen to know if there is a way to avoid having to specify the grouping variables multiple times (here in group_by and aes(colour = …)).
Working alternative, but not what I'm asking for in this case
library(dplyr)
library(ggplot2)
df %>% group_by(category) %>% mutate(Ysum = cumsum(Y)) %>%
  ggplot(aes(x = X, y = Ysum, colour = category)) + geom_line()
Edit (in response to @42-'s comment): I am mainly asking out of curiosity whether this is possible, not because the alternative doesn't work. I also think my code would be neater when I am making a number of plots that sum (or perform similar calculations on) different variables, grouped by different columns or drawn from different datasets, rather than repeatedly having to group, mutate, then plot. I could write a function to do it for me, but I thought there might be built-in functionality that I am missing (the ggplot help doesn't go into the real details).
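For reference, the kind of wrapper function I have in mind might look roughly like the sketch below. The name plot_grouped_cumsum and its arguments are made up for illustration (they are not part of ggplot2 or dplyr), and it assumes a dplyr/ggplot2 version recent enough to support the {{ }} operator. It only hides the group/mutate/plot steps so that the grouping column is named once; it is not the built-in behaviour I am asking about.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper (not part of ggplot2 or dplyr): takes the grouping
# column once and reuses it for both the grouped cumsum and the colour.
plot_grouped_cumsum <- function(data, x, y, group) {
  data %>%
    group_by({{ group }}) %>%
    # cumsum is taken in the existing row order within each group
    mutate(.cumulative = cumsum({{ y }})) %>%
    ungroup() %>%
    ggplot(aes(x = {{ x }}, y = .cumulative, colour = {{ group }})) +
    geom_line()
}

df <- data.frame(X = rep(1:20, 2), Y = runif(40), category = rep(c("A", "B"), each = 20))
p <- plot_grouped_cumsum(df, X, Y, category)
p
```

With this, the grouping column appears only once per call, but df is still being modified behind the scenes before it reaches ggplot.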