My problem is displaying 4 categorical variables in a bar graph in R.
The 4 categorical variables each have 2 or more levels. My plan was to use ggplot to create separate bar plots with geom_bar for each of 3 categories, with the counts of each level stacked. I would then use facet_wrap to split the plot by the 4th category.
The data looks like this:
Species     Crown_Class   Life_class     Stem_Category
E. obliqua  Suppressed    Standing live  Large stems
E. rubida   Intermediate  Standing live  Large stems
E. obliqua  Suppressed    Standing live  Small stems
E. obliqua  Suppressed    Standing live  Small stems
E. rubida   Suppressed    Standing live  Large stems
E. radiata  Suppressed    Standing live  Small stems
E. obliqua  Dominant      Standing live  Small stems
E. obliqua  Suppressed    Standing live  Small stems
E. radiata  Suppressed    Standing live  Large stems
E. rubida   NA            Standing dead  Large stems
E. rubida   Intermediate  Standing live  Large stems
The graph I have in mind shows a stacked bar for each of three categories, grouped by the fourth. For the data given, separate bars for Crown_Class, Life_class and Stem_Category would be displayed for each species.
I have been trying for hours and can produce separate plots with this code (though I had to split the data into 3 separate data frames to do it):
ggplot(data = cc, aes(x = Species, fill = Crown_Class)) +
  geom_bar(position = 'stack')
ggplot(data = lc, aes(x = Species, fill = Life_class)) +
  geom_bar(position = 'stack')
ggplot(data = sc, aes(x = Species, fill = Stem_Category)) +
  geom_bar(position = 'stack')
The idea was to do something like this:
ggplot() +
  geom_bar(data = cc, aes(x = Species, fill = Crown_Class),
           position = 'stack') +
  geom_bar(data = lc, aes(x = Species, fill = Life_class),
           position = 'dodge') +
  facet_wrap(~Species)
But the result is not what I have in mind. The second plot effectively overwrites the first.
I would be grateful for any help.
Here's an example of how you could use facet_grid to include all 4 variables on the same plot.
Note that I generated some dummy data, since I had trouble importing your dataset into R.
generate data
library(ggplot2)
theme_set(theme_bw())
set.seed(123)
df1 <- data.frame(s1 = sample(letters[1:3], 11, replace = TRUE),
                  s2 = sample(letters[4:6], 11, replace = TRUE),
                  s3 = sample(letters[7:9], 11, replace = TRUE),
                  s4 = sample(letters[10:12], 11, replace = TRUE),
                  stringsAsFactors = FALSE)
edit:
Maybe this is closer to what you're after:
ggplot(df1) +
  geom_bar(aes(x = s1), position = 'stack') +
  geom_bar(aes(x = s2), position = 'stack') +
  geom_bar(aes(x = s3), position = 'stack') +
  facet_wrap(~ s4)
If you proceed in this manner, you should definitely note that the values on the x-axis come from three different variables.
IMHO: While I'm no expert on the subject, I do think it's a bit dubious to create a visualization with three different variables on the same axis, and ggplot2 gives you plenty of options to avoid doing so.
make plot using facet_grid
ggplot(df1, aes(x = s1, fill = s2)) +
  geom_bar(position = 'stack') +
  facet_grid(s3 ~ s4)
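With two faceting variables it can be hard to tell which strip belongs to which variable. As a small sketch (not part of the original answer), ggplot2's label_both labeller prints the variable name alongside each level. The dummy data is regenerated here only so the snippet runs on its own.

```r
library(ggplot2)

# Same dummy data as above, rebuilt so this snippet is self-contained.
set.seed(123)
df1 <- data.frame(s1 = sample(letters[1:3], 11, replace = TRUE),
                  s2 = sample(letters[4:6], 11, replace = TRUE),
                  s3 = sample(letters[7:9], 11, replace = TRUE),
                  s4 = sample(letters[10:12], 11, replace = TRUE),
                  stringsAsFactors = FALSE)

# label_both writes strips as "s3: <level>" and "s4: <level>"
# instead of showing the bare level only.
p_grid <- ggplot(df1, aes(x = s1, fill = s2)) +
  geom_bar(position = 'stack') +
  facet_grid(s3 ~ s4, labeller = label_both)
p_grid
```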
make plot using interaction and facet_wrap
Now, suppose you don't want both grouping factors as facets and would prefer a single facet. Then we can use the interaction function.
ggplot(df1, aes(x = s1, fill = interaction(s2, s3))) +
  geom_bar(position = 'stack') +
  facet_wrap(~s4)
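By default interaction joins the two levels with a dot (for example "d.g"), which can be hard to read in the legend. A hedged variation, assuming you want a more readable legend: pass sep to interaction and set the legend title with labs. The separator and title text are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Same dummy data as above, rebuilt so this snippet is self-contained.
set.seed(123)
df1 <- data.frame(s1 = sample(letters[1:3], 11, replace = TRUE),
                  s2 = sample(letters[4:6], 11, replace = TRUE),
                  s3 = sample(letters[7:9], 11, replace = TRUE),
                  s4 = sample(letters[10:12], 11, replace = TRUE),
                  stringsAsFactors = FALSE)

# sep = " / " replaces the default "." between the combined levels;
# labs(fill = ...) gives the legend a title naming both variables.
p_int <- ggplot(df1, aes(x = s1, fill = interaction(s2, s3, sep = " / "))) +
  geom_bar(position = 'stack') +
  facet_wrap(~s4) +
  labs(fill = "s2 / s3")
p_int
```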
use Rmisc::multiplot
Finally, we can create three separate plots, and then use Rmisc::multiplot to plot them on the same page.
library(Rmisc)
p1 <- ggplot(df1, aes(x = s1, fill = s2)) +
  geom_bar(position = 'stack')
p2 <- ggplot(df1, aes(x = s1, fill = s3)) +
  geom_bar(position = 'stack')
p3 <- ggplot(df1, aes(x = s1, fill = s4)) +
  geom_bar(position = 'stack')
multiplot(p1, p2, p3, cols = 3)
Since you are trying to differentiate your plots using Crown_Class, Life_class, and Stem_Category, ggplot2 would prefer those values to be in a column of their own (in general, ggplot2 likes long data, where only one column contains the values being plotted). We can reorganize the data using tidyr.
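The snippets in this answer assume a data frame called df holding the question's data. A minimal sketch for entering the 11 sample rows by hand is shown here; it assumes the NA in Crown_Class is a genuine missing value rather than the text "NA".

```r
# Sample rows copied from the question. The NA in Crown_Class is
# assumed to be a missing value, not the string "NA".
df <- data.frame(
  Species = c("E. obliqua", "E. rubida", "E. obliqua", "E. obliqua",
              "E. rubida", "E. radiata", "E. obliqua", "E. obliqua",
              "E. radiata", "E. rubida", "E. rubida"),
  Crown_Class = c("Suppressed", "Intermediate", "Suppressed", "Suppressed",
                  "Suppressed", "Suppressed", "Dominant", "Suppressed",
                  "Suppressed", NA, "Intermediate"),
  Life_class = c(rep("Standing live", 9), "Standing dead", "Standing live"),
  Stem_Category = c("Large stems", "Large stems", "Small stems", "Small stems",
                    "Large stems", "Small stems", "Small stems", "Small stems",
                    "Large stems", "Large stems", "Large stems"),
  stringsAsFactors = FALSE
)

# Quick check of the column types and the first few values.
str(df)
```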
library(tidyr)
df <- gather(df, variable, value, -Species)
head(df)
     Species    variable        value
1 E. obliqua Crown_Class   Suppressed
2  E. rubida Crown_Class Intermediate
3 E. obliqua Crown_Class   Suppressed
4 E. obliqua Crown_Class   Suppressed
5  E. rubida Crown_Class   Suppressed
6 E. radiata Crown_Class   Suppressed
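In newer versions of tidyr (1.0.0 and later), gather is superseded by pivot_longer. A sketch of the same reshaping with pivot_longer follows, using a hypothetical three-row data frame named wide built from rows of the question's sample. The rows come out in a different order than with gather (all variables for the first row, then the second, and so on), which does not affect the bar counts.

```r
library(tidyr)

# Hypothetical three-row wide data frame, using rows from the question.
wide <- data.frame(
  Species = c("E. obliqua", "E. rubida", "E. radiata"),
  Crown_Class = c("Suppressed", "Intermediate", "Suppressed"),
  Life_class = c("Standing live", "Standing live", "Standing live"),
  Stem_Category = c("Large stems", "Large stems", "Small stems"),
  stringsAsFactors = FALSE
)

# Every column except Species is stacked: the old column names go
# into `variable`, and the cell contents go into `value`.
long <- pivot_longer(wide, cols = -Species,
                     names_to = "variable", values_to = "value")
long
```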
Now we can facet wrap on variable:
ggplot(df) +
  geom_bar(aes(x = Species, fill = value)) +
  facet_wrap(~ variable)
If you don't like having a single legend for all the colors of Crown_Class, Life_class and Stem_Category, you can make three separate plots and combine them using the gridExtra package.
library(dplyr)
library(gridExtra)
p <- df %>%
  filter(variable == 'Crown_Class') %>%
  ggplot() +
  geom_bar(aes(x = Species, fill = value)) +
  facet_wrap(~ variable)
q <- df %>%
  filter(variable == 'Life_class') %>%
  ggplot() +
  geom_bar(aes(x = Species, fill = value)) +
  facet_wrap(~ variable)
r <- df %>%
  filter(variable == 'Stem_Category') %>%
  ggplot() +
  geom_bar(aes(x = Species, fill = value)) +
  facet_wrap(~ variable)
grid.arrange(p, q, r)
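The three pipelines above differ only in the variable name, so one way to cut the repetition is a small helper plus lapply. This is only a sketch: plot_one is a hypothetical helper name, and the tiny long-format data frame is a stand-in so the snippet runs on its own. grid.arrange accepts a list of plots through its grobs argument, and ncol = 3 places them side by side.

```r
library(ggplot2)
library(gridExtra)

# Small long-format stand-in for the reshaped data (illustrative values).
long <- data.frame(
  Species = rep(c("E. obliqua", "E. rubida"), times = 3),
  variable = rep(c("Crown_Class", "Life_class", "Stem_Category"), each = 2),
  value = c("Suppressed", "Intermediate",
            "Standing live", "Standing live",
            "Large stems", "Large stems"),
  stringsAsFactors = FALSE
)

# plot_one() is a hypothetical helper: it keeps the rows for a single
# variable and draws a stacked bar chart with its own legend.
plot_one <- function(data, var) {
  ggplot(data[data$variable == var, ]) +
    geom_bar(aes(x = Species, fill = value)) +
    facet_wrap(~ variable) +
    labs(fill = var)
}

# One plot per variable, then arrange them in a single row.
plots <- lapply(unique(long$variable), function(v) plot_one(long, v))
grid.arrange(grobs = plots, ncol = 3)
```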