I am trying to plot a data frame with two variables in descending order. Both variables are factors. I want the ordering to take the frequency of both variables into account, just like a pivot table in Excel.
I tried to use the tidyverse to group, count, and sort the variables in descending order.
library(tidyverse)

# Create a data frame that simulates the data that needs to be modeled
df1 <- as.data.frame(replicate(2,
                               sample(c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J"),
                                      50,
                                      replace = TRUE)))

# Replace V2 column with System Nomenclature (simulated)
df1$V2 <- sample(1:4, size = nrow(df1), replace = TRUE)

# Make V2 into a factor
df1$V2 <- as.factor(df1$V2)

# Create frequency table
df2 <- df1 %>%
  group_by(V1, V2) %>%
  summarise(counts = n()) %>%
  ungroup() %>%
  arrange(desc(counts))

# Plot the two-variable data
ggplot(df2,
       aes(reorder(x = V1, -counts),
           y = counts,
           fill = V2)) +
  geom_bar(stat = "identity")
I expect the graph to plot the data in descending order by the frequency of V1, with V2 as the fill, just like the pivot table feature in Excel. I also want to display only the top 5 values of V1 by frequency, filled by V2.
You can use fct_reorder and fct_rev from forcats (loaded with the tidyverse) to achieve what you want.
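Here is a minimal sketch of how that could look, not necessarily the only way to do it. It assumes "frequency of V1" means the total count per V1 level summed over all V2 values. The `total` column, the `top5` and `df_top` names, and the `set.seed(42)` call are my own additions for illustration; the data is simulated the same way as in the question.

```r
library(tidyverse)

set.seed(42)  # assumption: seed added only so the sketch is reproducible

# Simulated data with the same shape as in the question
df1 <- tibble(
  V1 = sample(LETTERS[1:10], 50, replace = TRUE),
  V2 = factor(sample(1:4, 50, replace = TRUE))
)

# Count each V1/V2 pair, then attach the total per V1 level
# ("total" is a helper column introduced for this sketch)
df2 <- df1 %>%
  count(V1, V2, name = "counts") %>%
  add_count(V1, wt = counts, name = "total")

# Pick the five V1 levels with the highest totals (ties broken arbitrarily)
top5 <- df2 %>%
  distinct(V1, total) %>%
  slice_max(total, n = 5, with_ties = FALSE)

# Keep only those levels; fct_reorder sorts ascending by total,
# fct_rev flips that so the most frequent level comes first
df_top <- df2 %>%
  semi_join(top5, by = "V1") %>%
  mutate(V1 = fct_rev(fct_reorder(V1, total)))

# Stacked bars: bar height is the V1 total, segments are the V2 counts
p <- ggplot(df_top, aes(x = V1, y = counts, fill = V2)) +
  geom_col() +
  labs(x = "V1", y = "Frequency", fill = "V2")

p
```

The reason the original plot looks off is that `reorder(V1, -counts)` orders by the mean of the per-pair counts within each V1 level, not by the overall V1 total. Ordering on a precomputed total avoids that. If you would rather skip the helper column, `fct_reorder(V1, counts, .fun = sum, .desc = TRUE)` should give the same ordering in one call.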