I want to create a series of boxplots using ggplot2. A toy data frame, OR.df, contains 14 variables that I want to draw boxplots for over the factor prop.bdlf, grouped by method_type. I want to save the plots in a list so that I can modify them later or print them out.
set.seed(24542)
OR.df <- data.frame(
  matrix(rnorm(1400, 0, 1), ncol = 14, dimnames = list(NULL, paste0("Estimate.", 1:14))),
  method_type = paste0("Method", 1:5), prop.bdlf = as.factor(c(0, 3, 5, 10))
)
library(ggplot2)
# Start plotting ...
my.plot <- vector(mode = "list", length = 14)
for (j in 1:14) {
  title <- gsub("Estimate.", "", colnames(OR.df)[j])
  cat("> Plotting...", paste0("w = ", colnames(OR.df)[j]),
      "with title", title, "\n")
  p <- ggplot(OR.df, aes(y = OR.df[, j], x = method_type, fill = prop.bdlf)) +
    ggtitle(title) +
    geom_boxplot() +
    labs(x = "Method Type", y = "weight") +
    theme(legend.position = "right", legend.text = element_text(size = 11)) +
    guides(fill = guide_legend(title = "BDL Prop"))
  my.plot[[j]] <- p
} # end weight loop
multiplot(plotlist = my.plot[10:11])
The plots are the same! Why? The data is clearly different.
> summary(OR.df$Estimate.10)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.83717 -0.66358 -0.12748 -0.09981  0.36622  2.16782
> summary(OR.df$Estimate.11)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-2.86220 -0.67540  0.07784  0.08359  0.73410  2.82225
So why is the same data being plotted in the loop? If I step inside the loop and set j = 11 by hand, the plot "p" matches a plot of Estimate.11 made separately.
Thank you.
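The reason is that, in the ggplot call, the y aesthetic is given the column values (OR.df[, j]) instead of the column name, while x and fill are correctly passed by name. aes() stores that expression unevaluated and only evaluates it when the plot is drawn, by which time the loop has finished and j is 14, so every stored plot ends up showing the same column. To fix this, OR.df[, 1], OR.df[, 2], ..., OR.df[, 14] should be replaced with the column name, converted to a symbol and evaluated with !!. The result can then be tested with multiplot, as in the question.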
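The fix described above could look like the following minimal sketch. It rebuilds the toy data from the question and uses rlang::sym() with !! to pass the column name to aes(). The names est.cols and col are introduced here for illustration only, and the title is simplified to the column suffix; this is one way to write the loop, not necessarily the exact code from the original answer.

```r
library(ggplot2)

# Toy data, as in the question
set.seed(24542)
OR.df <- data.frame(
  matrix(rnorm(1400, 0, 1), ncol = 14,
         dimnames = list(NULL, paste0("Estimate.", 1:14))),
  method_type = paste0("Method", 1:5),
  prop.bdlf = as.factor(c(0, 3, 5, 10))
)

# Hypothetical helper vector: the names of the columns to plot
est.cols <- paste0("Estimate.", 1:14)
my.plot <- vector(mode = "list", length = length(est.cols))

for (j in seq_along(est.cols)) {
  col <- est.cols[j]
  title <- gsub("Estimate.", "", col, fixed = TRUE)
  # rlang::sym() turns the column name into a symbol and !! inserts it into
  # aes() right away, so the stored plot no longer depends on j or col.
  my.plot[[j]] <- ggplot(OR.df,
                         aes(x = method_type, y = !!rlang::sym(col), fill = prop.bdlf)) +
    ggtitle(title) +
    geom_boxplot() +
    labs(x = "Method Type", y = "weight") +
    theme(legend.position = "right", legend.text = element_text(size = 11)) +
    guides(fill = guide_legend(title = "BDL Prop"))
}
```

With this version, print(my.plot[[10]]) and print(my.plot[[11]]) should draw different boxplots, and the list can be passed to the multiplot helper from the question in the same way as before.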