library(ggplot2)
library(cowplot)

d <- iris

ggplot(d, aes(factor(0), Sepal.Length)) +
  # violin with the 0.25/0.5/0.75 quantile lines drawn in red
  geom_violin(fill = "black", alpha = 0.2, draw_quantiles = c(0.25, 0.5, 0.75),
              colour = "red", size = 1.5) +
  # whisker caps, then the boxplot itself on top of the violin
  stat_boxplot(geom = "errorbar", width = 0.1) +
  geom_boxplot(width = 0.2) +
  facet_grid(. ~ Species, scales = "free_x") +
  xlab("") +
  ylab("Value") +
  coord_cartesian(ylim = c(3.5, 9.5)) +
  scale_y_continuous(breaks = seq(4, 9, 1)) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_text(size = rel(1.5)),
        axis.ticks.x = element_blank(),
        strip.background = element_rect(fill = "black"),
        strip.text = element_text(color = "white", face = "bold"),
        legend.position = "none") +
  background_grid(major = "xy", minor = "none")

To my knowledge, the box ends in a boxplot represent the 25% and 75% quantiles, and the line inside the box is the median (the 50% quantile). So they should coincide with the 0.25/0.5/0.75 quantiles that geom_violin draws via the draw_quantiles = c(0.25, 0.5, 0.75) argument.
The median and the 50% quantile match. However, neither the 0.25 nor the 0.75 quantile lines up with the box ends of the boxplot (see figure, especially the 'virginica' facet).
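
As a sanity check on where the box ends would be expected, the empirical quartiles can be computed directly. This is only a minimal sketch: it assumes R's default quantile definition (type 7), and the variable name quartiles is purely illustrative.

```r
# Empirical quartiles of Sepal.Length for each species,
# using quantile() with its default definition (type 7).
quartiles <- sapply(
  split(iris$Sepal.Length, iris$Species),
  quantile,
  probs = c(0.25, 0.5, 0.75)
)

# Rows are the 25%, 50% and 75% quantiles, columns are the species.
quartiles
```

These values can then be compared by eye with the red quantile lines and the box ends in each facet.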
This is too long for a comment, so I am posting it as an answer. I see two potential sources for the divergence.

First, my understanding is that the boxplot relies on boxplot.stats, which uses hinges that are very similar, but not necessarily identical, to the quantiles. The help page ?boxplot.stats describes the hinges as versions of the first and third quartile. The hinge vs quantile distinction could thus be one source for the difference.

Second, geom_violin relies on a density estimate. Its source code points to StatYdensity, which in turn refers to a function compute_density. I could not find that function, but I think (also due to some pointers in the help files) it is essentially density, which by default uses a Gaussian kernel to estimate the density. This may (or may not) explain the differences, but computing the quantiles both ways does indeed give differing values. So I would guess that the difference comes down to whether we look at quantiles based on the empirical distribution function of the observations or on kernel density estimates, though I admit that I have not conclusively shown this.
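
Both guesses can be put side by side numerically. The sketch below is not taken from the ggplot2 source: kde_quantiles is a hypothetical helper that reads quantiles off a Gaussian kernel density estimate restricted to the observed data range, which is my assumption about roughly what the violin's quantile lines do. The hinges come from fivenum, the same statistic that boxplot.stats reports.

```r
# Compare three ways of locating the quartiles of Sepal.Length per species.
probs <- c(0.25, 0.5, 0.75)

# Hypothetical helper: quantiles read off a kernel density estimate.
# Assumption: the estimate is evaluated only between min(x) and max(x),
# and its normalised cumulative sum is inverted by linear interpolation.
kde_quantiles <- function(x, probs, n = 512) {
  dens <- density(x, n = n, from = min(x), to = max(x))  # Gaussian kernel by default
  cdf <- cumsum(dens$y) / sum(dens$y)
  approx(cdf, dens$x, xout = probs, ties = "ordered")$y
}

comparison <- lapply(split(iris$Sepal.Length, iris$Species), function(x) {
  out <- rbind(
    empirical = quantile(x, probs, names = FALSE),  # default type 7 quantiles
    hinges    = fivenum(x)[2:4],                    # lower hinge, median, upper hinge
    kde       = kde_quantiles(x, probs)             # density-based quantiles
  )
  colnames(out) <- paste0("q", probs * 100)
  out
})

comparison
```

If the kde row differs visibly from the other two rows while empirical and hinges stay close to each other, that would point to the density estimate rather than the hinge definition as the main source of the mismatch.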