I'd like to create a split violin density plot using ggplot, like the fourth example on this page of the seaborn documentation.
Here is some data:
set.seed(20160229)
my_data = data.frame(
  y=c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x=c(rep('a', 2000), rep('b', 2000)),
  m=c(rep('i', 1000), rep('j', 2000), rep('i', 1000))
)
I can plot dodged violins like this:
library('ggplot2')
ggplot(my_data, aes(x, y, fill=m)) +
  geom_violin()
But it's hard to visually compare the widths at different points in the side-by-side distributions. I haven't been able to find any examples of split violins in ggplot - is it possible?
I found a base R graphics solution, but the function is quite long. I also want to highlight distribution modes, which are easy to add as additional layers in ggplot but would be harder to add if I had to figure out how to edit that function.
Or, to avoid fiddling with the densities, you could extend ggplot2's GeomViolin and then use the new geom_split_violin layer.

You can achieve this by calculating the densities yourself beforehand, and then plotting polygons. See below for a rough idea. It should not be too difficult to write this into a function.
Get densities
Flip and offset densities for the groups
Plot
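The three steps named above (get densities, flip and offset them, plot) could be sketched roughly as follows. This is a base R reconstruction of the idea, not the answerer's original code, which is missing from this page. The names pdat, loc and dens are made up for illustration, and the offset of 1 between the two violins assumes that no density value exceeds 0.5.

```r
library(ggplot2)

# Same data as in the question.
set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000))
)

# Get densities: one kernel density estimate per (x, m) group.
groups <- split(my_data, list(my_data$x, my_data$m), drop = TRUE)
pdat <- do.call(rbind, lapply(groups, function(g) {
  d <- density(g$y)
  data.frame(x = g$x[1], m = g$m[1], loc = d$x, dens = d$y)
}))

# Flip and offset densities for the groups:
# 'i' is mirrored to the left, and everything in 'b' is shifted to centre on 1.
pdat$dens <- ifelse(pdat$m == 'i', -pdat$dens, pdat$dens)
pdat$dens <- ifelse(pdat$x == 'b', pdat$dens + 1, pdat$dens)

# Plot: each (x, m) group becomes one polygon; the x axis is relabelled
# so that positions 0 and 1 read as 'a' and 'b'.
p <- ggplot(pdat, aes(dens, loc, fill = m, group = interaction(m, x))) +
  geom_polygon() +
  scale_x_continuous(breaks = 0:1, labels = c('a', 'b')) +
  ylab('y') +
  theme_minimal() +
  theme(axis.title.x = element_blank())
print(p)
```

Because the result is an ordinary ggplot object, extra layers (for example points marking the modes) can be added to p in the usual way, as long as they use the same shifted coordinates.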
Result
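For the other suggestion above, extending GeomViolin, a simplified sketch might look like the following. It is not the answerer's original code (which is missing from this page): it draws only the half-violin polygons and leaves out quantile lines. It also leans on ggplot2 internals that are assumptions here and may differ between versions: the violinwidth, xmin and xmax columns passed to draw_group, and group numbers that alternate between the two fill levels within each x value.

```r
library(ggplot2)

# A geom that draws only one half of each violin.
# Assumption: exactly two fill levels per x, so odd-numbered groups
# go on the left and even-numbered groups on the right.
GeomSplitViolin <- ggproto(
  "GeomSplitViolin", GeomViolin,
  draw_group = function(self, data, ...) {
    # Outline positions of the left and right edges of the full violin.
    data <- transform(data,
                      xminv = x - violinwidth * (x - xmin),
                      xmaxv = x + violinwidth * (xmax - x))
    left <- data$group[1] %% 2 == 1
    newdata <- data
    newdata$x <- if (left) data$xminv else data$xmaxv
    # Order the outline so the polygon is traced in one direction.
    newdata <- newdata[order(if (left) newdata$y else -newdata$y), ]
    # Close the polygon along the centre line of the violin.
    newdata <- rbind(newdata[1, ], newdata,
                     newdata[nrow(newdata), ], newdata[1, ])
    newdata[c(1, nrow(newdata) - 1, nrow(newdata)), "x"] <-
      round(newdata[1, "x"])
    GeomPolygon$draw_panel(newdata, ...)
  }
)

# Layer constructor; position = "identity" keeps both halves on the same x.
geom_split_violin <- function(mapping = NULL, data = NULL,
                              stat = "ydensity", position = "identity",
                              ..., trim = TRUE, scale = "area",
                              na.rm = FALSE, show.legend = NA,
                              inherit.aes = TRUE) {
  layer(data = data, mapping = mapping, stat = stat,
        geom = GeomSplitViolin, position = position,
        show.legend = show.legend, inherit.aes = inherit.aes,
        params = list(trim = trim, scale = scale, na.rm = na.rm, ...))
}

# Usage with the data from the question.
set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000))
)
p <- ggplot(my_data, aes(x, y, fill = m)) + geom_split_violin()
print(p)
```

Since this reuses the default ydensity stat, no densities have to be computed by hand, and other layers can be added on top as usual.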