Split beeswarm 2

Posted 2019-07-30 08:39

Question:

This is a follow-up to my previous question: Split beeswarm plot

I am asking this as a new question because my first question was sufficiently answered. With my real data, however, I suddenly got weird behaviour that I don't understand.

With the previous data frame

my_dat <- data.frame(x = 'x', m = rep(c('a','b'),100), y = rnorm(200))

the suggested solution works nicely. But when I scale up my data a bit (simply increasing the sample size!), the plot becomes weird:

my_dat <- data.frame(x = 'x', m = letters[1:2], y = sample(0:100, 2000, replace = TRUE), stringsAsFactors = FALSE)
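
A side note on the two data sets: besides the sample size, the second one also draws y from the integers 0 to 100, so many points share exactly the same y value, while rnorm() yields essentially unique values. The minimal sketch below only counts those ties. The seed and the variable names are arbitrary choices made for illustration, and whether the ties contribute to the odd layout is an assumption that is not established in this thread.

```r
set.seed(42)  # arbitrary seed, only so this sketch is reproducible

y_small <- rnorm(200)                           # y as in the first data frame
y_large <- sample(0:100, 2000, replace = TRUE)  # y as in the second data frame

# Number of distinct y values in each sample
n_distinct_small <- length(unique(y_small))
n_distinct_large <- length(unique(y_large))  # cannot exceed 101

# Largest number of points that share one exact y value
max_ties_large <- max(table(y_large))

print(c(distinct_small = n_distinct_small,
        distinct_large = n_distinct_large,
        max_ties_large = max_ties_large))
```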

library(ggplot2)
library(ggbeeswarm)
library(dplyr)

p <- ggplot(my_dat, aes(x,y,color=m))+  ## this is copy/paste from @Jimbou's great idea. 
  geom_quasirandom(method = 'smiley')

p <- ggplot_build(p)

p$data[[1]] <-   p$data[[1]] %>%
  mutate(x=case_when(
    colour=="#00BFC4" ~ PANEL + abs(PANEL - x),
    TRUE ~ PANEL - abs(PANEL - x))
  )
plot(ggplot_gtable(p))

Why does this happen with the increased sample size?

Answer 1:

OK, I hope this is worth a self-answer.

It has to do with the method used to distribute the points. Changing the method from 'smiley' to 'pseudorandom' gets rid of the weird behaviour:

p <- ggplot(my_dat, aes(x,y,color=m))+
  geom_quasirandom(method = 'pseudorandom')  # instead of 'smiley'

p <- ggplot_build(p)

p$data[[1]] <-   p$data[[1]] %>%
  mutate(x=case_when(
    colour=="#00BFC4" ~ PANEL + abs(PANEL - x),
    TRUE ~ PANEL - abs(PANEL - x))
  )
plot(ggplot_gtable(p))
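
For readers wondering what the mutate() step does: it folds every point onto one side of the panel centre, so that one group ends up on the right and the other on the left. The arithmetic can be sketched in isolation with plain vectors. The helper name mirror_x and the toy values are hypothetical and used only for illustration. In the code above, the centre is PANEL and the side is chosen by the colour "#00BFC4".

```r
# Hypothetical helper: fold each x onto one side of `centre`.
# Points flagged in `to_right` end up at or right of the centre,
# all others at or left of it; the distance to the centre is kept.
mirror_x <- function(x, centre, to_right) {
  offset <- abs(x - centre)
  ifelse(to_right, centre + offset, centre - offset)
}

# Toy jittered positions around a centre of 1, two groups
x   <- c(0.8, 0.95, 1.1, 1.3)
grp <- c("a", "b", "a", "b")

mirrored <- mirror_x(x, centre = 1, to_right = grp == "b")
print(mirrored)
```

Choosing the side from a grouping variable rather than from a hard-coded colour code is a choice made for this sketch. It keeps the step independent of the colour palette, but it is not what the original code does.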