First, the way to specify the order of a categorical variable in ggplot is to reorder its factor levels in the data.frame. Second, to highlight an area on any plot we can use geom_rect. The key point here is not to pass the data to geom_rect; otherwise the alpha setting does not take effect, and the gridlines won't be visible. Now there are two cases:
- if we pass the data to geom_rect (or to the top-level ggplot()), the order agrees with that in the data.frame, but, as I mentioned, the rectangle won't be transparent
- if we pass the data only to the geom_point layer, ggplot rearranges the discrete variables in alphabetical order
How can I satisfy both criteria, i.e. keep the predefined order and have a transparent rectangle in the desired position?
Bonus question: how can I draw a rectangle on discrete variables with its edges between the gridlines, i.e. shifted by 0.5? vjust and hjust are not used arguments here (as a warning tells us). And how can I make the rectangle fill the whole vertical space? For this we would need to define ymax as the (n+1)th factor level, which does not exist.
require(ggplot2)
ex <- data.frame(a = factor(letters[1:10]),
                 b = factor(rep(c('b', 'a'), 5)),
                 c = rep(letters[1:5], 2))
ex$a <- factor(ex$a, levels = ex$a[order(ex$b)])
ggplot(
    # uncomment this to see the other failure:
    # ex, aes(y = a, x = c)
) +
    geom_rect(
        aes(
            xmin = 'b',
            xmax = 'd',
            ymin = 'd',
            ymax = 'j'
        ),
        alpha = 0.2
    ) +
    geom_point(
        data = ex,
        aes(
            y = a,
            x = c
        )
    )
The best I can do is to avoid the data issue by using annotate instead (a data-free layer). You somehow need to put geom_point first, though, and I'm not sure why. It seems like the scale gets determined by the first layer, even though I have already supplied the data and mapping in ggplot.
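A minimal sketch of that idea, reusing the example data from the question. One thing here goes beyond what is described above and should be read as an assumption: instead of passing level names to the rectangle, it passes numeric positions, relying on ggplot2 drawing discrete levels at the integer positions 1..n in level order. The helper names x_pos and y_pos are made up for this illustration. With numeric edges the rectangle can be shifted by 0.5 so that it sits between the gridlines.

```r
library(ggplot2)

# Example data from the question; the levels of `a` are reordered by `b`.
ex <- data.frame(
  a = factor(letters[1:10]),
  b = factor(rep(c("b", "a"), 5)),
  c = rep(letters[1:5], 2)
)
ex$a <- factor(ex$a, levels = as.character(ex$a)[order(ex$b)])

# Assumption: a discrete level is drawn at its index among the levels.
# `c` is a character column, so its levels are its sorted unique values.
x_pos <- match(c("b", "d"), sort(unique(ex$c)))
y_pos <- match(c("d", "j"), levels(ex$a))

p <- ggplot(ex, aes(x = c, y = a)) +
  # Points go first, so the discrete scales are set up from the data.
  geom_point() +
  # Data-free layer: one rectangle, so alpha behaves as expected.
  annotate(
    "rect",
    xmin = x_pos[1] - 0.5, xmax = x_pos[2] + 0.5,
    ymin = y_pos[1] - 0.5, ymax = y_pos[2] + 0.5,
    alpha = 0.2
  )

p
```

For the full-height variant, replacing the two y edges with ymin = -Inf and ymax = Inf should stretch the rectangle across the whole panel, which avoids the need for a non-existent extra factor level.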